Re: the addfile command is very slow

From: Bjoern Maronga <maronga_at_nyahnyahspammersnyahnyah>
Date: Thu Oct 28 2010 - 07:20:33 MDT

Hi David,

I don't think it's worthwhile spending too much time on this. I
won't need such extensive time-series data very often, and I was mostly
wondering why this problem occurs. At least we found the reason.

Thanks.

Regards,
Björn

> It is certainly possible to rewrite the file eliminating the unlimited
> dimension. However, this does require the file to be completely rewritten,
> which is of course quite slow for a file of that size. I could try it and
> see how long it would take using NCL for your sample file. There may be
> some other ways to speed things up that I can try. I will do some
> experimentation when I get a chance. -dave
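> 
> [A rough NCL sketch of one way to do such a rewrite, assuming the
> unlimited dimension is named "time"; the file names are placeholders,
> and the full copy will itself be slow for a file this large:
> 
> ```
> ; Copy in.nc to out.nc, declaring "time" as a fixed (non-unlimited) dimension.
> fin = addfile("in.nc", "r")
> system("rm -f out.nc")
> fout = addfile("out.nc", "c")
> ntime = dimsizes(fin->time)
> filedimdef(fout, "time", ntime, False)     ; False => fixed, not unlimited
> varnames = getfilevarnames(fin)
> do i = 0, dimsizes(varnames)-1
>   fout->$varnames(i)$ = fin->$varnames(i)$ ; copies values and metadata
> end do
> ```
> ]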
>
> On Oct 22, 2010, at 3:26 AM, Bjoern Maronga wrote:
> > Hi David,
> >
> > thanks for the support!
> >
> > I understand the problem now. Indeed, I have an unlimited
> > dimension "time" with 3600 timesteps, which comes along with 3D
> > spatial data. When I tested "ncdump -c" on the file, it took about
> > one minute. Although this is a NetCDF issue rather than an NCL core
> > issue, I find this "spreading" a little annoying. One thought: might
> > it be possible to change the unlimited dimension to a limited one?
> > This is not possible during the simulation that produces the data,
> > but maybe afterwards; for post-processing, the unlimited attribute is
> > not necessary. I'm not that deep into NetCDF programming, though, so
> > I'm neither sure whether this is possible nor whether it would
> > eliminate the spreading.
> >
> > Regards,
> > Björn
> >
> >> Hi Bjoern,
> >>
> >> Actually, the information that NCL gathers when it opens a NetCDF file
> >> is more like "ncdump -c" than "ncdump -h": NCL gathers and caches
> >> coordinate variable values along with attribute names and values,
> >> variable names and types, and dimension names and sizes. Normally the
> >> time required to retrieve coordinate values is minuscule. However, if
> >> the file has an unlimited dimension, the coordinate values for the
> >> unlimited dimension are spread across all records in the file. If the
> >> file is large, it can take some time to extract the single required
> >> coordinate value from each record. Of course, once read, this data
> >> remains available for the duration of the execution and helps subsequent
> >> operations to complete faster.
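> >>
> >> [To illustrate the difference (hypothetical file name; timings depend
> >> on file size and file system):
> >>
> >> ```
> >> time ncdump -h big.nc > /dev/null   # header only: fast
> >> time ncdump -c big.nc > /dev/null   # also reads coordinate values, which
> >>                                     # for an unlimited dimension are spread
> >>                                     # across every record: can be slow
> >> ```
> >> ]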
> >>
> >> So I am guessing that your files have an unlimited dimension. Is this
> >> correct? We are always interested in improving NCL's performance and, if
> >> possible, we would like to have a sample of the type of file you are
> >> experiencing difficulty with. There are, of course, other strategies we
> >> could employ, such as waiting until the first request for coordinate
> >> data along the record dimension to begin to cache the values. If you
> >> would like to send a sample, please upload it to: ftp.cgd.ucar.edu.
> >> login: anonymous
> >> password: your email address
> >>
> >> cd incoming
> >> put <filename>
> >>
> >> You will need to send us (offline) the name of the file, since 'ls' is
> >> not allowed on this ftp server. -dave
> >>
> >> On Oct 19, 2010, at 9:27 AM, Dennis Shea wrote:
> >>> Again, I'm sure a core developer will reply later today.
> >>>
> >>> re:
> >>> A "ncdump -h" command from
> >>> the same machine does not take more than a split second.
> >>>
> >>> ===
> >>> A netCDF file is written with all the dimension names, sizes,
> >>> variable names, meta data, etc at the very 'top of the file'.
> >>> The "ncdump -h" is just reading that information which is
> >>> trivial. That is what you see when you do the ncdump -h.
> >>> 'ncdump -h' does not read any values associated
> >>> with the variables.
> >>>
> >>> On 10/19/10 9:17 AM, Bjoern Maronga wrote:
> >>>> Thanks for the reply. I'm working on a supercomputer node; the data is
> >>>> located on its data server. So I am indeed on a multi-user system, but
> >>>> the problem does not arise when memory is being requested! It comes
> >>>> directly from "addfile", which should not request any memory, right?
> >>>>
> >>>> I added the wallClockElapseTime commands to my script, but I got the
> >>>> weird message:
> >>>>
> >>>> (0) wallClockElapseTime: something wrong: no printed value
> >>>>
> >>>> This also happens for files that addfile opens within a
> >>>> second, so it seems unrelated to the problem.
> >>>>
> >>>> However, I added two systemfunc("date") commands before and after
> >>>> "addfile" as well as when using the pointers to load data into arrays.
> >>>> Some code snippets follow:
> >>>>
> >>>> wcStrt = systemfunc("date")
> >>>> print(wcStrt)
> >>>> cdf_file = addfile(full_filename,"r")
> >>>> wcStrt2 = systemfunc("date")
> >>>> print(wcStrt2)
> >>>> wallClockElapseTime(wcStrt, "addfile", 0)
> >>>>
> >>>> [...]
> >>>>
> >>>> wcStrt = systemfunc("date")
> >>>> print(wcStrt)
> >>>> field_ts = cdf_file->$struc_pars$(ts:te,0,ys:ye,xs:xe)
> >>>> printVarSummary(field_ts)
> >>>> wcStrt2 = systemfunc("date")
> >>>> print(wcStrt2)
> >>>> wallClockElapseTime(wcStrt, "load1", 0)
> >>>>
> >>>> delete(field_ts)
> >>>>
> >>>> wcStrt = systemfunc("date")
> >>>> print(wcStrt)
> >>>> field_ts = cdf_file->$struc_pars$(ts:te,1,ys:ye,xs:xe)
> >>>> printVarSummary(field_ts)
> >>>> wcStrt2 = systemfunc("date")
> >>>> print(wcStrt2)
> >>>> wallClockElapseTime(wcStrt, "load2", 0)
> >>>>
> >>>>
> >>>> The resulting messages are:
> >>>>
> >>>> (0) Tue Oct 19 16:55:01 CEST 2010
> >>>> (0) Tue Oct 19 16:56:36 CEST 2010
> >>>> (0) wallClockElapseTime: something wrong: no printed value
> >>>> (0) Tue Oct 19 16:56:36 CEST 2010
> >>>> Variable: field_ts
> >>>> Type: float
> >>>> Total Size: 47239200 bytes
> >>>> 11809800 values
> >>>> Number of Dimensions: 3
> >>>> Dimensions and sizes: [time | 1800] x [y | 81] x [x | 81]
> >>>> Coordinates:
> >>>> time: [1801..3600]
> >>>> y: [ 2.5..1202.5]
> >>>> x: [ 2.5..1202.5]
> >>>> Number Of Attributes: 3
> >>>> zu_3d : -2.5
> >>>> long_name : pt
> >>>> units : K
> >>>> (0) Tue Oct 19 16:57:53 CEST 2010
> >>>> (0) wallClockElapseTime: something wrong: no printed value
> >>>> (0) Tue Oct 19 16:58:45 CEST 2010
> >>>> Variable: field_ts
> >>>> Type: float
> >>>> Total Size: 47239200 bytes
> >>>> 11809800 values
> >>>> Number of Dimensions: 3
> >>>> Dimensions and sizes: [time | 1800] x [y | 81] x [x | 81]
> >>>> Coordinates:
> >>>> time: [1801..3600]
> >>>> y: [ 2.5..1202.5]
> >>>> x: [ 2.5..1202.5]
> >>>> Number Of Attributes: 3
> >>>> units : K
> >>>> long_name : pt
> >>>> zu_3d : 7.5
> >>>> (0) Tue Oct 19 16:58:50 CEST 2010
> >>>> (0) wallClockElapseTime: something wrong: no printed value
> >>>>
> >>>>
> >>>> In this case, "addfile" was comparatively fast at ~1.5 min. Nevertheless,
> >>>> this is not an adequate amount of time for reading metadata; an
> >>>> "ncdump -h" command on the same machine takes no more than a
> >>>> split second. As you can also see, loading the data itself is faster
> >>>> (1.17 min and 5 s).
> >>>>
> >>>> I talked to the supercomputer support, but they said there are
> >>>> no file system problems at the moment. That is why I thought it
> >>>> might be an NCL problem. But if the "addfile" command only
> >>>> reads metadata, I'm at a loss.
> >>>>
> >>>> Regards,
> >>>> Björn
> >>>>
> >>>>> I am sure one of the NCL core developers will respond
> >>>>> later today.
> >>>>>
> >>>>> --
> >>>>> I ran a quick and dirty test on a ** 17.9 GB ** netCDF file.
> >>>>> Test script attached. I ran this several times on a multi-user
> >>>>> system.
> >>>>>
> >>>>> %> uname -a
> >>>>> Linux tramhill.cgd.ucar.edu 2.6.18-194.11.4.el5 #1 SMP Tue Sep 21
> >>>>> 05:04:09 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
> >>>>>
> >>>>>
> >>>>>
> >>>>> The f = addfile("..", "r") was essentially 'instantaneous'.
> >>>>>
> >>>>> NOTE: *no* data is read by "addfile". This creates a
> >>>>> reference (pointer) to the file.
> >>>>>
> >>>>> The taux=f->TAUX ; (1,2400,3600) was 'instantaneous'
> >>>>>
> >>>>> The temp=f->TEMP ; (1,42,2400,3600) took 2, 2, 3, 3 seconds
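> >>>>>
> >>>>> [In sketch form (hypothetical file and variable names; per the
> >>>>> NOTE above, the cost appears at the read, not at the open):
> >>>>>
> >>>>> ```
> >>>>> f    = addfile("big.nc", "r")  ; creates a file reference; no variable data read
> >>>>> taux = f->TAUX                 ; TAUX data is actually read here
> >>>>> temp = f->TEMP(0,:,:,:)        ; subscripting reads only the requested hyperslab
> >>>>> ```
> >>>>> ]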
> >>>>>
> >>>>> Several things could affect your data input:
> >>>>> (1)
> >>>>> Are you on a multi-user system? When NCL is allocating memory
> >>>>> for the input array, are other users also 'competing' for memory?
> >>>>>
> >>>>> (2)
> >>>>> Is the data file on a local file system or on (say) an
> >>>>> nfs-mounted file system? The latter could affect the input
> >>>>> data stream significantly. Some time ago I ran tests on an
> >>>>> nfs-mounted file system: around midnight there was no timing
> >>>>> difference between importing data from a locally mounted file
> >>>>> and from the nfs-mounted file, but in the middle of the day
> >>>>> the timings were very different.
> >>>>>
> >>>>> ===
> >>>>> Ultimately, (almost) all tools (NCL, IDL, Matlab, NCO, CDO,...)
> >>>>> that read netCDF are using the standard Unidata software.
> >>>>>
> >>>>> Cheers!
> >>>>>
> >>>>> On 10/19/10 4:06 AM, Bjoern Maronga wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> I have a problem using the NCL command "addfile" to open NetCDF
> >>>>>> files that are about 4 GB in size (or larger). It takes about
> >>>>>> 5 minutes to execute this command (I verified that performance is
> >>>>>> better for smaller datasets). In contrast, the subsequent commands
> >>>>>> for loading data into arrays are fast.
> >>>>>>
> >>>>>> To me it looks as if "addfile" actually loads the entire dataset
> >>>>>> into memory, even though only a small part will be loaded into
> >>>>>> arrays afterwards. From my point of view, addfile only needs to
> >>>>>> read the metadata, rather like an "ncdump" command, which takes no
> >>>>>> more time for these files than for smaller ones. I am very surprised
> >>>>>> by this finding and wonder whether it makes sense at all. I have been
> >>>>>> working with NCL for maybe two years now and never noticed this
> >>>>>> behavior before. Has anything changed with the addfile command?
> >>>>>> Currently, I am using version 5.2.0.
> >>>>>>
> >>>>>> Best regards,
> >>>>>> Björn Maronga

-- 
-------------------------------------------
Dipl.-Met. Bjoern Maronga
Institute for Meteorology and Climatology
Leibniz University of Hannover
Herrenhaeuser Str. 2
Room F221
30419 Hannover        Tel.  +49 511 7622680
Germany
Email: maronga@muk.uni-hannover.de
Web:   http://palm.muk.uni-hannover.de
-------------------------------------------
_______________________________________________
ncl-talk mailing list
List instructions, subscriber options, unsubscribe:
http://mailman.ucar.edu/mailman/listinfo/ncl-talk
Received on Thu Oct 28 07:20:56 2010

This archive was generated by hypermail 2.1.8 : Fri Oct 29 2010 - 13:09:32 MDT