Re: the addfile command is very slow

From: David Brown <dbrown_at_nyahnyahspammersnyahnyah>
Date: Fri Oct 22 2010 - 10:58:19 MDT

It is certainly possible to rewrite the file without the unlimited dimension. However, that requires rewriting the file completely, which is of course quite slow for a file of that size. I could try it on your sample file and see how long it takes in NCL. There may also be other ways to speed things up; I will do some experimentation when I get a chance.
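
In outline, the copy could look something like this (a minimal NCL
sketch; the file, dimension, and variable names are illustrative and
would have to match your file, which also has a vertical dimension):

  fin  = addfile("in.nc", "r")              ; original file, "time" unlimited
  system("rm -f out.nc")                    ; addfile "c" mode needs a fresh file
  fout = addfile("out.nc", "c")
  ntim = dimsizes(fin->time)
  ; predefine every dimension as fixed: the final array of False values
  ; means none of them is unlimited
  filedimdef(fout, (/"time","y","x"/), (/ntim,81,81/), (/False,False,False/))
  fout->time = fin->time                    ; copy coordinate variables
  fout->y    = fin->y
  fout->x    = fin->x
  fout->pt   = fin->pt                      ; copy each data variable

Copying the data variables is what forces the complete rewrite, and
that is where the time will go.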
 -dave

On Oct 22, 2010, at 3:26 AM, Bjoern Maronga wrote:

> Hi David,
>
> thanks for the support!
>
> I understand the problem now. In fact, I have an unlimited
> dimension "time", which holds 3600 timesteps of 3D spatial data. The
> "ncdump -c" command took about one minute when I tested it on the file.
> Although this is not an NCL core issue but a NetCDF one, I find this
> "spreading" a little annoying. One thought: might it be possible to
> change the unlimited dimension into a limited one? This is not possible
> during the simulation that produces the data, but maybe afterwards. For
> post-processing, the dimension does not need to be unlimited. But I'm
> not that deep into NetCDF programming, so I'm neither sure whether this
> is possible nor whether it would eliminate the spreading.
>
> Regards,
> Björn
>
>> Hi Bjoern,
>>
>> Actually, the information that NCL gathers when it opens a NetCDF file is
>> more like "ncdump -c" than "ncdump -h": NCL gathers and caches coordinate
>> variable values along with attribute names and values, variable names and
>> types, and dimension names and sizes. Normally the time required to
>> retrieve coordinate values is minuscule. However, if the file has an
>> unlimited dimension, the coordinate values for the unlimited dimension are
>> spread across all records in the file. If the file is large, it can take
>> some time to extract the single required coordinate value from each record.
>> Of course, once read, this data remains available for the duration of the
>> execution and helps subsequent operations to complete faster.
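>>
>> To check whether a given file is affected, NCL can report which
>> dimensions are unlimited; a small sketch (file name hypothetical):
>>
>> f = addfile("sample.nc", "r")
>> dnames = getvardims(f)   ; dimension names defined in the file
>> do i = 0, dimsizes(dnames) - 1
>>   print(dnames(i) + "  unlimited: " + isunlimited(f, dnames(i)))
>> end do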
>>
>> So I am guessing that your files have an unlimited dimension. Is this
>> correct? We are always interested in improving NCL's performance and, if
>> possible, we would like to have a sample of the type of file you are
>> experiencing difficulty with. There are, of course, other strategies we
>> could employ, such as deferring the caching of record-dimension
>> coordinate values until they are first requested. If you would like
>> to send a sample, please upload it to: ftp.cgd.ucar.edu.
>> login: anonymous
>> password: your email address
>>
>> cd incoming
>> put <filename>
>>
>> You will need to send us (offline) the name of the file, since 'ls' is not
>> allowed on this ftp server. -dave
>>
>> On Oct 19, 2010, at 9:27 AM, Dennis Shea wrote:
>>> Again, I'm sure a core developer will reply later today.
>>>
>>> re:
>>> A "ncdump -h" command from
>>> the same machine does not take more than a split second.
>>>
>>> ===
>>> A netCDF file is written with all the dimension names, sizes,
>>> variable names, metadata, etc., at the very 'top' of the file.
>>> "ncdump -h" just reads that information, which is trivial; that
>>> is what you see when you run it. It does not read any of the
>>> values associated with the variables.
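>>>
>>> For instance (file name hypothetical):
>>>
>>> %> time ncdump -h huge.nc   (header only: a split second)
>>> %> time ncdump -c huge.nc   (header plus coordinate values: can take
>>>                              minutes with a large unlimited dimension)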
>>>
>>> On 10/19/10 9:17 AM, Bjoern Maronga wrote:
>>>> Thanks for the reply. I'm working on a supercomputer node; the data
>>>> is located on its data server. So I am indeed on a multi-user server,
>>>> but the problem does not arise from requesting memory! It comes
>>>> directly from "addfile", which should not request any memory, right?
>>>>
>>>> I added the wallClockElapseTime commands to my script, but I got this
>>>> weird message:
>>>>
>>>> (0) wallClockElapseTime: something wrong: no printed value
>>>>
>>>> This also happens for files that addfile opens within a second, so it
>>>> seems unrelated to the problem.
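>>>>
>>>> If wallClockElapseTime keeps failing, a locale-independent timer can
>>>> be built from epoch seconds; a sketch, assuming GNU "date +%s" is
>>>> available on the node:
>>>>
>>>> t0 = stringtointeger(systemfunc("date +%s"))
>>>> cdf_file = addfile(full_filename, "r")
>>>> t1 = stringtointeger(systemfunc("date +%s"))
>>>> print("addfile: " + (t1 - t0) + " s")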
>>>>
>>>> However, I added two systemfunc("date") commands before and after
>>>> "addfile", as well as around the statements that use the file pointer
>>>> to load data into arrays. Some code snippets follow:
>>>>
>>>> wcStrt = systemfunc("date")
>>>> print(wcStrt)
>>>> cdf_file = addfile(full_filename,"r")
>>>> wcStrt2 = systemfunc("date")
>>>> print(wcStrt2)
>>>> wallClockElapseTime(wcStrt, "addfile", 0)
>>>>
>>>> [...]
>>>>
>>>> wcStrt = systemfunc("date")
>>>> print(wcStrt)
>>>> field_ts = cdf_file->$struc_pars$(ts:te,0,ys:ye,xs:xe)
>>>> printVarSummary(field_ts)
>>>> wcStrt2 = systemfunc("date")
>>>> print(wcStrt2)
>>>> wallClockElapseTime(wcStrt, "load1", 0)
>>>>
>>>> delete(field_ts)
>>>>
>>>> wcStrt = systemfunc("date")
>>>> print(wcStrt)
>>>> field_ts = cdf_file->$struc_pars$(ts:te,1,ys:ye,xs:xe)
>>>> printVarSummary(field_ts)
>>>> wcStrt2 = systemfunc("date")
>>>> print(wcStrt2)
>>>> wallClockElapseTime(wcStrt, "load2", 0)
>>>>
>>>>
>>>> The resulting messages are:
>>>>
>>>> (0) Tue Oct 19 16:55:01 CEST 2010
>>>> (0) Tue Oct 19 16:56:36 CEST 2010
>>>> (0) wallClockElapseTime: something wrong: no printed value
>>>> (0) Tue Oct 19 16:56:36 CEST 2010
>>>> Variable: field_ts
>>>> Type: float
>>>> Total Size: 47239200 bytes
>>>> 11809800 values
>>>> Number of Dimensions: 3
>>>> Dimensions and sizes: [time | 1800] x [y | 81] x [x | 81]
>>>> Coordinates:
>>>> time: [1801..3600]
>>>> y: [ 2.5..1202.5]
>>>> x: [ 2.5..1202.5]
>>>> Number Of Attributes: 3
>>>> zu_3d : -2.5
>>>> long_name : pt
>>>> units : K
>>>> (0) Tue Oct 19 16:57:53 CEST 2010
>>>> (0) wallClockElapseTime: something wrong: no printed value
>>>> (0) Tue Oct 19 16:58:45 CEST 2010
>>>> Variable: field_ts
>>>> Type: float
>>>> Total Size: 47239200 bytes
>>>> 11809800 values
>>>> Number of Dimensions: 3
>>>> Dimensions and sizes: [time | 1800] x [y | 81] x [x | 81]
>>>> Coordinates:
>>>> time: [1801..3600]
>>>> y: [ 2.5..1202.5]
>>>> x: [ 2.5..1202.5]
>>>> Number Of Attributes: 3
>>>> units : K
>>>> long_name : pt
>>>> zu_3d : 7.5
>>>> (0) Tue Oct 19 16:58:50 CEST 2010
>>>> (0) wallClockElapseTime: something wrong: no printed value
>>>>
>>>>
>>>> In this case, "addfile" was comparatively fast at about 1.5 min.
>>>> Nevertheless, this is not an adequate amount of time just for reading
>>>> the metadata. An "ncdump -h" command from the same machine takes no
>>>> more than a split second. As you can also see, loading the data itself
>>>> is faster (1 min 17 s and 5 s).
>>>>
>>>> I talked to the supercomputer support, but they said there are no
>>>> file system problems at the moment. That is why I thought it might be
>>>> an NCL problem. But if the "addfile" command only reads metadata, I'm
>>>> at a loss.
>>>>
>>>> Regards,
>>>> Björn
>>>>
>>>>> I am sure one of the NCL core developers will respond
>>>>> later today.
>>>>>
>>>>> --
>>>>> I ran a quick and dirty test on a ** 17.9 GB ** netCDF file.
>>>>> Test script attached. I ran this several times on a multi-user
>>>>> system.
>>>>>
>>>>> %> uname -a
>>>>> Linux tramhill.cgd.ucar.edu 2.6.18-194.11.4.el5 #1 SMP Tue Sep 21
>>>>> 05:04:09 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
>>>>>
>>>>>
>>>>>
>>>>> The f = addfile("..", "r") was essentially 'instantaneous'.
>>>>>
>>>>> NOTE: *no* data is read by "addfile". This creates a
>>>>> reference (pointer) to the file.
>>>>>
>>>>> The taux=f->TAUX ; (1,2400,3600) was 'instantaneous'
>>>>>
>>>>> The temp=f->TEMP ; (1,42,2400,3600) took 2, 2, 3, 3 seconds
>>>>>
>>>>> Several things could affect your data input:
>>>>> (1)
>>>>> Are you on a multi-user system? When NCL is allocating memory
>>>>> for the input array, are other users also 'competing' for memory?
>>>>>
>>>>> (2)
>>>>> Is the data file on a local file system or on a (say) NFS-mounted
>>>>> file system? The latter case could affect the input data stream
>>>>> significantly. Some time ago I ran tests on an NFS-mounted file
>>>>> system. Around midnight, there was no timing difference between
>>>>> importing data from a locally mounted file and from the NFS-mounted
>>>>> file. However, in the middle of the day, the timings were very
>>>>> different.
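>>>>>
>>>>> (You can check where a file actually lives with, for example,
>>>>> "df -T /path/to/data"; path hypothetical. An "nfs" entry in the
>>>>> Type column indicates a network mount.)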
>>>>>
>>>>> ===
>>>>> Ultimately, (almost) all tools (NCL, IDL, Matlab, NCO, CDO,...)
>>>>> that read netCDF are using the standard Unidata software.
>>>>>
>>>>> Cheers!
>>>>>
>>>>> On 10/19/10 4:06 AM, Bjoern Maronga wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I have a problem using the NCL command "addfile" to open NetCDF
>>>>>> files that are about 4 GB or larger. Opening such a file takes about
>>>>>> 5 minutes (I verified that it is faster for smaller datasets). In
>>>>>> contrast, the subsequent commands that load data into arrays are
>>>>>> fast.
>>>>>>
>>>>>> To me it looks as if "addfile" actually loads the whole dataset into
>>>>>> memory, even though only a small part will be read into arrays
>>>>>> afterwards. From my point of view, addfile should only need to read
>>>>>> the metadata, rather like an "ncdump" command, which takes no more
>>>>>> time for these files than for smaller ones. I am very surprised by
>>>>>> this and wonder whether it makes sense at all. I have been working
>>>>>> with NCL for about two years now and never noticed this behavior
>>>>>> before. Has anything changed with the addfile command? Currently I
>>>>>> am using version 5.2.0.
>>>>>>
>>>>>> Best regards,
>>>>>> Björn Maronga
>
