Re: the addfile command is very slow

From: David Brown <dbrown_at_nyahnyahspammersnyahnyah>
Date: Tue Oct 19 2010 - 15:03:07 MDT

Hi Bjoern,

Actually, the information that NCL gathers when it opens a NetCDF file is more like "ncdump -c" than "ncdump -h": NCL gathers and caches coordinate-variable values along with attribute names and values, variable names and types, and dimension names and sizes. Normally the time required to retrieve coordinate values is minuscule. However, if the file has an unlimited dimension, the coordinate values for that dimension are spread across all records in the file, and if the file is large it can take some time to extract the single required coordinate value from each record. Of course, once read, this data remains available for the duration of the execution and helps subsequent operations complete faster.
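
As a quick check, you can ask NCL which of a file's dimensions, if any, is unlimited; a minimal sketch (the filename is just a placeholder):

f      = addfile("yourfile.nc","r")    ; placeholder filename
dnames = getvardims(f)                 ; all dimension names defined in the file
do i = 0, dimsizes(dnames) - 1
  if (isunlimited(f, dnames(i))) then
    print(dnames(i) + " is the unlimited (record) dimension")
  end if
end do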

So I am guessing that your files have an unlimited dimension. Is this correct? We are always interested in improving NCL's performance, and if possible we would like a sample of the kind of file you are having trouble with.
There are, of course, other strategies we could employ, such as deferring the caching of the record-dimension coordinate values until they are first requested. If you would like to send a sample, please upload it to:
ftp.cgd.ucar.edu
login: anonymous
password: your email address

cd incoming
put <filename>

You will need to send us (offline) the name of the file, since 'ls' is not allowed on this ftp server.
 -dave

On Oct 19, 2010, at 9:27 AM, Dennis Shea wrote:

> Again, I'm sure a core developer will reply later today.
>
> re:
> A "ncdump -h" command from
> the same machine does not take more than a split second.
>
> ===
> A netCDF file is written with all the dimension names, sizes,
> variable names, meta data, etc at the very 'top of the file'.
> The "ncdump -h" is just reading that information which is
> trivial. That is what you see when you do the ncdump -h.
> It 'ncdump -h' is not reading any values associated
> with the variables.
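>
> For illustration ("file.nc" is a placeholder):
>
>   ncdump -h file.nc   # header only: dimensions, variables, attributes
>   ncdump -c file.nc   # header plus the coordinate-variable values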
>
>
> On 10/19/10 9:17 AM, Bjoern Maronga wrote:
>> Thanks for the reply. I'm working on a supercomputer node; the data is located
>> on its data server. So I am indeed on a multi-user server, but the problem
>> does not arise when memory is requested! It comes directly from
>> "addfile", which should not request any memory, right?
>>
>> I added the wallClockElapseTime commands to my script, but I got this odd
>> message:
>>
>> (0) wallClockElapseTime: something wrong: no printed value
>>
>> This also happens for files that addfile opens within a second, so it seems
>> unrelated to the problem.
>>
>> However, I added two systemfunc("date") calls before and after "addfile", as
>> well as around the statements that use the file pointer to load data into
>> arrays. Some code snippets follow:
>>
>> wcStrt = systemfunc("date")
>> print(wcStrt)
>> cdf_file = addfile(full_filename,"r")
>> wcStrt2 = systemfunc("date")
>> print(wcStrt2)
>> wallClockElapseTime(wcStrt, "addfile", 0)
>>
>> [...]
>>
>> wcStrt = systemfunc("date")
>> print(wcStrt)
>> field_ts = cdf_file->$struc_pars$(ts:te,0,ys:ye,xs:xe)
>> printVarSummary(field_ts)
>> wcStrt2 = systemfunc("date")
>> print(wcStrt2)
>> wallClockElapseTime(wcStrt, "load1", 0)
>>
>> delete(field_ts)
>>
>> wcStrt = systemfunc("date")
>> print(wcStrt)
>> field_ts = cdf_file->$struc_pars$(ts:te,1,ys:ye,xs:xe)
>> printVarSummary(field_ts)
>> wcStrt2 = systemfunc("date")
>> print(wcStrt2)
>> wallClockElapseTime(wcStrt, "load2", 0)
>>
>>
>> The resulting messages are:
>>
>> (0) Tue Oct 19 16:55:01 CEST 2010
>> (0) Tue Oct 19 16:56:36 CEST 2010
>> (0) wallClockElapseTime: something wrong: no printed value
>> (0) Tue Oct 19 16:56:36 CEST 2010
>> Variable: field_ts
>> Type: float
>> Total Size: 47239200 bytes
>> 11809800 values
>> Number of Dimensions: 3
>> Dimensions and sizes: [time | 1800] x [y | 81] x [x | 81]
>> Coordinates:
>> time: [1801..3600]
>> y: [ 2.5..1202.5]
>> x: [ 2.5..1202.5]
>> Number Of Attributes: 3
>> zu_3d : -2.5
>> long_name : pt
>> units : K
>> (0) Tue Oct 19 16:57:53 CEST 2010
>> (0) wallClockElapseTime: something wrong: no printed value
>> (0) Tue Oct 19 16:58:45 CEST 2010
>> Variable: field_ts
>> Type: float
>> Total Size: 47239200 bytes
>> 11809800 values
>> Number of Dimensions: 3
>> Dimensions and sizes: [time | 1800] x [y | 81] x [x | 81]
>> Coordinates:
>> time: [1801..3600]
>> y: [ 2.5..1202.5]
>> x: [ 2.5..1202.5]
>> Number Of Attributes: 3
>> units : K
>> long_name : pt
>> zu_3d : 7.5
>> (0) Tue Oct 19 16:58:50 CEST 2010
>> (0) wallClockElapseTime: something wrong: no printed value
>>
>>
>> In this case, "addfile" was comparably fast ~1.5min. Nevertheless, this is not
>> an adaquate time period for reading the metadata. A "ncdump -h" command from
>> the same machine does not take more than a split second. As you can also see,
>> loading of the data is faster (1.17min and 5s).
>>
>> I talked to the supercomputer support, but they said there were no
>> file system problems at the moment. That is why I thought it might be an
>> NCL problem. But if the "addfile" command only reads metadata, I'm at a
>> loss.
>>
>> Regards,
>> Björn
>>
>>
>>> I am sure one of the NCL core developers will respond
>>> later today.
>>>
>>> --
>>> I ran a quick and dirty test on a ** 17.9 GB ** netCDF file.
>>> Test script attached. I ran this several times on a multi-user
>>> system.
>>>
>>> %> uname -a
>>> Linux tramhill.cgd.ucar.edu 2.6.18-194.11.4.el5 #1 SMP Tue Sep 21
>>> 05:04:09 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>>
>>>
>>> The f = addfile("..", "r") was essentially 'instantaneous'.
>>>
>>> NOTE: *no* data is read by "addfile". This creates a
>>> reference (pointer) to the file.
>>>
>>> The taux=f->TAUX ; (1,2400,3600) was 'instantaneous'
>>>
>>> The temp=f->TEMP ; (1,42,2400,3600) took 2, 2, 3, 3 seconds
>>>
>>> Several things could affect your data input:
>>> (1)
>>> Are you on a multi-user system? When NCL is allocating memory
>>> for the input array, are other users also 'competing' for memory?
>>>
>>> (2)
>>> Is the data file on a local file system or on a (say) NFS-
>>> mounted file system? The latter case can affect the input
>>> data stream significantly. Some time ago I ran tests
>>> on an NFS-mounted file system. Around midnight, there was
>>> no timing difference between importing data from a locally
>>> mounted file and from the NFS-mounted file. However, in the
>>> middle of the day, the timings were very different.
>>>
>>> ===
>>> Ultimately, (almost) all tools (NCL, IDL, Matlab, NCO, CDO, ...)
>>> that read netCDF use the standard Unidata netCDF library.
>>>
>>> Cheers!
>>>
>>> On 10/19/10 4:06 AM, Bjoern Maronga wrote:
>>>> Hello,
>>>>
>>>> I have a problem using the NCL command "addfile" to open NetCDF
>>>> files that are about 4 GB or larger. The command takes about 5
>>>> minutes to complete (I verified that smaller datasets open much
>>>> faster). In contrast, the subsequent commands that load data
>>>> into arrays are fast.
>>>>
>>>> To me it looks as if "addfile" actually loads the entire dataset
>>>> into memory, even though only a small part of it will be read into
>>>> arrays afterwards. In my understanding, addfile only needs to read the
>>>> metadata, much like an "ncdump" command, which should take no more
>>>> time for these files than for smaller ones. I am very surprised by this
>>>> finding and wonder whether it makes sense at all. I have been working
>>>> with NCL for about two years now and never noticed this behavior before.
>>>> Has anything changed with the addfile command? Currently, I am using version 5.2.0.
>>>>
>>>> Best regards,
>>>> Björn Maronga
>>
>>
>>

_______________________________________________
ncl-talk mailing list
List instructions, subscriber options, unsubscribe:
http://mailman.ucar.edu/mailman/listinfo/ncl-talk