Re: efficiency of addfiles

From: Dennis Shea <shea_at_nyahnyahspammersnyahnyah>
Date: Thu, 04 Sep 2008 07:22:27 -0600

Good day,

Perhaps one of the NCL core developers will
answer more completely.

It sounds like you are accessing a dataset via a network.

[1]
At NCAR, we have many file systems cross-mounted via NFS.
Depending upon the amount of 'traffic' on the network,
the actual wall clock times can easily vary by factors
of 2-4 during peak periods. For timing purposes, the
files were copied to a local [i.e., hard-wired] disk. The
timing differences between local and network file
access can be significant.

Have you run the script at non-peak times?

[2]
NCL uses the Unidata netCDF C interface to access the data.
Unidata has stated that the software is designed more
for robustness than efficiency. Accessing 'point'
data is not particularly efficient.
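
One thing that may be worth timing, purely as an untested
sketch: read one contiguous block that spans the wanted
points, then pick the points out in memory. Whether this
beats the vector-subscript read depends on the file layout
and the network, and the block (about 310 MB for
16000 x 41 x 118 floats) has to fit in memory. The path
below is hypothetical; the variable name and indices are
taken from your script.

```ncl
begin
  ; Hypothetical path; use your own file list.
  files = systemfunc("ls /path/to/data/*.nc")
  f     = addfiles(files, "r")

  kz = (/25,33,65/)      ; zu indices from the original script
  jy = (/47,135,164/)    ; y indices from the original script

  ; One contiguous hyperslab covering all requested points.
  ublk = f[:]->ui_yz(:, kz(0):kz(2), jy(0):jy(2), 0)

  ; Select the wanted points in memory, using indices
  ; relative to the start of the block.
  u = ublk(:, kz-kz(0), jy-jy(0))
  delete(ublk)
end
```

If the contiguous read is not faster, that would suggest
the time is dominated by the transfer itself rather than
by the point-wise access pattern.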

[3]
As a test, you could copy several of the files to a local
disk, time the script on them, and then compare with the
network times for the same subset of files.
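
A minimal sketch of such a comparison, reusing the timing
calls from your script. The two directory paths are
hypothetical placeholders, and it assumes the first two
files have already been copied to the local directory
under the same names.

```ncl
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"

begin
  ; Hypothetical locations; replace with your own paths.
  netFiles = systemfunc("ls /net/archive/run01/*.nc")    ; network copies
  locFiles = systemfunc("ls /scratch/local/run01/*.nc")  ; local copies

  ; Open and read the same subset over the network ...
  wcNet = systemfunc("date")
  fNet  = addfiles(netFiles(0:1), "r")
  uNet  = fNet[:]->ui_yz(:, (/25,33,65/), (/47,135,164/), 0)
  wallClockElapseTime(wcNet, "network: addfiles + read u", 0)

  ; ... and from the local disk.
  wcLoc = systemfunc("date")
  fLoc  = addfiles(locFiles(0:1), "r")
  uLoc  = fLoc[:]->ui_yz(:, (/25,33,65/), (/47,135,164/), 0)
  wallClockElapseTime(wcLoc, "local: addfiles + read u", 0)
end
```

Running it more than once, at different times of day,
should give a feel for how much of the elapsed time is
due to the network.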

Regards
Dennis Shea

Marcus Letzel wrote:
> Dear All,
>
> The snippets of an NCL script and its output below show that it takes
> NCL quite some time to execute the "addfiles" command on a set of four
> NetCDF files with a total size of 12.5 GB. The bandwidth of file data
> access is limited by a 100 Mbps network connection, and the data
> transfer rate monitored during execution of the NCL script is
> consistently 12 MB/s. To access 12.5 GB @ 12 MB/s requires about
> 1100 s. Each of the following four commands requires about 600 seconds
> (cf. script output below):
> f = addfiles( files, "r" )
> u = f[:]->ui_yz( :, (/25,33,65/), (/47,135,164/), 0 )
> v = f[:]->vi_yz( :, (/25,33,65/), (/47,135,164/), 0 )
> w = f[:]->wi_yz( :, (/25,33,65/), (/47,135,164/), 0 )
>
> Question 1:
> What does "addfiles" do that requires so much time? Can I somehow avoid
> this?
>
> Question 2:
> I understand that reading a subarray of dimension 16000 x 3 x 3 from
> the files is not a very CPU-time-efficient method, because it saves
> less than 50% of the CPU time that would be required to read one full
> array (2.5 GB).
> Yet, does someone know how I could improve the CPU-time efficiency of
> my data input here?
>
> Kind wishes,
> Marcus
>
>
> NCL code snippet
> ================
> begin
>
> [...]
>
> wc0 = systemfunc("date")
>
> ;
> ; open input file(s)
> f = addfiles( files, "r" )
> ;
> ; data input
> wallClockElapseTime(wc0,"initialization",0)
> wc1 = systemfunc("date")
>
> printFileVarSummary( f[0], "ui_yz" )
> wallClockElapseTime(wc1,"FileVarSummary(ui_yz)",0)
> wc2 = systemfunc("date")
>
> u = f[:]->ui_yz( :, (/25,33,65/), (/47,135,164/), 0 )
> wallClockElapseTime(wc2,"read u",0)
> printVarSummary( u )
> wc3 = systemfunc("date")
>
> v = f[:]->vi_yz( :, (/25,33,65/), (/47,135,164/), 0 )
> wallClockElapseTime(wc3,"read v",0)
> wc4 = systemfunc("date")
>
> w = f[:]->wi_yz( :, (/25,33,65/), (/47,135,164/), 0 )
> wallClockElapseTime(wc4,"read w",0)
> wc5 = systemfunc("date")
>
> z = f[0]->zu
> t = f[:]->time
> wallClockElapseTime(wc5,"read z,time",0)
> wc6 = systemfunc("date")
>
> [...]
>
> end
>
>
> NCL output snippet
> ==================
>
> [...]
>
> (0)
> =====> Wall Clock Elapsed Time: initialization: 565 seconds <=====
>
>
>
> Variable: ui_yz
> Type: float
> Total Size: 11925792 bytes
> 2981448 values
> Number of Dimensions: 4
> Dimensions and sizes: [time | 72] x [zu | 129] x [y | 321] x [x_yz | 1]
> Coordinates:
> time: [48005.23199999911..48431.59199999907]
> zu: [ -1.. 255]
> y: [ 0.. 640]
> x_yz: [ 48.. 48]
> Number of Attributes: 2
> long_name : ui_yz
> units : m/s
>
> (0)
> =====> Wall Clock Elapsed Time: FileVarSummary(ui_yz): 0 seconds <=====
>
> (0)
> =====> Wall Clock Elapsed Time: read u: 621 seconds <=====
>
>
>
> Variable: u
> Type: float
> Total Size: 576000 bytes
> 144000 values
> Number of Dimensions: 3
> Dimensions and sizes: [16000] x [3] x [3]
> Coordinates:
> (0)
> =====> Wall Clock Elapsed Time: read v: 627 seconds <=====
>
> (0)
> =====> Wall Clock Elapsed Time: read w: 629 seconds <=====
>
>
> [...]
>
> _______________________________________________
> ncl-talk mailing list
> ncl-talk_at_ucar.edu
> http://mailman.ucar.edu/mailman/listinfo/ncl-talk
Received on Thu Sep 04 2008 - 07:22:27 MDT