Re: most efficient way to manipulate a large data set with NCL

From: Serra, Yolande L - (serra) <serra_at_nyahnyahspammersnyahnyah>
Date: Wed Sep 12 2012 - 17:10:26 MDT

Yes, Dennis, thank you. I am wondering which of these options is the most efficient computationally in NCL. Guess I'll just have to test each of them. I have several seasons to process, so I was hoping to avoid the empirical approach ;)

Yolande Serra
serra@email.arizona.edu

On Sep 12, 2012, at 1:52 PM, Dennis Shea wrote:

[1]

Not sure of the details of your files. Do they already
have a '.grb' extension? If not, you can append it in the script,
as is done below. Otherwise, remove the references to ".grb" below.

[2]

Option 1: Something like the following. The 'setfileoption'
         forces a time dimension if there is none.

 diri  = ".../"
 fRoot = "FOO..."                      ; grib file root
 setfileoption("grb","SingleElementDimensions","Initial_time")

 fili  = systemfunc("cd "+diri+" ; ls "+fRoot+"*")   ; file names only, no path
 print(fili)

 f     = addfiles(diri+fili+".grb", "r")   ; ".grb" tells NCL the file type

 x     = f[:]->?????                   ; use default 'cat' mode
 printVarSummary(x)
-----------

Option 2: don't know if what your student says is true.
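
If you want to time it yourself, an untested sketch of the loop version might
look like the following. Assumptions: "VAR" is a placeholder for the real
variable name, the 'setfileoption' call from Option 1 is NOT in effect (so each
file returns a 3-D variable), and the files need ".grb" appended as above.

```ncl
 ; Untested sketch of Option 2: explicit loop over files with addfile.
 ; "VAR" is a placeholder; the variable name is not given in this thread.
 diri  = ".../"                               ; same elided path as Option 1
 fRoot = "FOO..."                             ; grib file root
 fili  = systemfunc("cd "+diri+" ; ls "+fRoot+"*")
 nfil  = dimsizes(fili)

 ; read the first file only to learn the shape and type of one time step
 f0    = addfile(diri+fili(0)+".grb", "r")
 x0    = f0->VAR                              ; assumed 3-D: (lev, lat, lon)
 dimx  = dimsizes(x0)

 ; allocate the full 4-D array once, rather than growing it in the loop
 x     = new( (/nfil, dimx(0), dimx(1), dimx(2)/), typeof(x0) )
 if (isatt(x0,"_FillValue")) then
    x@_FillValue = x0@_FillValue              ; match the file's missing value
 end if

 tStart = toint(systemfunc("date +%s"))       ; wall-clock seconds
 do nf=0,nfil-1
    f = addfile(diri+fili(nf)+".grb", "r")
    x(nf,:,:,:) = (/ f->VAR /)                ; (/.../) copies values only
 end do
 print("loop read took "+(toint(systemfunc("date +%s"))-tStart)+" s")
```

The same two 'date' lines can be wrapped around the 'addfiles' read in
Option 1 to compare the two on a handful of files before committing to
all of them.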

-----------
Option 3: Create a netCDF file of each grib.
Maybe you need v6.1.0-beta for this? Not sure.

http://www.ncl.ucar.edu/Document/Tools/ncl_convert2nc.shtml

[a]
ncl_convert2nc FOO* -e grb -itime -u initial_time0_hours -U time

[b]
ncrcat FOO*nc cat.FOO.nc

Then work on the one file.
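
A minimal, untested sketch of that last step. "VAR" is again a placeholder;
the variable names ncl_convert2nc writes into the netCDF file may differ from
what you expect, so check them first (e.g. with ncl_filedump or ncdump -h).

```ncl
 ; Untested sketch: read the concatenated file produced by step [b].
 ; "VAR" is a placeholder for the actual variable name in cat.FOO.nc.
 f = addfile("cat.FOO.nc", "r")
 x = f->VAR                  ; expected 4-D, with time as the leftmost dimension
 printVarSummary(x)          ; confirm dimension names and sizes before going on
```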

On 9/12/12 1:42 PM, Serra, Yolande L - (serra) wrote:
Hello -

I'm not a "big" NCL user and am in need of some advice on reading a
large data set. What is the most efficient way to read in a 4-D array,
manipulate the data in the time and level dimensions and re-write to a
new file? The catch is that the variables are stored as 3-D, with each
time step being stored as a separate grib file at the moment.

So far as I can tell my options are:

1) use *addfiles* and read each individual time into the code using
"join" to create the new time dimension. The variable would then be
4-D and could be manipulated inside NCL. The problem is this seems to
be extremely slow just reading the files and loading the variable.

2) use a loop over time and load in files one at a time using *addfile*
- one student of mine claimed this is faster than using *addfiles*. Is
this true?

3) use *ncl_convert2nc* to convert the grib files to netcdf, concatenate
to one file with a time dimension and read that file into NCL. Is this
faster than using *addfiles*?

Why does it take NCL so much longer to read a grib or netcdf file than
Matlab?

The final 4-D variable is 489 Times x 60 Levels x 85 Lats x 512 Lons.
The manipulations I need to perform are to interpolate from model levels
to pressure levels and then low-pass filter in time. This part of the code
seems to be working fine; the slow part is loading the data.
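
For scale, a rough size estimate, assuming the values are held as 4-byte
floats (an assumption; the type is not stated here): 489 x 60 x 85 x 512 is
about 1.28 billion values, or roughly 5.1 GB for a single copy of the array.
A small sketch of the arithmetic in NCL:

```ncl
 ; Size estimate for the final 4-D array, assuming 4-byte floats.
 ; todouble avoids overflowing NCL's default 32-bit integer arithmetic.
 nval   = todouble(489) * 60 * 85 * 512       ; total number of values
 nbytes = nval * 4                            ; 4 bytes per float (assumed)
 print("values = "+nval+"   size (GB) = "+nbytes/1.0e9)
```

Any intermediate copy made during interpolation or filtering adds a comparable
amount, so memory use may matter here as well as read time.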

Thanks for any help, Yolande

Yolande Serra
serra@email.arizona.edu

_______________________________________________
ncl-talk mailing list
List instructions, subscriber options, unsubscribe:
http://mailman.ucar.edu/mailman/listinfo/ncl-talk

Received on Wed Sep 12 17:10:40 2012

This archive was generated by hypermail 2.1.8 : Fri Sep 21 2012 - 16:22:30 MDT