Re: Writing time series ascii file for each grid cell - Is there a faster way?

From: Dennis Shea <shea_at_nyahnyahspammersnyahnyah>
Date: Wed Jan 25 2012 - 15:18:07 MST

Hi Bridget,

Wow ... 273252 files!
--------
As noted by DaveA, the creation of the 'dat' string for each grid
point could be VERY slow. If nothing else,

dat = sprinti("%0.4i",t(1:,0)) \
     + sprinti("%7.2i",t(1:,1)) \
     + sprinti("%7.2i",t(1:,2)) \
     + sprintf("%10.5f",pr(:,j)) \
     + sprintf("%10.5f",tmax(:,j)) \
     + sprintf("%10.5f",tmin(:,j)) \
     + sprintf("%10.5f",wind(:,j)) \
     + sprintf("%10.5f",sph(:,j)) \
     + sprintf("%10.5f",srad(:,j))

would be faster than multiple appends to the 'dat' variable.

-------

Use of 'write matrix' would require creation of a 'super array'
to handle the times and data values. Doable, but with the potential of
150 years of daily data, I think a simple fortran code would be
more efficient.

Attached you will find a test NCL driver script which call
a fortran subroutine. This is very fast.

[1]
  Create a directory called (say) TXT (or ascii/data )

[2]
%> WRAPIT write6var.f

[3]
Change data directory pointers as necessary, then

%> ncl grdpt.ncl

This creates ascii (text) files in the TEX directory with names like

bt.2503_23403.txt
bt.2503_23409.txt
bt.2509_23403.txt
bt.2509_23409.txt
[etc]

You can alter the file names in the NCL and/or fortran code.
The 1st 4 numerical characters refer to the latitude in hundredths.
Hence, 2503 => 25.03 degrees_north

The second set of 5 numerical characters refer to the longitude in
hundredths. Hence, 23403 => 234.03

==
I have placed the files at:
   http://www.cgd.ucar.edu/~shea/grdpt.ncl
   http://www.cgd.ucar.edu/~shea/write6var.f

==

If you want to test the fortran code to just use a few grid points,

Change

       do nl=1,nlat

        do ml=1,mlon

to, say

       do nl=1,2 ! these can be anything

        do ml=1,2

==
Good Luck
D

On 01/19/2012 04:26 PM, Dave Allured wrote:
> For starters, try switching from asciiwrite to write_matrix. See
> example 6 on the write_matrix doc page. I believe that multiple "dat"
> assignments and sprintf calls are extremely consumptive of CPU time
> and memory in NCL, for large arrays; and this is within the innermost
> loop where it really matters.
>
> For write_matrix, you will need an intermediate array ("p" in example
> 6) that contains all the numeric data for the rows and columns to be
> output. You will need to convert integers to floats for some of the
> input data; date/time values in particular.
>
> You can save more time and memory if you make this a 3-D array rather
> than 2-D, and load it directly from the file input statements which
> are in your outer loop, omitting the other input arrays pr, sph, srad,
> etc. Also avoid deleting the intermediate array if you don't need to,
> because it does not change shape between iterations.
>
> --Dave
>
> On Thu, Jan 19, 2012 at 12:16 PM, Bridget Thrasher
> <bthrasher@climatecentral.org> wrote:
>> I need to take many years of daily data for 6 variables and output their
>> time series together to individual ASCII files for each of 273252 grid
>> cells. I have code that works (see below), but when testing with a 32-year
>> series, it's looking to take 3+ weeks to finish. I will soon need to do this
>> many times over a 150-year series. Yikes! Am I going about this the wrong
>> way?
>>
>> -Bridget
>>
>> begin
>>
>> ls_pr = systemfunc("ls 6km/pr/*.nc")
>> ls_sph = systemfunc("ls 6km/sph/*.nc")
>> ls_srad = systemfunc("ls 6km/srad/*.nc")
>> ls_tmax = systemfunc("ls 6km/tasmax/*.nc")
>> ls_tmin = systemfunc("ls 6km/tasmin/*.nc")
>> ls_wind = systemfunc("ls 6km/wind/*.nc")
>>
>> f_pr = addfiles(ls_pr,"r")
>> f_sph = addfiles(ls_sph,"r")
>> f_srad = addfiles(ls_srad,"r")
>> f_tmax = addfiles(ls_tmax,"r")
>> f_tmin = addfiles(ls_tmin,"r")
>> f_wind = addfiles(ls_wind,"r")
>>
>> lats = f_pr[0]->lat
>> lons = f_pr[0]->lon
>> times = f_pr[:]->time
>> t = ut_calendar(times,-5)
>> ts = dimsizes(times)
>>
>> do i = 0,dimsizes(lats)-1
>> pr = f_pr[:]->pr(1:,i,:)
>> sph = f_sph[:]->sph(1:,i,:)
>> srad = f_srad[:]->srad(1:,i,:)
>> tmax = f_tmax[:]->tasmax(1:,i,:)
>> tmin = f_tmin[:]->tasmin(1:,i,:)
>> wind = f_wind[:]->wind(1:,i,:)
>> do j = 0,dimsizes(lons)-1
>> if
>> (ismissing(pr(0,j)).or.ismissing(sph(0,j)).or.ismissing(srad(0,j)).or.ismissing(tasmax(0,j)).or.ismissing(tasmin(0,j)).or.ismissing(wind(0,j)))
>> then
>> print("No data at "+lats(i)+"_"+(lons(j)-360.))
>> else
>> print("Making "+"data_"+lats(i)+"_"+(lons(j)-360.))
>> fname = "ascii/data_"+lats(i)+"_"+(lons(j)-360.)
>> dat = new((/ts-1/),"string","")
>>
>> dat = sprinti("%0.4i",t(1:,0))
>> dat = dat + sprinti("%7.2i",t(1:,1))
>> dat = dat + sprinti("%7.2i",t(1:,2))
>> dat = dat + sprintf("%10.5f",pr(:,j))
>> dat = dat + sprintf("%10.5f",tmax(:,j))
>> dat = dat + sprintf("%10.5f",tmin(:,j))
>> dat = dat + sprintf("%10.5f",wind(:,j))
>> dat = dat + sprintf("%10.5f",sph(:,j))
>> dat = dat + sprintf("%10.5f",srad(:,j))
>>
>> asciiwrite(fname,dat)
>> delete(dat)
>> end if
>> end do
>> delete(pr)
>> delete(sph)
>> delete(srad)
>> delete(tmax)
>> delete(tmin)
>> delete(wind)
>>
>> end do
>>
>> end
>>
>>
>> --
>> Bridget Thrasher, PhD
>> Independent Contractor, Research Scientist
>>
>>
>>
>> _______________________________________________
>> ncl-talk mailing list
>> List instructions, subscriber options, unsubscribe:
>> http://mailman.ucar.edu/mailman/listinfo/ncl-talk
>>
> _______________________________________________
> ncl-talk mailing list
> List instructions, subscriber options, unsubscribe:
> http://mailman.ucar.edu/mailman/listinfo/ncl-talk

_______________________________________________
ncl-talk mailing list
List instructions, subscriber options, unsubscribe:
http://mailman.ucar.edu/mailman/listinfo/ncl-talk
Received on Thu Jan 26 03:18:20 2012

This archive was generated by hypermail 2.1.8 : Thu Feb 02 2012 - 03:10:31 MST