Re: [spam] suggestions for reducing the run time for the script (ascii writing)i

From: Dennis Shea <shea_at_nyahnyahspammersnyahnyah>
Date: Sun, 06 Sep 2009 08:08:47 -0600

[1]
In interpreted languages like NCL [IDL, Matlab, ...] it is
best to minimize use of 'do' loops. This is done via array
syntax or array functions.

[2]
For *every* iteration [innermost time loop], you are doing a
system call to append a value

    system("echo " + data_app+ ">>" + fNameO)

This is *very* inefficient

[3]
The "data(npt) = " sequence for every iteration is also inefficient.

[4]
Attached is some code which will probably work much more efficiently.
You will have to fill-in/change where necessary.

Good luck

Anandhi Aavudai wrote:
> Dear NCL users,
>
> I have written a NCL script in version 5.1.1. It uses Cygwin in windows.
>
> The script reads IPCC data (for the whole world, at daily time step) in netcdf format and writes a region (Eastern North America bounded by lat=25 to 50 deg. N; lon=275 to 300 deg E) to ascii file.
>
> The ascii file has matrix comprising of coumns the time-step, lat, lon & value of the variable. The sample output file is
>
> time lat lon var. val(in this case precipitation)
> "0000058804.5 26.5 275.6 0.06
> 0000058805.5 26.5 275.6 0.38"
>
> The script runs without error., however it takes a very long time (~10 hrs) for a 10 yr daily data (for 1322*9*9 lines). I think the 4 do loops is reducing the speed. The 4 loops: one loop for the 145 files i need to read, the rest 3 are for extracting the value lat, lon & time.
>
> The script is pasted below. Can this script be improved for efficiency to run faster? I am new to ncl . Any suggestion in this direction would be very useful.
>
> ;------program to subset a netcdf file for ENA----
> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"
> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"
> load "$NCARG_ROOT/lib/ncarg/nclscripts/wrf/WRFUserARW.ncl"
> begin
>
> ;=========================================;
> ; step 1: get the list of netcdf files from a directory
> ;=========================================;
>
> dir = "E:/IPCC_data_copy/20c3m_atm_da/pr"
>
> dblquote = integertochar(34)
>
> fili = systemfunc("find " + dir + "/* -type f -name " + dblquote + "*.nc" + dblquote + " -print")
>
> ;t(fili)
> nfili = dimsizes(fili)
> ;print(nfili)
>
> ;=========================================;
> ; step 2: loop to read one netcdf file at a time from a list of files
> ;=========================================;
>
> do nf=0,nfili-1
> in= addfile (fili(nf), "r")
> orig = in->pr
> time = in->time
> lat = in->lat
> lon = in->lon
> print(fili(nf))
>
> ;=========================================;
> ; step 3: create output file name(fnameo);
> ;=========================================;
>
> fout=new(3,"string") ; defing fout ass an array of strings
> fout(0)="E:/IPCC_data_extract_ENA/20c3m_atm_da" ; adding a folder name
> fex=stringtochar(fili(nf));
> fout(1)=chartostring(fex(30:dimsizes(fex)-5)) ; extracting a portion of the file name from input file
> fout(2)="_ENA_out.txt"
> fNameO=str_concat(fout)
> print(fNameO)
>
> ;=========================================;
> ; step 5: for a specified region get the list of lat (lat_v),
> ; lon(lon_v); its dimension values(lon_d),(lat_d)
> ;=========================================;
> lats = 25 ; lower limit of lat
> latn = 50 ; upper limit of lon
> lonw = 275 ; left limit of lon
> lone = 300 ; right limit of lon
> lat_v = lat(ind(lat.gt.lats .and. lat.lt.latn))
> lat_d = ind(lat.gt.lats .and. lat.lt.latn)
> nlat= dimsizes(lat_d)
>
> lon_v = lon(ind(lon.gt.lonw .and. lon.lt.lone))
> lon_d = ind(lon.gt.lonw .and. lon.lt.lone)
> nlon=dimsizes(lon_d)
> ntim=dimsizes(time)
>
> ;=========================================;
> ; step 6: subset the value of variable for a specified region get
> ; the list of lat (lat_v),
> ; lon(lon_v); its dimension values(lon_d),(lat_d)
> ;=========================================;
>
> orig_ss = orig(:,{lat_v(0):lat_v(nlat-1)},{lon_v(0):lon_v(nlon-1)})
>
> ;============================================================================;
> ; step 7: write the lat, lon, time, var values for each output file (fName0)
> ; 3 loops to read every value of lat, lon, time & print to ascii file
> ;============================================================================;
>
> ; printVarSummary(in) ; prints the header information
> data=new(1,"string")
> do nl=0,nlat-1
> do ml=0,nlon-1
> do nt=0,ntim-1
> npt =0
> data(npt) = sprintf("%0.5i", (npt+1) )
> data(npt) = data(npt) + sprintf("%7.1f ",time(nt))
> data(npt) = data(npt) + sprintf("%7.1f ",lat_v(nl))
> data(npt) = data(npt) + sprintf("%4.1f ",lon_v(ml))
> data(npt) = data(npt) + sprintf("%6.2f ", orig_ss(nt,nl,ml)*86400)
> data_app= data(npt)
> ;filename = "C:/anandhi/Part2_GCM_evaluation_methods/analysis_GCM_evaluation/foo_ra1.txt"
> system("echo " + data_app+ ">>" + fNameO)
>
> end do
> end do
> end do
> ;asciiwrite (fNameO , data)
>
> end do
>
> end
> ;-----------------------------
> thank you all in advance for the support.
>
> regards
> anandhi
> _______________________________________________
> ncl-talk mailing list
> List instructions, subscriber options, unsubscribe:
> http://mailman.ucar.edu/mailman/listinfo/ncl-talk

load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"

[your code to get files]

    diro = "./" ; output directory for ascii files

    latS = 25.
    latN = 50.
    lonL = 275.
    lonR = 300.

    do nf=0,nfili-1
       f = addfile (diri+fili(nf),"r")
                                                ; read subset only
       x = f->XXX(:,{latS:latN},{lonL:lonR}) ; (time,lat,lon)
   
       dimx = dimsizes( x )
       ntim = dimx(0)
       nlat = dimx(1)
       mlon = dimx(2)
   
       if (nf.eq.0) then
           npts = ntim*nlat*mlon
           data = new ( npts, "string" ) ; preallocate
           Time = sprintf("%12.1f" , x&time) ; do once: always the same
       end if
   
       nStrt = 0
       nLast = ntim-1
   
       wcStrt= systemfunc("date")
   
       do nl=0,nlat-1
         do ml=0,mlon-1
            LatLon = sprintf("%7.1f ",x&lat(nl)) + sprintf("%6.1f" , x&lon(ml))
            data(nStrt:nLast) = sprinti("%0.5i" , ispan((nStrt+1),(nLast+1),1) ) \
                              + Time + LatLon + sprintf("%8.2f",x(:,nl,ml))
            nStrt = nStrt + ntim
            nLast = nLast + ntim
         end do
       end do
   
       wallClockElapseTime(wcStrt, "Ascii: "+fili(nf) ,0) ; time loop
   
       sfx = get_file_suffix(fili(nf))
       filo = sfx_at_fBase + ".ascii" ; create ne name
   
       system("/bin/rm -f "+diro+filo)
       asciiwrite (diro+filo, data)
   end do ; nf loop

   end

_______________________________________________
ncl-talk mailing list
List instructions, subscriber options, unsubscribe:
http://mailman.ucar.edu/mailman/listinfo/ncl-talk
Received on Sun Sep 06 2009 - 08:08:47 MDT

This archive was generated by hypermail 2.2.0 : Tue Sep 08 2009 - 11:49:50 MDT