Re: sorting out the files

From: Jonathan Vigh <jvigh_at_nyahnyahspammersnyahnyah>
Date: Tue Apr 27 2010 - 16:36:00 MDT

Hi Prabhakar,
    If you have a small number of files, then David Brown's solution is
the easiest. But if you have many files, or you are not able to
physically rename them (for example, if they are on a remote server or
you don't have write permission), then you have to be clever.

What you can do is read in the filenames, parse them into their
separate parts, and rewrite the file names in a way that can be sorted by
NCL's string sort function (sqsort). If you first attach the original
filenames as a coordinate array to the list of new filenames, then when
you sort the rewritten filenames, the attached array of old names is
sorted into the same order. You can then use that list of the old names,
which is now in the correct order.

This might sound hard, but it's really not too bad. I've attached a
script that will do this for you.

Best regards,
   Jonathan

p s wrote:
> Hi,
> I am trying to read multiple 3B42 hdf 3hourly files and write them out
> as a single netcdf file. The problem I am coming across is due to the
> way the HDF filenames are:
>
>
> 3B42.090609.0.6A.HDF
> 3B42.090609.12.6A.HDF
> 3B42.090609.15.6A.HDF
> 3B42.090609.18.6A.HDF
> 3B42.090609.21.6A.HDF
> 3B42.090609.3.6A.HDF
> 3B42.090609.6.6A.HDF
>
>
>
> When I use systemfunc to list the files and read them in, they do not
> appear in sequential time order, like 0, 3, 6, 15, 18, 21, because of
> the varying width of the hour field in the filename,
> so the data is stored in netcdf in a different time order (0, 12, 15,
> 18, 21, 3, 6), which I want to correct.
>
> Could you please kindly help me to sort out this issue.
>
> Regards,
> Prabhakar
> ------------------------------------------------------------------------
>
> _______________________________________________
> ncl-talk mailing list
> List instructions, subscriber options, unsubscribe:
> http://mailman.ucar.edu/mailman/listinfo/ncl-talk
>

load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/shea_util.ncl"

begin

  filenames = systemfunc("ls 3B42*.HDF") ; get list of the filenames
  nfiles = dimsizes(filenames)

; now extract each part of the filename and store it separately
  type = str_get_field(filenames,1,".")
  date = str_get_field(filenames,2,".")
  hour = str_get_field(filenames,3,".")
  level = str_get_field(filenames,4,".")
  extension = str_get_field(filenames,5,".")
  
  print(hour) ; note that we now have a list of strings for hour
  
; now add some zeroes using sprinti - must convert the old string to an integer before giving it to sprinti!
  new_hour = sprinti("%0.2i",stringtoint(hour))

  print(new_hour) ; note that this worked as expected - now we have the strings with zeroes so they can be sorted properly

; reconstruct the filenames with the new hour strings
  new_filenames = type + "." + date + "." + new_hour + "." + level + "." + extension

  print(new_filenames) ; now we have a list that we can sort

  new_filenames!0 = "old_names" ; name the dimension so that a coordinate array can be attached to it
  new_filenames&old_names = filenames ; now store the original filenames as the coordinate array of the new_filenames variable
  
  sqsort(new_filenames) ; now do a string sort so that the filenames are in order; the attached coordinate array (old filenames) gets sorted according to the order of new_filenames
  
  print(new_filenames) ; verify that this had the desired effect with the new filenames
  print(new_filenames&old_names) ; now you have a list of the old filenames, but in the correct order

; fins = addfiles(new_filenames&old_names,"r") ; now you can open all the files together, or loop through one by one

  do ifile = 0, nfiles-1
     print(new_filenames&old_names(ifile))
; fin = addfile(new_filenames&old_names(ifile),"r") ; addfile (singular) opens one file at a time
; do stuff now then delete the file before opening up the next one
; delete(fin)
  end do

end
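
A variant of the same idea, offered only as a sketch and not part of the
attached script: rather than rebuilding the names and sorting strings, one
could build a numeric key from the date and hour fields and let dim_pqsort
return the permutation that puts that key in increasing order. The names
key, idx, and sorted_filenames are made up for illustration. The sketch
assumes the date field is always yymmdd, the hour field is a whole number
of hours, and all files share the same century (a two-digit year cannot
order 99 before 00). The fields are converted with stringtodouble so that
a date such as 090609 is read as an ordinary decimal number.

```ncl
begin
  filenames = systemfunc("ls 3B42*.HDF")   ; same listing as in the script above

  ; field 2 is the date (yymmdd), field 3 is the hour (0, 3, ..., 21)
  date = stringtodouble(str_get_field(filenames,2,"."))
  hour = stringtodouble(str_get_field(filenames,3,"."))

  ; one number per file that increases with time, e.g. 090609 and hour 3 -> 9060903
  key = date*100 + hour

  ; kflag = 1: return the permutation for increasing order, leave key itself unsorted
  idx = dim_pqsort(key, 1)

  sorted_filenames = filenames(idx)   ; original names, now in time order
  print(sorted_filenames)
end
```

If the printed list looks right, sorted_filenames could be passed to
addfiles in place of new_filenames&old_names.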

Received on Tue Apr 27 17:26:22 2010

This archive was generated by hypermail 2.1.8 : Thu Apr 29 2010 - 08:05:27 MDT