Re: problem with indexing when spanning multiple files

From: David Brown <dbrown_at_nyahnyahspammersnyahnyah>
Date: Fri Dec 19 2008 - 17:20:16 MST

No, I don't think this solves his problem.

Assuming he has a (time,lev,lat,lon) dimensioned variable, he is
concatenating two files that
both have a time dimension, say (0,3,6) and (9,12,15,18).
If he does
ph = a[:]->PH
then he gets a variable with timesteps (0,3,6,9,12,15,18).

Now imagine he just wants the second timestep, so he does:
ph = a[:]->PH(1,:,:,:)

Note that this timestep is entirely contained within the first file.
However, what NCL currently does is apply this
subscript to every file in the list. Since it selects only one element of
the concatenated dimension, the dimensionality is
reduced by 1. Then, since it is in concatenation mode, it
concatenates what is now the first dimension (i.e., the
level dimension). It has two 3-D variables and concatenates the
leftmost dimension, giving (in this case) a double-sized
dimension. If there were 3 files it would be 3 times the size, etc.
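
To make this concrete, here is a rough sketch in Python (not NCL, and
not how NCL is actually implemented) that models the behavior described
above. The names make_file, current_cat and expected_cat are made up
for illustration, and the lat/lon dimensions are left out to keep it
short.

```python
# Rough model of the behavior described above; this is NOT NCL source code.
# Each "file" is a nested list indexed [time][lev]; lat/lon are omitted.

def make_file(times, nlev):
    # Each cell records (time, level) so we can see where a value came from.
    return [[(t, k) for k in range(nlev)] for t in times]

file1 = make_file([0, 3, 6], nlev=2)
file2 = make_file([9, 12, 15, 18], nlev=2)
files = [file1, file2]

def current_cat(files, i):
    """Apply the scalar time subscript i to every file, then concatenate
    along what is now the leftmost dimension (the levels)."""
    out = []
    for f in files:
        out.extend(f[i])  # f[i] drops the time dimension, leaving levels
    return out

def expected_cat(files, i):
    """Concatenate along time first, then apply the subscript once."""
    all_times = [row for f in files for row in f]
    return all_times[i]

got = current_cat(files, 1)
want = expected_cat(files, 1)
print(len(got), got)    # level dimension is doubled
print(len(want), want)  # level dimension has its true size
```

With two levels per file, the first result has four "levels" (two from
each file) while the second has the two that were actually asked for.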

It doesn't work if you set the mode to "join" either. In that case,
it still applies the same subscript to all the files, and
ends up again selecting the 2nd timestep of each file. In "join"
mode it then creates a new dimension so you
end up (in the case of 2 files) with a variable dimensioned (2,:,:,:).
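
The "join" case can be modeled the same way. Again this is only a
Python sketch of the behavior as I understand it, not NCL code; the
helper names are hypothetical and lat/lon are omitted.

```python
# Rough model of "join" mode with a scalar time subscript; NOT NCL source code.
# Each "file" is a nested list indexed [time][lev].

def make_file(times, nlev):
    return [[(t, k) for k in range(nlev)] for t in times]

file1 = make_file([0, 3, 6], nlev=2)
file2 = make_file([9, 12, 15, 18], nlev=2)
files = [file1, file2]

def join_select(files, i):
    """Apply the scalar time subscript i to every file, then stack the
    per-file results along a new leading dimension."""
    return [f[i] for f in files]

joined = join_select(files, 1)
print(len(joined), len(joined[0]))  # (number of files, number of levels)
print(joined)
```

The result carries one slice per file, so the second timestep of the
second file comes along whether it was wanted or not.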

This is another symptom of the same fundamental problem as the
earlier issue we discovered involving striding
through an aggregated variable created using addfiles: the stride
starts over for each file, whether you want it to or not.
Essentially, Ethan never implemented code to treat the virtual
aggregated dimension on an equal basis with the other actual dimensions.
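
As a sketch of the difference, the Python below (not NCL, and not a
design for the actual fix) contrasts a stride that restarts in each file
with one applied to a single virtual index over the aggregated
dimension. The locate helper and the list layout are assumptions made
purely for illustration.

```python
from bisect import bisect_right
from itertools import accumulate

# Time coordinates of the two example files.
file_times = [[0, 3, 6], [9, 12, 15, 18]]

# Stride of 2 applied separately to each file: the stride restarts per file.
per_file = [t for times in file_times for t in times[::2]]

# Cumulative end offsets of each file along the aggregated dimension.
ends = list(accumulate(len(t) for t in file_times))

def locate(g):
    """Map a global index g on the aggregated dimension to
    (file_index, local_index)."""
    if not 0 <= g < ends[-1]:
        raise IndexError(g)
    fi = bisect_right(ends, g)
    start = ends[fi - 1] if fi else 0
    return fi, g - start

# Stride of 2 applied once to the virtual aggregated dimension.
global_stride = [file_times[fi][li]
                 for fi, li in map(locate, range(0, ends[-1], 2))]

print(per_file)       # stride restarted in the second file
print(global_stride)  # evenly spaced across the file boundary
print(locate(1))      # the second timestep lives entirely in the first file
```

The same mapping would let a scalar subscript such as 1 be resolved to
one file and one local index instead of being applied to every file.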

I am not going to be able to fix this overnight. It is going to
require some new infrastructure. If I do this I could also fix the
striding problem.
I do think it should be fixed, but if we want it to go into this NCL
release we are going to need to delay the code freeze.
  -dave

On Dec 19, 2008, at 4:35 PM, Dennis Shea wrote:

> Hi Franco,
>
> the default mode of addfiles is to *concatenate*
> records over the leftmost dimension.
>
> I think what you want is:
>
> ph_1 = a[:]->PH(:,1,:,:)
> and, not
> ph_1 = a[:]->PH(1,:,:,:)
>
> If the latter is what you want try
> ListSetType (a, "join")
>
> Good luck
> D
>
>
> franco.catalano@uniroma1.it wrote:
>> Hi ncl team,
>> I have found a problem with variable indexing when spanning
>> multiple files.
>> I have two big wrf output files:
>> wrfout-00-09.nc which contains the first 9 hours of simulation;
>> wrfout-09-18.nc which contains the subsequent 9 hours.
>> I read both files with the following commands:
>> all_files = systemfunc("ls /data/wrfout*")
>> a = addfiles (all_files,"r")
>> If I read all time instants of the variable, it works fine:
>> ph = a[:]->PH(:,:,:,:)
>> and the dimensions of ph are correct:
>> print(dimsizes(ph))
>> gives me the expected output:
>> (0) 108
>> (1) 58
>> (2) 99
>> (3) 359
>> If, instead, I read the same variable, but only for a specified
>> time instant, let's say 1:
>> ph_1 = a[:]->PH(1,:,:,:)
>> the second dimension, z, is read with a size that is double
>> the correct one:
>> (0) 116
>> (2) 99
>> (3) 359
>> This is a serious problem when working with large datasets, since
>> reading the whole variables is time and memory consuming. This
>> could be avoided with a correct indexed reading.
>> Thank you for your kindness.
>> Franco Catalano
>> ____________________________________________________
>> Eng. Franco Catalano
>> Ph.D. Student
>> D.I.T.S.
>> Department of Hydraulics, Transportation and Roads.
>> Via Eudossiana 18, 00184 Rome
>> Sapienza University of Rome.
>> tel: +390644585218
>> ------------------------------------------------------------------------
>> _______________________________________________
>> ncl-talk mailing list
>> List instructions, subscriber options, unsubscribe:
>> http://mailman.ucar.edu/mailman/listinfo/ncl-talk
Received on Fri Dec 19 17:20:19 2008