addfiles
Creates a reference that spans multiple data files.
Prototype
function addfiles ( file_path [*] : string, status : string ) return_val [1] : list
Arguments
file_path
A one-dimensional array of strings containing the full or relative path of the data files to be referenced.
status
Single string that specifies whether the files should be opened as read-only ("r"), read-write ("w") or create ("c").
Description
The addfiles function provides the user with the ability to access data spanning multiple files. The function returns a single variable of type list containing a list of references to the files pointed to by the file_path argument. Each file named in file_path must be in a supported file format, and each file name must end with a supported file extension. The extension is required even though it need not be part of the actual filename. The currently supported formats, valid status values, and accepted extensions are:
- NetCDF ("r", "w", "c")
    ".nc", ".cdf", ".netcdf"
- GRIB versions 1 and 2 ("r") (GRIB2 support is available in versions 4.3.0 or later.)
    ".gr", ".gr1", ".grb", ".grib", ".grb1", ".grib1", ".gr2", ".grb2", ".grib2"
- HDF ("r", "w", "c")
    ".hdf", ".hd"
- HDFEOS ("r")
    ".hdfeos", ".he2", ".he4"
- CCM ("r")
    ".ccm"
As with addfile, these extensions are handled in a case-insensitive manner: ".grib", ".GRIB", and ".Grib" all indicate a GRIB file.
If the status is "c", each file is created if it does not exist. If it already exists, an error message is printed and the default missing value for files is returned. If the status is "w", the files must all exist and have permissions that allow reading and writing, in which case they are opened for reading and writing. If any of these conditions fails, an error message is reported and the default file missing value is returned. Similarly, if the status is "r", the files must exist and the user must have read permission on them. Otherwise, an error message is printed and the default missing value is returned. See the ismissing function for how to detect the returned missing value in a program.
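Because a failed open only reports an error and returns a missing value, it can help to check the inputs before calling addfiles. The following is a minimal sketch of one way to do that, not part of the function itself. The directory "./data/" is a hypothetical placeholder, and the check relies on the isfilepresent function returning one logical value per path.

```ncl
diri = "./data/"                            ; hypothetical input directory
fils = systemfunc("ls " + diri + "*.nc")    ; one file path per array element

; Stop early if any path does not point to a usable file.
if (.not. all(isfilepresent(fils))) then
  print("addfiles sketch: at least one input file is not present")
  exit
end if

f = addfiles(fils, "r")                     ; open all files read-only
```

This pre-check complements, rather than replaces, testing the returned value with ismissing as described above.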
The addfiles function differs from the addfile function in several ways:
- addfile creates only one file reference, while addfiles provides for multiple file references.
- The variable returned by addfiles cannot be used as input to the "getfilexxxx" suite of functions. Instead, you can input only one element of this variable, e.g.:
    files  = systemfunc("ls *.nc")
    f      = addfiles(files, "r")
    dsizes = getfiledimsizes(f[0])
- When a variable is input via a reference generated by addfile, all values and, if present, all attributes and coordinate variables are input. A variable input via a reference generated by addfiles will have only its values input. No attributes or coordinate variables will be input. It is the user's responsibility to attach this metadata, or the addfiles_GetVar function can be used to accomplish this.
- Data input via addfiles may be created via two different options, "join" and "cat" (the default), as specified by the ListSetType procedure.
Under what conditions should the "cat" (default) and "join" options be used? Generally speaking, if the leftmost dimension of a variable is a "record" dimension (say, "time"), then the "cat" option is best. If, however, there is no record dimension (e.g. [lev,lat,lon]), then the "join" option is appropriate. One exception to the general rule of using "cat" when there is a record dimension is outlined in example 3.
Note that if you use the "join" option and a command like systemfunc ("ls *.nc") to get a list of the netCDF files, then you need to make sure that the "ls" command returns the files in the order in which you want them joined.
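One way to guard against an unexpected file order is to print the list before opening it and, if necessary, to spell the order out by hand. This is only a sketch: the directory "./data/" and the file names in the commented-out alternative are hypothetical.

```ncl
diri = "./data/"                             ; hypothetical input directory
fils = systemfunc("ls " + diri + "XXX*.nc")  ; order is whatever "ls" returns
print(fils)                                  ; inspect the order before joining

; Alternative: list the files explicitly in the desired order
; (these file names are placeholders, not real files).
; fils = diri + (/ "XXX_run2.nc", "XXX_run1.nc" /)

f = addfiles(fils, "r")
ListSetType(f, "join")                       ; new leftmost dimension, one entry per file
print(ListGetType(f))                        ; query how the list is currently set
```

With "join", the index along the new leftmost dimension follows the order of the elements in fils, so the printed list doubles as a key to that dimension.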
See Also
addfiles_GetVar, addfile, ListSetType, ListGetType
Examples
Example 1
Read in a series of netCDF files (here, 5 files each with 12 time steps), and read into memory the four dimensional variable T(ntim,klvl,nlat,mlon), where ntim=12, klvl=5, nlat=48, mlon=96:
    diri = "/fs/cgd/data0/casguest/CLASS/"       ; input directory
    fils = systemfunc ("ls "+diri+"ann*.nc")     ; file paths
    f    = addfiles (fils, "r")
    ListSetType (f, "cat")                       ; concatenate (=default)
    T    = f[:]->T                               ; read T from all files
    printVarSummary (T)
The printVarSummary procedure yields:
Variable: T
Type: float
Total Size: 5529600 bytes
1382400 values
Number of Dimensions: 4
Dimensions and sizes: [60] x [5] x [48] x [96]
Coordinates:
The size of the time dimension is now 60 (=5*12), while the other dimensions remain the same. Note also that no metadata has been copied. If this information is desired, the user must attach it explicitly. For example:
    T!0    = "time"
    T!1    = "lev"
    T!2    = "lat"
    T!3    = "lon"
    T&time = f[:]->time            ; time coord variable
    T&lev  = f[0]->lev             ; get lev from the 1st file
    T&lat  = f[0]->lat             ; get lat from the 1st file
    T&lon  = f[0]->lon             ; get lon from the 1st file
    T@long_name = "temperature"
    T@units     = "K"
    printVarSummary (T)
The printVarSummary procedure yields:
Variable: T
Type: float
Total Size: 5529600 bytes
1382400 values
Number of Dimensions: 4
Dimensions and sizes: [time | 60] x [lev | 5] x [lat | 48] x [lon | 96]
Coordinates:
time: [2349..4143]
lev: [850000..250]
lat: [-87.15909..87.15909]
lon: [ 0..356.25]
Number Of Attributes: 2
units : K
long_name : temperature
Example 2
The "XXX" files have no record dimension. All records are 5 (levels) x 48 (latitudes) x 96 (longitudes). Here we use the "join" option. This adds an extra dimension.
    diri = "/fs/cgd/data0/casguest/CLASS/"       ; input directory
    fils = systemfunc ("ls "+diri+"XXX*.nc")     ; file paths
    f    = addfiles (fils, "r")                  ; note the "s" of addfile
    ListSetType (f, "join")
    T    = f[:]->T                               ; read T from all files
    printVarSummary (T)
The printVarSummary procedure yields:
Variable: T
Type: float
Total Size: 460800 bytes
115200 values
Number of Dimensions: 4
Dimensions and sizes: [5] x [5] x [48] x [96]
Coordinates:
The user can add metadata explicitly. For example:
    T!0   = "case"                 ; arbitrary name
    T!1   = "lev"
    T!2   = "lat"
    T!3   = "lon"
    T&lev = f[0]->lev              ; get lev from the 1st file
    T&lat = f[0]->lat              ; get lat from the 1st file
    T&lon = f[0]->lon              ; get lon from the 1st file
    T@long_name = "temperature"
    T@units     = "K"
    printVarSummary (T)
yields:
Variable: T
Type: float
Total Size: 460800 bytes
115200 values
Number of Dimensions: 4
Dimensions and sizes: [case | 5] x [lev | 5] x [lat | 48] x [lon | 96]
Coordinates:
lev: [850000..250.]
lat: [-87.15909..87.15909]
lon: [ 0..356.25]
Number Of Attributes: 2
units : K
long_name : temperature
Example 3
Generally, when there is a record dimension one uses the "cat" option. In this example, assume that five different runs were made for a particular year, each using, say, a different boundary layer parameterization. Here the time variable is the same for each file and we want to compare the five different cases. The appropriate choice for this case is "join":
    diri = "/fs/cgd/data0/casguest/CLASS/"       ; input directory
    fils = systemfunc ("ls "+diri+"Bound*.nc")   ; file paths
    f    = addfiles (fils, "r")                  ; note the "s" of addfile
    ListSetType (f, "join")
    T    = f[:]->T                               ; read T from all files
    printVarSummary (T)
The printVarSummary procedure yields:
Variable: T
Type: float
Total Size: 5529600 bytes
1382400 values
Number of Dimensions: 5
Dimensions and sizes: [5] x [12] x [5] x [48] x [96]
Coordinates:
The user can add metadata explicitly. For example:
    T!0    = "case"                ; arbitrary name
    T!1    = "time"
    T!2    = "lev"
    T!3    = "lat"
    T!4    = "lon"
    T&time = f[0]->time            ; time coord variable
    T&lev  = f[0]->lev             ; get lev from the 1st file
    T&lat  = f[0]->lat             ; get lat from the 1st file
    T&lon  = f[0]->lon             ; get lon from the 1st file
    T@long_name = "temperature"
    T@units     = "K"
    printVarSummary (T)
yields:
Variable: T
Type: float
Total Size: 5529600 bytes
1382400 values
Number of Dimensions: 5
Dimensions and sizes: [case | 5] x [time | 12] x [lev | 5] x [lat | 48] x [lon | 96]
Coordinates:
time: [2349..2683]
lev: [850..250.]
lat: [-87.15909..87.15909]
lon: [ 0..356.25]
Number Of Attributes: 2
units : K
long_name : temperature
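Once the runs are joined, comparing two cases reduces to indexing the new leftmost dimension. The following sketch assumes the same files as example 3. The choice of cases 0 and 1 is arbitrary, and the variable name dT is introduced here purely for illustration.

```ncl
diri = "/fs/cgd/data0/casguest/CLASS/"         ; input directory from example 3
fils = systemfunc("ls " + diri + "Bound*.nc")  ; file paths
f    = addfiles(fils, "r")
ListSetType(f, "join")
T    = f[:]->T                                 ; [case] x [time] x [lev] x [lat] x [lon]

; Difference between the second and first runs at every time, level and grid point.
dT = T(1,:,:,:,:) - T(0,:,:,:,:)
printVarSummary(dT)                            ; four dimensions remain
print("max |dT| = " + max(abs(dT)))
```

The same comparison under the "cat" option would require slicing the concatenated time dimension by hand, which is why "join" is the more convenient choice here.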
Example 4
As noted, addfiles does not result in metadata being attached to the variable read from the files. However, there is a function called addfiles_GetVar that will automatically attach metadata. It can result in much cleaner code, especially when many variables have to be read. The following concatenates the records:
    load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"
    begin
      diri = "/fs/cgd/data0/casguest/CLASS/"
      fils = systemfunc ("ls "+diri+"annual*")
      f    = addfiles (fils+".nc", "r")          ; note the "s" of addfile
      T    = addfiles_GetVar (f, fils, "T")
      printVarSummary (T)
    end
The output from printVarSummary is:
Variable: T
Type: float
Total Size: 5529600 bytes
1382400 values
Number of Dimensions: 4
Dimensions and sizes: [time | 60] x [lev | 5] x [lat | 48] x [lon | 96]
Coordinates:
time: [2349..4143]
lev: [850000..-72361.58]
lat: [-87.15909..87.15909]
lon: [ 0..356.25]
Number Of Attributes: 3
missing_value : 1e+36
units :
long_name : temperature
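The benefit is most visible when several variables are read from the same list of files. The sketch below follows example 4 and repeats the addfiles_GetVar call for a second variable. The variable name "U" is a hypothetical placeholder and may not exist in these files.

```ncl
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"

begin
  diri = "/fs/cgd/data0/casguest/CLASS/"
  fils = systemfunc("ls " + diri + "annual*")
  f    = addfiles(fils + ".nc", "r")

  ; One call per variable; no manual dimension naming or coordinate assignment.
  T = addfiles_GetVar(f, fils, "T")
  U = addfiles_GetVar(f, fils, "U")   ; "U" is a hypothetical variable name

  printVarSummary(T)
  printVarSummary(U)
end
```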
Example 5
This example is similar to example 4, but here the list type is set to "join" using the ListSetType procedure. The addfiles_GetVar function in "contributed.ncl" will name the extra dimension "case":
    load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"
    begin
      diri = "/fs/cgd/data0/casguest/CLASS/"
      fils = systemfunc ("ls "+diri+"ANNUAL*")
      f    = addfiles (fils+".nc", "r")          ; note the "s" of addfile
      ListType = "join"
      ListSetType (f, ListType)
      T    = addfiles_GetVar (f, fils, "T")
      printVarSummary (T)
    end
The output yields:
Variable: T
Type: float
Total Size: 5529600 bytes
1382400 values
Number of Dimensions: 5
Dimensions and sizes: [case | 5] x [time | 12] x [lev | 5] x [lat | 48] x [lon | 96]
Coordinates:
case: [0..4]
time: [2349..2683]
lev: [850000..-72361.58]
lat: [-87.15909..87.15909]
lon: [ 0..356.25]
Number Of Attributes: 3
missing_value : 1e+36
units :
long_name : temperature