Re: Concatenation of multiple files

From: David Brown <dbrown_at_nyahnyahspammersnyahnyah>
Date: Mon Mar 11 2013 - 15:21:20 MDT

Hi Nate,
Are there any error messages prior to the seg fault?
More important than the file size is the variable size. It would help if you could send us the output of
ncl_filedump for one of these files. If the size of the variable (the variable "G" in your code) exceeds the amount of memory available on your system, you are likely to have problems. On a 32-bit system there is a hard limit of 2 GB for a single variable; otherwise, the variable size is constrained only by the amount of memory you have.
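As a rough back-of-the-envelope check, you can estimate the memory footprint of the concatenated variable directly in NCL before trying to read it. This is just a sketch using the pathi/fili variables from your script; it assumes the variable is named "GPP" and stored as 4-byte float (confirm the actual name and type in the ncl_filedump output):

```
; Rough estimate of the memory the concatenated variable would need.
; Assumes: variable "GPP", 4-byte float values (check with ncl_filedump).
f = addfiles(pathi + fili + ".nc", "r")

dsizes = getfilevardimsizes(f[0], "GPP")             ; dims in a single file
nvals  = product(todouble(dsizes)) * dimsizes(fili)  ; x 12 monthly files
print("GPP would need roughly " + (nvals * 4. / 1e9) + " GB as float")
```

If the printed figure approaches or exceeds your physical memory (or 2 GB on a 32-bit system), reading the whole variable at once will fail.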

Depending on the dimensionality of the variable "G", there may be different ways to break it into subsections for processing in smaller chunks. The best way to do that depends both on the dimensionality and on what sort of processing you want to do on the data. So it is important to figure out (at least approximately) how much memory the variable would occupy if there were no memory restrictions. Then, based on the dimensionality, you can decide how to read it in chunks. For example, assuming the data has multiple levels, and that a single level's worth of data would fit into the available memory, you might process each level individually. In that case your code would look something like this (assume the dimensions of the variable are time, lev, lat, lon; lev, lat, and lon must have the same sizes in each file):

f = addfiles(pathi+fili+".nc","r")

dim_names = getvardims(f[0])
lev_ind = ind(dim_names .eq. "lev")
dim_sizes = getfiledimsizes(f[0])

agg_result = new((/ <size and dimensionality depends on what kind of processing you are doing> /), float_or_whatever_type)

do i = 0, dim_sizes(lev_ind) - 1
        G = f[:]->GPP(:,i,:,:)
        ; process G and save to an aggregate result variable -- again the dimensionality of the result depends on the type of processing
        agg_result(i) = process(G)
end do

; code for combining and further processing of the results if necessary
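As one hypothetical instantiation of that pattern, tied to the original question of aggregating 3-hourly data to daily means: the per-level "process" step could average every 8 consecutive time steps (8 x 3 hours = 24 hours). This is only a sketch; it assumes the variable is GPP(time, lev, lat, lon), that the total number of time steps is divisible by 8, and an NCL version recent enough to have reshape (6.1.0 or later). Note that reshape drops metadata, so coordinates would need to be reattached afterward:

```
f = addfiles(pathi + fili + ".nc", "r")
ListSetType(f, "cat")

dsizes = getfilevardimsizes(f[0], "GPP")
nlev = dsizes(1)
nlat = dsizes(2)
nlon = dsizes(3)

do i = 0, nlev - 1
  G = f[:]->GPP(:, i, :, :)                  ; all times, one level
  ntim = dimsizes(G(:, 0, 0))
  nday = ntim / 8                            ; 8 x 3-hourly steps per day
  Gd = reshape(G, (/nday, 8, nlat, nlon/))   ; group time steps by day
  if (i .eq. 0) then
    agg_result = new((/nday, nlev, nlat, nlon/), typeof(G))
  end if
  agg_result(:, i, :, :) = dim_avg_n(Gd, 1)  ; daily mean at this level
  delete([/G, Gd/])                          ; free memory before next level
end do
```

The delete calls matter here: without them, each iteration would hold two full level-slices in memory at once.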

Hope this helps.
On Mar 8, 2013, at 9:35 AM, Nate Mikle wrote:
> Hello NCL users,
> I am new to both NCL and writing scripts in general, so I apologize in advance for simple mistakes and questions.
> I am trying to concatenate 12 different files that are temporally adjacent (monthly) into one file. I want to do this so I can handle them all together and aggregate the time dimension (3-hourly) into a daily time step.
> If there are any ideas about how to do this or a better direction for me to go they would be greatly appreciated.  Thanks in advance.
> -Nate Mikle
> Here is the script I have so far; it gives me a segmentation fault. Does this mean the files are too large (each is 2200 MiB)? If so, how do I make them smaller (delete unnecessary variables)?
> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"
> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"
> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/shea_util.ncl"
> begin
> pathi = "/home/mikl6340/MsTMIP_Model/CLM-VIC/2000/"
> fili = (/"BG1_CLM4VIC_v1_3hourly_2000-01","BG1_CLM4VIC_v1_3hourly_2000-02","BG1_CLM4VIC_v1_3hourly_2000-03","BG1_CLM4VIC_v1_3hourly_2000-04","BG1_CLM4VIC_v1_3hourly_2000-05","BG1_CLM4VIC_v1_3hourly_2000-06","BG1_CLM4VIC_v1_3hourly_2000-07","BG1_CLM4VIC_v1_3hourly_2000-08","BG1_CLM4VIC_v1_3hourly_2000-09","BG1_CLM4VIC_v1_3hourly_2000-10","BG1_CLM4VIC_v1_3hourly_2000-11","BG1_CLM4VIC_v1_3hourly_2000-12"/)
> f = addfiles(pathi+fili+".nc","r")
> ListSetType(f,"cat")
> G = f[:]->GPP
> printVarSummary(G)
> end
> _______________________________________________
> ncl-talk mailing list
> List instructions, subscriber options, unsubscribe:
Received on Mon Mar 11 13:21:30 2013

This archive was generated by hypermail 2.1.8 : Wed Mar 13 2013 - 14:19:38 MDT