Re: Concatenation of multiple files

From: Nate Mikle <natemikle_at_nyahnyahspammersnyahnyah>
Date: Mon Mar 11 2013 - 17:59:21 MDT

Hi David,

Thank you for the responses. There are no messages prior to the seg
fault. All of the files contain the same variables and dimensions; I am
specifically looking at GPP. Eventually I want to have one continuous
file of daily observations of GPP from 2000-2004 for each 0.5 degree grid
cell. Right now I have 60 separate files, each with a 3-hourly
time step. So, not only do I want to concatenate the files, I
would also like to aggregate the time step to daily.
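
One way to reach daily values without holding a whole year in memory is to
aggregate each monthly file on its own and write a much smaller daily file
per month. The following is only a sketch under stated assumptions: GPP is
dimensioned (time, lat, lon), every file starts at the beginning of a day
and holds exactly 8 three-hourly steps per day, and the output file name
GPP_daily_2000-MM.nc is a made-up placeholder. It does not carry over the
time coordinate.

```ncl
begin
  pathi  = "/home/mikl6340/MsTMIP_Model/CLM-VIC/2000/"
  months = (/"01","02","03","04","05","06","07","08","09","10","11","12"/)
  nstep  = 8                        ; assumed: 24 h / 3 h = 8 steps per day

  do m = 0, dimsizes(months) - 1
    fin  = addfile(pathi + "BG1_CLM4VIC_v1_3hourly_2000-" + months(m) + ".nc", "r")
    dims = getfilevardimsizes(fin, "GPP")     ; sizes of (time, lat, lon)
    nday = dims(0) / nstep

    ; daily means for this month, using the same fill value as the input
    daily = new((/nday, dims(1), dims(2)/), float, -1e34)
    do d = 0, nday - 1
      ; read one day's worth of time steps only, then average over time (dim 0)
      daily(d,:,:) = dim_avg_n(fin->GPP(d*nstep:(d+1)*nstep-1,:,:), 0)
    end do

    daily!0         = "time"
    daily!1         = "lat"
    daily!2         = "lon"
    daily@units     = "kg C m-2 s-1"
    daily@long_name = "Gross Primary Productivity (daily mean)"

    ; hypothetical output name; "c" fails if the file already exists
    fout = addfile(pathi + "GPP_daily_2000-" + months(m) + ".nc", "c")
    fout->GPP = daily

    delete(daily)                   ; the number of days changes per month
  end do
end
```

Each daily file holds one eighth as many GPP time steps as its 3-hourly
source, so the monthly results are far easier to concatenate afterwards.
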
Here are the results of ncl_filedump.
Variable: f
Type: file
filename: BG1_CLM4VIC_v1_3hourly_2000-01
path: BG1_CLM4VIC_v1_3hourly_2000-01.nc
   file global attributes:
      title : CLM4VIC v1.0 3-hourly output for MsTMIP simulation BG1 v1
      source : CLM4VIC v1.0
      model : CLM4VIC
      model_version : v1.0
      references : Li et al. (2011), Evaluating runoff simulations from the Community Land Model 4.0 using observations from flux towers and a mountainous watershed, JGR-atmos, 116, DOI:10.1029/2011JD016276
      contact : Maoyi Huang
      email : maoyi.huang@pnnl.gov
      experiment : BG1
      project : MsTMIP
      sim_version : v1
      comment : Global Baseline simulation (BG1), updated on Mon Oct 15 15:40:52 PDT 2012,performed by Huimin Lei from Tsinghua University during his visit at PNNL
      Conventions : CF-1.4
   dimensions:
      lon = 720
      lat = 360
      nbnds = 2
      ncl3 = 248
      time = 248
   variables:
      float lon ( lon )
         long_name : Longitude
         description : longitude at center of each grid cell
         units : degrees_east
         bounds : lon_bnds
      float lat ( lat )
         long_name : Latitude
         description : latitude at center of each grid cell
         units : degrees_north
         bounds : lat_bnds
      float lon_bnds ( lon, nbnds )
         long_name : Longitude west-east bounds
         description : (west boundary of grid cell, east boundary of grid cell)
         units : degrees_east
      float lat_bnds ( lat, nbnds )
         long_name : Latitude south-north bounds
         description : (south boundary of grid cell, north boundary of grid cell)
         units : degrees_north
      double time ( ncl3 )
         _FillValue : 9.969209968386869e+36
         long_name : Time middle averaging period
         description : julian days days since 1700-01-01 00:00:00 UTC for middle time averaging period Proleptic_Gregorianc calendar
         units : days since 1700-01-01 00:00:00 UTC
      float GPP ( time, lat, lon )
         missing_value : -1e+34
         description : Rate of photosynthesis (always positive)
         units : kg C m-2 s-1
         long_name : Gross Primary Productivity
         _FillValue : -1e+34
      float NPP ( time, lat, lon )
         _FillValue : -1e+34
         long_name : Net Primary Productivity
         units : kg C m-2 s-1
         description : Net Primary Productivity (NPP=GPP-AutoResp, positive into plants)
         missing_value : -1e+34
      float TotalResp ( time, lat, lon )
         _FillValue : -1e+34
         long_name : Total Respiration
         units : kg C m-2 s-1
         description : Total respiration (TotalResp=AutoResp+heteroResp, always positive)
         missing_value : -1e+34
      float AutoResp ( time, lat, lon )
         _FillValue : -1e+34
         long_name : Autotrophic Respiration
         units : kg C m-2 s-1
         description : Autotrophic respiration rate (always positive)
         missing_value : -1e+34
      float HeteroResp ( time, lat, lon )
         _FillValue : -1e+34
         long_name : Heterotrophic Respiration
         units : kg C m-2 s-1
         description : Heterotrophic respiration rate (always positive)
         missing_value : -1e+34
      float Fire_flux ( time, lat, lon )
         _FillValue : -1e+34
         long_name : Fire emissions
         units : kg C m-2 s-1
         description : Flux of carbon due to fires (always positive)
         missing_value : -1e+34
      float NEE ( time, lat, lon )
         _FillValue : -1e+34
         long_name : Net Ecosystem Exchange
         units : kg C m-2 s-1
         description : Net Ecosystem Exchange (NEE=HeteroResp+AutoResp-GPP, positive into atmosphere)
         missing_value : -1e+34
      float Qh ( time, lat, lon )
         _FillValue : -1e+34
         long_name : Sensible heat
         units : W m-2
         description : Sensible heat flux into the boundary layer (positive into atmosphere)
         missing_value : -1e+34
      float Qle ( time, lat, lon )
         missing_value : -1e+34
         description : Latent heat flux into the boundary layer (positive into atmosphere)
         units : W m-2
         long_name : Latent heat
         _FillValue : -1e+34
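
The dimensions in this dump are enough to estimate how much memory GPP
needs once it is read in. The lines below are a rough back-of-envelope
sketch, not a measurement: the sizes are hard-coded from the dump, a float
is assumed to take 4 bytes, and the full-year figure assumes 366 days of 8
steps each because 2000 is a leap year. The arithmetic is done in double
precision so the byte count does not overflow a 32-bit integer.

```ncl
begin
  ; dimension sizes copied from the ncl_filedump output
  nlat   = 360
  nlon   = 720
  nbytes = 4                         ; GPP is stored as float
  mib    = 1024.0 * 1024.0

  ; bytes needed for one time step of one variable
  step_bytes = todouble(nlat) * nlon * nbytes

  month_mib = 248 * step_bytes / mib       ; one file (time = 248)
  year_mib  = 366 * 8 * step_bytes / mib   ; 12 files concatenated

  print("GPP, one month: " + month_mib + " MiB")
  print("GPP, full year: " + year_mib  + " MiB")
end
```

By this estimate one month of GPP is roughly 245 MiB, and nine such
variables account for the roughly 2200 MiB size of each file. A full year
of GPP comes to roughly 2.8 GiB, which is above the 2 GB single-variable
limit for 32-bit systems mentioned in the quoted reply below.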

On Mon, Mar 11, 2013 at 4:21 PM, David Brown <dbrown@ucar.edu> wrote:

> Hi Nate,
> Are there any error messages prior to the seg fault?
> What is more important than the file size is the variable size. It would
> be good if you can send us the output of
> ncl_filedump on one of these files. If the variable size (that is the
> variable "G" in your code) exceeds the amount of memory available on your
> system you are likely to have problems. If you have a 32-bit system there
> is a hard limit of 2 GB for a single variable. Otherwise, the variable size
> is just constrained by the amount of memory you have.
>
> Depending on the dimensionality of the variable "G" there may be different
> ways to break it into subsections for processing in smaller chunks. But
> the best way to do that depends both on the dimensionality and what sort of
> processing you want to do on the data. So it is important to figure out
> (at least approximately) how much memory the variable would occupy if there
> were no memory restrictions. Then based on the dimensionality you can
> decide how to read it in chunks. For example, assuming the data has
> multiple levels and also assuming the size of a single level's worth of
> data would allow the variable to fit into the available memory space, you
> might figure out a way to process each level individually. In that case
> your code would look something like this (assume the dimensions of the variable
> are time, lev, lat, lon -- lev, lat, and lon must have the same dimensions
> in each file):
>
> f = addfiles(pathi+fili+".nc","r")
> ListSetType(f,"cat")
>
> dim_names = getvardims(f[0]) ; f is a list of files, so query the first one
> lev_ind = ind(dim_names .eq. "lev")
> dim_sizes = getfiledimsizes(f[0])
>
> agg_result = new((/ <size and dimensionality depends on what kind of
> processing you are doing> /), float_or_whatever_type)
>
> do i = 0, dim_sizes(lev_ind) - 1
> G = f[:]->GPP(:,i,:,:)
> ; process G and save to an aggregate result variable -- again the
> ; dimensionality of the result depends on the type of processing
> agg_result( i) = process(G)
> end do
>
> ; code for combining and further processing of the results if necessary
>
>
> ---
> Hope this helps.
> -dave
>
>
> On Mar 8, 2013, at 9:35 AM, Nate Mikle wrote:
>
> > Hello NCL users,
> >
> > I am new to both NCL and writing scripts in general, so I apologize in
> > advance for simple mistakes and questions.
> >
> > I am trying to concatenate 12 different files that are temporally
> > adjacent (monthly) into one file. I want to do this so I can then handle
> > them all together in order to aggregate time (3 hourly) into a daily time
> > step.
> >
> > If there are any ideas about how to do this or a better direction for me
> > to go they would be greatly appreciated. Thanks in advance.
> >
> > -Nate Mikle
> >
> > Here is the script I have so far; it gives me a segmentation fault. Does
> > this mean the files are too large (each is 2200 MiB)? If so, how do I make
> > them smaller: delete unnecessary variables?
> >
> >
> > load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
> > load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"
> > load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"
> > load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/shea_util.ncl"
> >
> > begin
> >
> > pathi = "/home/mikl6340/MsTMIP_Model/CLM-VIC/2000/"
> > fili = (/"BG1_CLM4VIC_v1_3hourly_2000-01","BG1_CLM4VIC_v1_3hourly_2000-02","BG1_CLM4VIC_v1_3hourly_2000-03","BG1_CLM4VIC_v1_3hourly_2000-04","BG1_CLM4VIC_v1_3hourly_2000-05","BG1_CLM4VIC_v1_3hourly_2000-06","BG1_CLM4VIC_v1_3hourly_2000-07","BG1_CLM4VIC_v1_3hourly_2000-08","BG1_CLM4VIC_v1_3hourly_2000-09","BG1_CLM4VIC_v1_3hourly_2000-10","BG1_CLM4VIC_v1_3hourly_2000-11","BG1_CLM4VIC_v1_3hourly_2000-12"/)
> >
> > f = addfiles(pathi+fili+".nc","r")
> >
> > ListSetType(f,"cat")
> > G = f[:]->GPP
> >
> > printVarSummary(G)
> >
> > end
> > _______________________________________________
> > ncl-talk mailing list
> > List instructions, subscriber options, unsubscribe:
> > http://mailman.ucar.edu/mailman/listinfo/ncl-talk
>
>

Received on Mon Mar 11 15:59:32 2013
