Re: Reading Binary files with mixed-dimensions

From: Dennis Shea <shea_at_nyahnyahspammersnyahnyah>
Date: Wed, 30 Jan 2008 10:10:47 -0700

Not sure what to say when you say "no documentation for how the binary
was written". Given that lack of information, it is not reasonable to
expect any tool to automatically read the files.

Files: Either flat, or written as a Fortran sequential file, which has
         extra record information silently embedded.

     Hopefully, you know the sizes of the dimensions
     [klev1, klev2, nlat, mlon].

     Each float is 4 bytes.

     4 (bytes) * [ 4 (variables) * mlon*nlat*klev1
                 + 1 (variable)  * mlon*nlat*klev2
                 + 5 (variables) * mlon*nlat ] = total # bytes for flat binary

     If this number matches the number of bytes in the file, you know it is
     a flat file.
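
The size check above can be sketched in a few lines. This is a minimal
illustration in Python (standard library only) rather than NCL; the function
names expected_flat_bytes and looks_flat are made up for this example, and
the variable counts (4 on klev1 levels, 1 on klev2 levels, 5 single-level)
are taken from the description of the files in this thread.

```python
import os

def expected_flat_bytes(mlon, nlat, klev1, klev2, bytes_per_value=4):
    """Byte count of a flat file holding 4 variables on klev1 levels,
    1 variable on klev2 levels, and 5 single-level variables."""
    values = (4 * mlon * nlat * klev1
              + 1 * mlon * nlat * klev2
              + 5 * mlon * nlat)
    return bytes_per_value * values

def looks_flat(path, mlon, nlat, klev1, klev2):
    """True when the file size equals the flat-binary byte count."""
    return os.path.getsize(path) == expected_flat_bytes(mlon, nlat, klev1, klev2)
```

Calling looks_flat("jma_file", mlon, nlat, klev1, klev2) with the real grid
sizes gives a quick yes/no before any attempt to read the data.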

      setfileoption("bin","ReadByteOrder","BigEndian")
      x_1d = fbindirread("jma_file", 0, -1, "float")

      **If** you knew the order you could do something like the following

      mn = nlat*mlon

      nStrt = 0
      nLast = mn-1

      a_2d = onedtond( x_1d(nStrt:nLast), (/nlat,mlon/) )
      nStrt = nLast+1
      nLast = nStrt + mn - 1
      b_2d = onedtond( x_1d(nStrt:nLast), (/nlat,mlon/) )

      However, you should also know the meta data if you want
      to create a good netCDF file.
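
The same slicing idea, generalized to a mix of 2D and 3D variables, might
look like the following Python sketch (not NCL). It assumes a flat
big-endian file, that longitude varies fastest within each level, and that
the variable order is known. The layout list and both function names are
hypothetical and exist only for illustration.

```python
import struct

def read_big_endian_floats(path):
    """Read the whole file as a flat tuple of big-endian 4-byte floats."""
    with open(path, "rb") as f:
        raw = f.read()
    count = len(raw) // 4
    return struct.unpack(">%df" % count, raw[:count * 4])

def split_variables(values, layout, nlat, mlon):
    """Slice a flat sequence into named [lev][lat][lon] blocks.

    layout is a list of (name, nlev) pairs in file order;
    nlev = 1 describes a single-level (2D) field.
    """
    mn = nlat * mlon
    out = {}
    start = 0
    for name, nlev in layout:
        stop = start + nlev * mn          # Python slices exclude 'stop'
        if stop > len(values):
            raise ValueError("layout needs more values than the file holds")
        block = values[start:stop]
        out[name] = [
            [list(block[(k * nlat + j) * mlon:(k * nlat + j + 1) * mlon])
             for j in range(nlat)]
            for k in range(nlev)
        ]
        start = stop                      # next variable begins here
    return out
```

Because each variable is addressed by name, only the ones that are needed
have to be kept after the split.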

If the bytes do not match, it is most likely a Fortran sequential file,
and you would have to determine the record information from the extra bytes.
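
One way to inspect those extra bytes is to walk the file record by record.
The Python sketch below (not NCL) assumes the common convention of a 4-byte
length marker before and after each record, written big-endian like the
data. Other marker sizes exist, so treat this as an assumption to verify.
The function name read_sequential_records is made up for this example.

```python
import struct

def read_sequential_records(path, marker_fmt=">i"):
    """Return the payload bytes of each record in a Fortran sequential file.

    Assumes every record is framed by a leading and a trailing length
    marker of the same size (4-byte big-endian integer by default).
    """
    size = struct.calcsize(marker_fmt)
    records = []
    with open(path, "rb") as f:
        while True:
            head = f.read(size)
            if not head:
                break                     # clean end of file
            if len(head) < size:
                raise ValueError("truncated record marker")
            (nbytes,) = struct.unpack(marker_fmt, head)
            if nbytes < 0:
                raise ValueError("negative record length")
            payload = f.read(nbytes)
            tail = f.read(size)
            if (len(payload) != nbytes or len(tail) != size
                    or struct.unpack(marker_fmt, tail)[0] != nbytes):
                raise ValueError("record markers do not match")
            records.append(payload)
    return records
```

The length of each record divided by 4 gives the number of floats it holds,
which can help tell the 3D variables from the 2D ones. A ValueError here
suggests the file is not sequential, or that the marker size or byte order
assumed above is wrong.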

David F Porter wrote:
> Dave,
>
> Well, I'm trying to read in the Japanese Reanalysis data, of which
> there is no documentation for how the binary was written. Most of the
> data is in GRIB, but monthly means are in binary. I've tried both
> direct and sequential access routines in NCL (attempting to just read
> in the first record of the file), and neither worked. But then again,
> I'm not positive of the order of the data in the files either.
Not sure what you mean ... "neither worked"
>
> To clarify, each file is for 1 time period (6-hourlies in GRIB,
> monthlies in binary). Each file contains 10 variables. The problem
> is, 4 variables are [lon,lat,lev] , one is the same but a different
> number of levels, and then 5 are just [lon,lat]. To make things more
> interesting, I'm not sure of the order that each variable was written
> (it is slow communicating with the JMA).
JMA should do a better job.
>
> I tried converting it to netCDF using IDL, which I've had success
> doing with other binary files by just pointing to the starting byte
> for each variable. I used the order of the variables in the
> documentation (anl_mdl listed here
> http://jra.kishou.go.jp/elements_en.html) and also the order of
> variables given by the NCL function printFileVarSummary() after
> reading in the corresponding 6-hourly GRIB files (the same dimensions,
> just different time and format).

>
> Dave Porter
>
>
>
> On Jan 23, 2008, at 4:27 PM, Dave Allured wrote:
>
>> Dave,
>>
>> Can you be more specific as to the type of binary file? Fortran
>> "unformatted sequential access"; plain binary such as written by
>> Fortran direct access; or something else? I just want to be sure I'm
>> on the right wavelength before responding.
>>
>> Also, assuming one of the first two: So the record length varies
>> within each file? Does each file have its own unique layout, or is
>> each variable found in exactly the same position in every file?
>>
>> Dave Allured
>> CU/CIRES Climate Diagnostics Center (CDC)
>> http://cires.colorado.edu/science/centers/cdc/
>> NOAA/ESRL/PSD, Climate Analysis Branch (CAB)
>> http://www.cdc.noaa.gov/
>>
>> David F Porter wrote:
>>> Sorry if this has been covered, but I've exhausted the search
>>> function with no real results.
>>> I am looking to read in some large 4-Byte Float big-endian binary
>>> data onto my little-endian machine. The problem I am having is that
>>> each file corresponds to ONE time period, but each variable in the
>>> file has different dimensions, some 2D and some 3D. Because of the
>>> varying sizes, I feel that I cannot simply use the "record number".
>>> Also, I only want some of the variables (to save space after loading
>>> 300 of these files).
>>> I'm not sure if it matters at this point, but the variables are on a
>>> gaussian grid.
>>> Dave
>> _______________________________________________
>> ncl-talk mailing list
>> ncl-talk_at_ucar.edu
>> http://mailman.ucar.edu/mailman/listinfo/ncl-talk

-- 
======================================================
Dennis J. Shea                  tel: 303-497-1361    |
P.O. Box 3000                   fax: 303-497-1333    |
Climate Analysis Section                             |
Climate & Global Dynamics Div.                       |
National Center for Atmospheric Research             |
Boulder, CO  80307                                   |
USA                        email: shea 'at' ucar.edu |
======================================================
Received on Wed Jan 30 2008 - 10:10:47 MST
