HDF: Hierarchical Data Format
HDF-SDS: Scientific Data Set
HDF-EOS: Earth Observing System
HDF and HDF-EOS
In 1993, NASA chose the Hierarchical Data Format Version 4 (HDF4)
to be the official format for all data products derived by the
Earth Observing System (EOS). It is commonly used for
satellite-based data sets. There are several 'models' of
HDF4, which can be a bit confusing. Each model addresses
different needs. One model is the Scientific Data Set (SDS), which is
similar to netCDF-3 (Classic netCDF). It supports multi-dimensional
gridded data with metadata, including support for an unlimited dimension.
To better serve a broader spectrum of the user community
with needs for geolocated data, a new format or convention,
HDF-EOS2, was developed.
HDF-EOS supports three geospatial data types: grid,
point, and swath.
Limitations of the HDF4 format, needs for improved data compression and
changing computer paradigms led to the introduction of a
new HDF format (HDF5) in 1998. The HDF4 and HDF5 appellations
might imply some level of compatibility.
Unfortunately, no! These are completely independent formats.
The calling interfaces and underlying storage formats are different.
Unlike the netCDF community, which has a well-established
history of conventions
(e.g., the Climate and Forecast (CF) convention),
the HDF satellite community seems to lack commonly accepted
conventions. For example, the "units" for geographical coordinates under the
CF convention must be recognized by the
udunits package. In HDF files, the latitude and longitude
variables often have the units "degrees", while netCDF's CF convention
would require "degrees_north" and "degrees_east" or some other recognized units.
Some experiments
have developed "guidelines" for data. However, perhaps because the spectrum
of experiments is large, the HDF community does not have
a "culture" where broadly accepted
conventions are commonly used.
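The units mismatch described above can be handled with a small normalization step before plotting or writing netCDF. The following is a minimal sketch in Python, not NCL code; the helper name `cf_units` and the set of "generic" spellings it accepts are assumptions made for illustration. The accepted CF spellings are those listed in the CF conventions for latitude and longitude coordinates.

```python
# Hypothetical helper (not part of NCL or any CF library): map generic
# "degrees" units found on many HDF files to CF-recognized units.
CF_LAT_UNITS = {"degrees_north", "degree_north", "degree_N",
                "degrees_N", "degreeN", "degreesN"}
CF_LON_UNITS = {"degrees_east", "degree_east", "degree_E",
                "degrees_E", "degreeE", "degreesE"}


def cf_units(units, axis):
    """Return CF-style units for a latitude ("lat") or longitude ("lon") axis."""
    if axis not in ("lat", "lon"):
        raise ValueError("axis must be 'lat' or 'lon'")
    accepted = CF_LAT_UNITS if axis == "lat" else CF_LON_UNITS
    if units in accepted:
        return units  # already CF compliant
    # Assumption: these generic spellings are what the file author meant.
    if units.strip().lower() in ("degrees", "degree", "deg"):
        return "degrees_north" if axis == "lat" else "degrees_east"
    raise ValueError("unrecognized units for %s: %r" % (axis, units))


print(cf_units("degrees", "lat"))
print(cf_units("Degrees", "lon"))
```

Anything the helper does not recognize raises an error rather than being guessed, since a wrong units string is worse than a missing one.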
NCL General Comments
NCL recognizes and supports multiple data formats, including
HDF4, HDF5, HDF-EOS2 and HDF-EOS5. The following HDF
related file extensions are recognized: "hdf", "hdfeos",
"he2", "he4", and "he5".
The first rule of 'data processing' is to look at the data.
The command line utility ncl_filedump
can be used to examine a file's contents. Information such as a variable's type,
size and shape can allow users to develop optimal code for processing.
The stat_dispersion and pdfx functions
can be used to examine a variable's distribution. It is not
uncommon for outliers to be present. If so, it is best to manually specify
the contour limits and spacing to maximize information content
on the plots.
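One way to choose manual contour limits when outliers are present is to base them on percentiles rather than on the minimum and maximum. The sketch below is in Python with made-up values and is not what stat_dispersion itself does; the function name `contour_limits`, the 5th/95th percentile choice and the number of levels are all assumptions. In an NCL script, the three results would typically be assigned to the cnMinLevelValF, cnMaxLevelValF and cnLevelSpacingF resources with cnLevelSelectionMode set to "ManualLevels".

```python
import statistics


def contour_limits(values, missing=None, lower_pct=5, upper_pct=95, n_levels=10):
    """Return (min_level, max_level, spacing) based on robust percentiles."""
    good = [v for v in values if v != missing]  # drop missing values first
    # 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = statistics.quantiles(good, n=100, method="inclusive")
    lo = cuts[lower_pct - 1]
    hi = cuts[upper_pct - 1]
    return lo, hi, (hi - lo) / n_levels


# Synthetic data: 100 well-behaved values plus one extreme outlier.
sample = list(range(100)) + [10000]
limits = contour_limits(sample)
print(limits)
```

The single outlier (10000) has no effect on the limits, whereas limits taken from the raw maximum would compress nearly all of the data into one contour interval.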
A possible source of confusion is that variables of type "short" or "byte"
can be unpacked to float via two different formulations. If x
represents a variable of type "short" or "byte", the unpacked
or scaled value can be derived via either:
    value = x*scale_factor + add_offset
or
    value = (x - add_offset)*scale_factor
Examples of files that use the latter formula are MYDATML_2 and MYD04_L2.
The NCL functions short2flt and byte2flt can
be used to unpack the former, while short2flt_hdf and byte2flt_hdf can
be used to unpack the latter.
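The two formulations give different results for the same packed value, so applying the wrong one silently produces wrong data. A minimal Python sketch with made-up numbers follows; the function names are illustrative only and are not the NCL functions.

```python
def unpack_scale_then_offset(x, scale_factor, add_offset):
    """First formulation: value = x*scale_factor + add_offset."""
    return x * scale_factor + add_offset


def unpack_offset_then_scale(x, scale_factor, add_offset):
    """Second formulation: value = (x - add_offset)*scale_factor."""
    return (x - add_offset) * scale_factor


# Made-up packed value and attributes, chosen to be exact in floating point.
packed = 1200
scale_factor = 0.5
add_offset = 100.0

first = unpack_scale_then_offset(packed, scale_factor, add_offset)
second = unpack_offset_then_scale(packed, scale_factor, add_offset)
print(first, second)
```

Here the first formulation yields 700.0 and the second 550.0, which is why the file's documentation should be checked to see which formula the data producer intended.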
FYI: An Inconsistency between Reading HDF4 and HDF-EOS2
The following does not apply for NCL versions 6.2.0 and newer.
NCL v6.2.0 and newer perform a 'double read' of HDF and HDF-EOS files and merge the
appropriate metadata.
An issue which may confuse users reading HDF-EOS2 files is that a variable
imported after a file has been opened with a .hdf extension may
have different metadata associated with it than if it had been imported
after being opened with a .hdfeos or .he2 extension. The reason
is that although the HDF-EOS2 library has an interface for getting variable attributes,
many of the attributes that would be visible when reading with the
straight HDF library are not accessible using the HDF-EOS2 library.
Conversely, the coordinates are often only visible using the HDF-EOS2 library. So,
for these older versions, the only solution is for the user to examine the variable by
opening the file via each extension (i.e., a 'double read').
Data used in the Examples
Regardless of the dataset used, the same principles can be
used to process the data.
TRMM - Tropical Rainfall Measuring Mission
MODIS - Moderate Resolution Imaging Spectroradiometer
SeaWiFS - Sea-viewing Wide Field-of-view Sensor [ SeaWiFS examples ]
HDF4EOS [he2, he4, hdfeos]
AIRS - Atmospheric Infrared Sounder
MODIS - Moderate Resolution Imaging Spectroradiometer
HIRDLS - High Resolution Dynamics Limb Sounder
MLS - Microwave Limb Sounder
OMI - Ozone Monitoring Instrument
TES - Tropospheric Emission Spectrometer
HDF Group Comprehensive Examples
The HDF Group has created a suite of
NCL, Matlab, IDL and Python examples,
which includes both scripts and the resulting graphical images.
They also provide NCL-specific comments.
Read a TRMM
file containing 3-hourly
precipitation at 0.25 degree resolution. The geographical
extent is 40S to 40N. This data is from the
Tropical Rainfall Measuring Mission (TRMM).
Create a packed netCDF file
using the pack_values function.
This creates a file half the size of one created
using float values. Some precision is lost, but this is not important here.
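The size and precision trade-off can be seen in a small Python sketch. It uses the common netCDF packing recipe (a scale_factor and add_offset derived from the data range); whether pack_values chooses its scale and offset in exactly this way is an assumption, and the precipitation values are made up.

```python
from array import array


def pack_to_short(values):
    """Pack floats into 16-bit integers; return (packed, scale_factor, add_offset)."""
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    # Map the data range onto -32767..32767 (guard against a constant field).
    scale_factor = span / 65534.0 if span > 0 else 1.0
    add_offset = (vmax + vmin) / 2.0
    packed = array("h", (round((v - add_offset) / scale_factor) for v in values))
    return packed, scale_factor, add_offset


def unpack(packed, scale_factor, add_offset):
    """Inverse operation: value = x*scale_factor + add_offset."""
    return [p * scale_factor + add_offset for p in packed]


precip = [0.0, 0.25, 1.5, 12.75, 40.0]  # synthetic values
packed, scale_factor, add_offset = pack_to_short(precip)
restored = unpack(packed, scale_factor, add_offset)
max_error = max(abs(a - b) for a, b in zip(precip, restored))

print("bytes per value: float =", array("f").itemsize, " short =", packed.itemsize)
print("largest round-trip error:", max_error)
```

Each packed value occupies half the storage of a float, and the round-trip error is bounded by half of scale_factor, which is the precision loss referred to above.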
This HDF file is classified as a "Scientific
Data Set" (HDF4-SDS).
Unfortunately, the file is not 'self-contained'
because the file contents do not include the geographical coordinates
or temporal information. The former must be obtained via
a web site, while the time is embedded within the file name.
Read an HDF4-SDS
file that contains high resolution (1km)
data over India and Sri Lanka. The file does not explicitly contain
any coordinate arrays. However, the variable on the file
"Mapped_Composited_mapped" does have the following attributes:
Slope : 1
Intercept : 0
Scaling : linear
Limit : ( 4, 62, 27, 95 )
Projection_ID : 8
Latitude_Center : 0
Longitude_Center : 78.5
Rotation : 0
The Slope and Intercept
attributes would indicate that no scaling has been
applied to the data. The Limit
attribute indicates the geographical
limits and the Latitude_Center/Longitude_Center/Rotation
attributes specify map attributes. The variable does not specify any
missing value [_FillValue] attribute but, after looking at the
data, it was noted that the value -1 is appropriate.
The stat_dispersion function was
used to determine the standard and robust estimates of the
variable's dispersion. Outliers are present and the contour
information was manually specified.
Read multiple files
(here, 131 files) for one particular day; for each file,
bin and sum the satellite data using bin_sum;
after all files have been read, use bin_avg
to average all the summed values; plot; create a netCDF file
of the binned (gridded) data.
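The accumulate-then-average pattern can be sketched as follows. This is Python with synthetic values, written to show the idea behind bin_sum and bin_avg rather than their actual signatures; the function names `accumulate` and `average`, the grid layout and the missing value are assumptions.

```python
import math

MISSING = -999.0  # assumed missing-value flag for this sketch


def accumulate(sums, counts, lats, lons, vals, lat0, lon0, dlat, dlon):
    """Add one file's observations to running per-cell sums and counts."""
    nlat, nlon = len(sums), len(sums[0])
    for lat, lon, v in zip(lats, lons, vals):
        if v == MISSING:
            continue
        j = math.floor((lat - lat0) / dlat)  # row index of the grid cell
        i = math.floor((lon - lon0) / dlon)  # column index of the grid cell
        if 0 <= j < nlat and 0 <= i < nlon:
            sums[j][i] += v
            counts[j][i] += 1


def average(sums, counts):
    """Divide sums by counts; cells with no observations become MISSING."""
    return [[s / c if c > 0 else MISSING for s, c in zip(srow, crow)]
            for srow, crow in zip(sums, counts)]


# A 2x2 grid of 1-degree cells whose lower-left corner is (0N, 0E).
sums = [[0.0, 0.0], [0.0, 0.0]]
counts = [[0, 0], [0, 0]]

# Two synthetic "files" of swath observations: (lats, lons, values).
files = [
    ([0.5, 0.5], [0.5, 1.5], [2.0, 4.0]),
    ([0.5, 1.5], [0.5, 0.5], [4.0, MISSING]),
]
for lats, lons, vals in files:
    accumulate(sums, counts, lats, lons, vals, 0.0, 0.0, 1.0, 1.0)

gridded = average(sums, counts)
print(gridded)
```

Keeping the counts alongside the sums is what allows the average to be taken once, after the last file, instead of re-reading earlier files.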
Here the data are netCDF files. However, the original files were
HDF-SDS (eg: MYD06_L2.A2005364.1405.005.2006127140531.hdf). The
originating scientist converted these to netCDF for some reason.
NCL can handle either. Only the file extension need be changed
(.nc to .hdf).
Read an HDF-SDS dataset containing
MODIS Aqua Level-3 SSTs.
The file attributes contain the
geographical information and this is used to generate
coordinate variables. One issue is that the data "l3m_data"
are of type "unsigned short". These are not explicitly supported
through v5.1.1 (but will be in 5.2.0). Hence, a simple 'work-around'
is used.
In addition to plotting the original 9 km data, the
area_hi2lores function is used to interpolate
the data to a 0.5x0.5 degree grid. A netCDF file is created.
Other 'L3' datasets could be directly used in the sample script.
For example: Example 3 on the
SeaWiFS Application page.
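When "unsigned short" data are read into a signed 16-bit type, values above 32767 appear as negative numbers. One plausible form of a work-around is to add 65536 to the negative values; the sample script may do this differently, so the Python sketch below is only an illustration of the arithmetic, with hypothetical function names.

```python
import struct


def as_unsigned_short(v):
    """Recover the unsigned value from a 16-bit quantity read as signed."""
    return v + 65536 if v < 0 else v


def reinterpret(v):
    """Cross-check: reinterpret the same 16 bits as an unsigned integer."""
    return struct.unpack("<H", struct.pack("<h", v))[0]


for signed in (-32768, -1, 0, 100, 32767):
    print(signed, "->", as_unsigned_short(signed), reinterpret(signed))
```

The result needs a wider type than "short" (for example integer or float), since values above 32767 cannot be stored in a signed 16-bit variable.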
Read four MODIS HDF datasets and create a series
of swath contours over an orthographic map. The 2D lat/lon data
are read from each file and used to determine where on the map
to overlay the contours.
This example uses gsn_csm_contour_map to create the map plot
with the first set of contours, and then creates the remaining contour
plots with gsn_csm_contour. The
overlay procedure is then used to overlay these
remaining contour plots over the existing contour/map plot.
TMI Hydrometeor (cloud liquid water, prec. water, cloud ice, prec. ice) contains profiles in 14 layers at 5 km horizontal resolution, along with latent heat and surface rain, over a 760 km swath.
Specify a variable (here, "latentHeat") and plot (a) the entire swath; (b) region near India; (c) a vertical profile at locations where the latent heat
exceeds 1500. The file contains no units information for the variables.
A SEVIRI Level-3 water vapor data set. The variable is packed in a rather unusual
fashion. The flags should be viewed to determine the source of the data.
Read an HDF-EOS2
file containing swath data.
NCL identifies the swath as MODIS_SWATH_Type_L1B.
Create a simple plot of reflectance with coordinates of scanline
The eos.hdf that appears as the file name is an alias for
MOD021KM.A2000303.1920.002.2000317044659.hdf. It is not
uncommon for an HDF-EOS2 file to have the ".hdf" file extension.
In this case, NCL will open and read the file successfully, but
it is best to manually append the ".hdfeos" extension when opening
the file in the addfile function.
file global attributes:
HDFEOSVersion : HDFEOS_V2.6
StructMetadata_0 : GROUP=SwathStructure
Example of a radiance plot. Note that the color table is reversed
from example 1.
A multiple contour plot of other quantities on the MODIS file.
MODIS data placed on a geographical projection.
A rather awkward aspect of this file is that the Latitude and Longitude
variables differ in size from the variable being plotted.
The 5 added to the map limits is arbitrary (not required). Here it is
used to specify
extra space around the plot.
Illustrates the use of dim_gbits
to extract bits. It also demonstrates explicitly
labeling different colors with specific integers.
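Quality and cloud-mask flags on MODIS files are often stored as bit fields packed into a byte or integer. The generic shift-and-mask operation is sketched below in Python with a made-up flag value; it is not the dim_gbits interface, and the bit numbering used here (bit 0 is the least significant bit) is an assumption that may differ from the convention dim_gbits or a given product uses.

```python
def extract_bits(value, start, nbits):
    """Return nbits bits of value, starting at bit 'start' (0 = least significant)."""
    return (value >> start) & ((1 << nbits) - 1)


flag = 0b10110100  # made-up 8-bit flag

print(extract_bits(flag, 0, 2))  # lowest two bits
print(extract_bits(flag, 2, 3))  # three bits starting at bit 2
print(extract_bits(flag, 7, 1))  # highest bit
```

Each extracted field is a small integer, which is what makes it natural to label each color on the plot with a specific integer value.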
The use of res@trGridType =
"TriangularMesh" makes the plotting faster.
Read an AIRS Level-3 file (here, product type AIRX3STD). This uses the
AIRS IR and AMSU instruments. Although on a 1x1 degree grid, the
grid point values represent satellite swaths that have been
binned over a period of time (24 hours). The data to the left and right
of the Date Line represent
values that were sampled at different times. Hence, the gridded
values are not cyclic in longitude.
Support for HDF5 and HDF-EOS5 is present in
v5.2.0 which was released April 14, 2010. Some samples follow.
Read a HDF-EOS5 (available v5.2.0)
file from the Aura
OMI (Ozone Monitoring Instrument) and plot all the variables
on the file. Here only two of the variables are shown.
Read an HDF-EOS5 (available v5.2.0)
file (OMI) and plot selected variables
on the file. (The ncl_filedump
utility was used to preview the file's contents.)
This example also demonstrates how to retrieve a
variable's type prior to reading it into memory. (See getfilevartypes.)
It is best to use short2flt or byte2flt if
the variable type is "short" or "byte".
These functions will automatically apply the proper scaling.
Note that the units for the "EffectiveTemperature"
variable appear to be incorrect. They indicate "degrees Celsius"
but the range would indicate "degrees Kelvin". This could be addressed
by adding the following to the script after the variable has
been read:
    if (vNam(nv).eq."EffectiveTemperature_ColumnAmountO3") then
        x@units = "degrees Kelvin"   ; fix bad units
    end if
Read an HDF-EOS5 (available v5.2.0)
file from the MLS (Aura Microwave Limb Sounder) and
plot a two-dimensional cross-section (pressure/time) of temperature. Then plot
the trajectory of the satellite over this time.
This uses a local library named HDFEOS_LIB.ncl.
It is available here.
Read an HDF-EOS5 (available v5.2.0)
file from HIRDLS: (a) use stat_dispersion
to print the statistical information for each variable; (b) compute PDFs;
(c) plot cross-sections; and (d) plot time series of three different variables.
Read two similar OMI files (L3-OMDOAO3e and L3-OMTO3e).
This illustrates that users must
look at a file's contents before using it. (Use ncl_filedump.)
Here, there are two
files with similar variables but NCL assigns slightly different names.
IMPORTANT NOTE: The OMI L3 files used in this example have a bug
in the way the OMI data were written to the files. NCL (v5.2.1) correctly reads
and plots the data on the file. However, the variable data should be reversed
(flipped) to be correctly associated with the coordinates.
The left image illustrates plotting of the original variables;
the right image illustrates the flipped (reordered) data.
NCL version 6.1.0 will be
able to identify OMI data files and will automatically correct for the
bug.
Read a variable and plot a user specified level. The grid is (180x360).
Missing values are present. A common issue with
HDF files is that they do not contain all the desired information
or units. Hence, they must be manually provided.
Reads "CO Total Column" across a time interval and a selected spatial
region, and plots it on a map using a range of colored markers.
This script was contributed by Rebecca Buchholz, a researcher in the
Atmospheric Chemistry Division at NCAR.
Read a variable and plot it according to a palette specified on the file.
The MSG (Meteosat Second Generation) file is HDF5.
The desired variables contain a dash and a space,
which are not allowed in NCL variable names. Hence,
the variable names are enclosed in quotes to make them
type string. For the ->
file syntax operator
to successfully access the string variables, they must be enclosed
within dollar signs ($).
HDF5 (h5) files can be complicated. Users should become acquainted
with the file's contents prior to creating a script.
An ncl_filedump of the file would yield:
%> ncl_filedump K1VHR_15NOV2013_1200_L02_OLR.h5 | less
file global attributes:
DIM_000 = 250601
compound <OLR_Dataset> (Latitude, Longitude, OLR) (DIM_000)
GP_PARAM_DESCRIPTION : Every_Acquisition
GP_PARAM_NAME : Outgoing Longwave Radiation (OLR)
Input_Channels : TIR
LatInterval : 0.25
Latitude_Unit : Degrees
LonInterval : 0.25
Longitude_Unit : Degrees
MissingValueInProduct : 999
OLR_Unit : Watts/sq. met.
ValidBottomLat : -60
ValidLeftLon : 10
ValidRightLon : 140
ValidTopLat : 60
GROUND_STATION : BES,SAC/ISRO,Ahmedabad,INDIA.
HDF_PRODUCT_FILE_NAME : K1VHR_15NOV2013_1200_L02_OLR.h5
OUTPUT_FORMAT : hdf5-1.6.6
PRODUCT_CREATION_TIME : 2013-11-15T18:01:49 _L02_OLR.h5
STATION_ID : BES3-11-15T18:01:49 _L02_OLR.h5
UNIQUE_ID : K1VHR_15NOV2013_1200_L02_OLR.h5
ACQUISITION_DATE : 15NOV2013
ACQUISITION_TIME_IN_GMT : 1200V2013
PROCESSING_LEVEL : L020V2013
PROCESSING_SOFTWARE : InPGS_XXXXXXXXXXXXXX
PRODUCT_NAME : GP_OLR XXXXX
PRODUCT_TYPE : GEOPHY
SENSOR_ID : VHRPHY
SPACECRAFT_ID : KALPANA-1
Read an h5 (HDF5) file with 'group' fields. Note the syntax used by NCL.
The Latitude, Longitude and OLR are one-dimensional arrays of size 250601.
There are no general file conventions for HDF5 files. Users must
examine the file's contents and explicitly extract desired information.
The region of the globe is defined as attributes of the
group /OLR/GP_PARAM_INFO. These are used to create rectilinear
grid coordinates which are associated with the variable to
be plotted (olr) and written to a netCDF file.
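The construction of the rectilinear coordinates from the group attributes can be sketched in Python using the values shown in the file dump above (ValidBottomLat, ValidTopLat, LatInterval, ValidLeftLon, ValidRightLon, LonInterval). The helper names are hypothetical, and the assumption that the one-dimensional arrays are ordered row by row with latitude varying slowest is not confirmed by the file; it would need to be checked against the Latitude and Longitude members.

```python
def axis(start, stop, step):
    """Equally spaced coordinate values from start to stop, inclusive."""
    n = int(round((stop - start) / step)) + 1
    return [start + k * step for k in range(n)]


def to_grid(flat, nlat, nlon):
    """Reshape a 1-D list to nlat rows of nlon values (assumed row-major order)."""
    if len(flat) != nlat * nlon:
        raise ValueError("array size does not match the grid")
    return [flat[j * nlon:(j + 1) * nlon] for j in range(nlat)]


# Attribute values taken from the file dump above.
lat = axis(-60.0, 60.0, 0.25)   # ValidBottomLat, ValidTopLat, LatInterval
lon = axis(10.0, 140.0, 0.25)   # ValidLeftLon, ValidRightLon, LonInterval

print(len(lat), len(lon), len(lat) * len(lon))

# Placeholder data standing in for the 1-D OLR member of the compound variable.
olr_1d = [0.0] * (len(lat) * len(lon))
olr_2d = to_grid(olr_1d, len(lat), len(lon))
```

The product of the two axis lengths (481 x 521) equals 250601, the size of DIM_000, which is a useful consistency check that the attributes really do describe the full grid.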
FYI: It is not necessary to make a two-dimensional grid.
NCL can plot one-dimensional latitude/longitude/value arrays
with appropriate graphical resource settings. However,
since a script option is to create a netCDF file, it
was designed to create the variable as a conventional
two-dimensional grid.
Read a SeaWiFS variable and use metadata contained within the file attributes
to construct a complete (i.e., self-contained) variable. The initial look at the
file was via the ncl_filedump
command line operator.
Data files were obtained
Additional SeaWiFS examples are available here.