Shapefiles

The capability to read shapefiles was added in NCL version 5.1.1.

From Wikipedia:

The ESRI Shapefile or simply a shapefile is a popular geospatial vector data format for geographic information systems software. It is developed and regulated by ESRI as a (mostly) open specification for data interoperability among ESRI and other software products. A "shapefile" commonly refers to a collection of files with ".shp", ".shx", ".prj", ".dbf", and other extensions on a common prefix name (e.g., "lakes.*"). The actual shapefile relates specifically to files with the ".shp" extension, however this file alone is incomplete for distribution, as these other supporting files are required.

Shapefiles describe a homogeneous set of geometrical features comprised of either points, polylines, or polygons. These, for example, could represent water wells, rivers, and lakes, respectively. Each feature may also have non-spatial attributes that describe it, such as the name or temperature.

Within NCL, a shapefile appears as a collection of 4 or 5 specifically named variables that encode the geometry of the features, along with some number of non-spatial variables. The number, names, and types of the non-spatial variables depend upon the specific shapefile. The geometry of a feature is composed of one or more segments. Segments in turn are composed of an ordered list of X, Y (and optionally Z) tuples. NCL uses the following variables to encode these relationships:

    geometry(num_features, 2)
    segments(num_segments, 2)
    x(num_points)
    y(num_points)
    z(num_points)    ; 3D datasets only
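
As a quick check of this layout, the variables can be listed directly from an opened file. The following is a minimal sketch, assuming the "states.shp" file used by shapefiles_1.ncl (with its ".shx" and ".dbf" companions) is in the current directory; any other shapefile could be substituted.

```ncl
; Open the shapefile; NCL picks up the .shx and .dbf files from the same prefix.
f = addfile("states.shp", "r")

; Names of all variables: the geometry variables plus the non-spatial (.dbf) fields.
print(getfilevarnames(f))

; Shapes of the geometry variables described above.
printVarSummary(f->geometry)   ; (num_features, 2)
printVarSummary(f->segments)   ; (num_segments, 2)
print(dimsizes(f->x))          ; num_points
```

The names and types of the non-spatial variables printed here differ from one shapefile to the next.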

Each feature in the shapefile is represented by an entry in the geometry variable, along with corresponding entries in the non-spatial variables; i.e., the data for the i-th feature is found at the i-th entry of these variables. Each entry in geometry has two values: geometry(i, 0) specifies the index into the segments variable of the first segment of the i-th feature, and geometry(i, 1) denotes the number of segments that make up the feature.

Similarly, each entry in segments has two values: segments(j, 0) is the index of the first XY(Z) tuple of the j-th segment, and segments(j, 1) is the number of tuples belonging to the segment. With this encoding, any subsequent segments belonging to the i-th feature follow its first one in segments, and all the XY(Z) tuples belonging to the j-th segment follow the first in the x, y, (z) variables.

NCL defines several global attributes for a shapefile:

    geom_segIndex = 0
    geom_numSegs  = 1
    segs_xyzIndex = 0
    segs_numPnts  = 1
    geometry_type = "point" | "polyline" | "polygon"
    layer_name    =    ; value derived from the shapefile

The first 4 attributes are intended as symbolic indices into the geometry and segments variables; see the examples below for how they should be used.
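
The encoding and the symbolic indices can be exercised with a pair of nested loops. This is a sketch rather than an excerpt from any of the example scripts; it assumes "states.shp" is available and only prints the point count of each segment instead of drawing anything.

```ncl
f = addfile("states.shp", "r")

geometry = f->geometry
segments = f->segments
lon      = f->x
lat      = f->y

; Symbolic column indices, read from the file's global attributes.
geom_segIndex = f@geom_segIndex
geom_numSegs  = f@geom_numSegs
segs_xyzIndex = f@segs_xyzIndex
segs_numPnts  = f@segs_numPnts

geomDims    = dimsizes(geometry)
numFeatures = geomDims(0)

do i = 0, numFeatures - 1
  startSegment = geometry(i, geom_segIndex)   ; first segment of feature i
  numSegments  = geometry(i, geom_numSegs)    ; how many segments it has
  do seg = startSegment, startSegment + numSegments - 1
    startPT = segments(seg, segs_xyzIndex)    ; first point of this segment
    npts    = segments(seg, segs_numPnts)     ; how many points it has
    endPT   = startPT + npts - 1
    ; lon(startPT:endPT) and lat(startPT:endPT) hold this segment's vertices.
    print("feature " + i + ", segment " + seg + ": " + npts + " points")
  end do
end do
```

In a plotting script, the print call would typically be replaced by a call that draws lon(startPT:endPT) and lat(startPT:endPT) as a polygon, polyline, or set of markers.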

The Global Administrative Areas database (GADM) (http://www.gadm.org) offers consistent administrative boundaries at 3 levels. Level 0 (nations) is suited to global or mesoscale results; level 1 is the first level of sub-national administration (typically states, provinces, and territories); level 2 is the second level of administration and is potentially useful for high-resolution plots. The global shapefiles are large, but individual countries can be downloaded separately.

NOAA provides some useful AWIPS shapefile data.

shapefiles_1.ncl: Demonstrates how to read a shapefile and draw filled polygons over a map. Non-spatial variables from the associated database (i.e., the states.dbf file) are used to compute the polygon colors.

To run this example, you must have the files "states.shp", "states.dbf", and "states.shx". These can be obtained from the National Atlas.

shapefiles_2.ncl: Demonstrates how to read a shapefile and draw selected information based upon a database query. In this case, the historical incidents of F5-class tornadoes in the USA are plotted.

To run this example, you must have the "states" shapefiles from the previous example, along with "tornadx020.shp", "tornadx020.dbf", and "tornadx020.shx". These can be obtained from the National Atlas.
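
A database query of this kind usually amounts to reading a non-spatial variable and selecting the features whose value matches. The sketch below illustrates the idea; the field name FSCALE and its numeric type are assumptions made for illustration, so check the actual field names with print(getfilevarnames(f)) before using it.

```ncl
f = addfile("tornadx020.shp", "r")

; FSCALE is a hypothetical field name for the Fujita-scale rating,
; assumed here to be stored as a number.
fscale = f->FSCALE
ii     = ind(fscale .eq. 5)        ; indices of the matching features

geometry = f->geometry
segments = f->segments
x        = f->x
y        = f->y

if (.not. any(ismissing(ii))) then
  do n = 0, dimsizes(ii) - 1
    ; Locate the first vertex of the first segment of each selected feature.
    seg0 = geometry(ii(n), f@geom_segIndex)
    pt0  = segments(seg0, f@segs_xyzIndex)
    print("match " + n + ": lon = " + x(pt0) + ", lat = " + y(pt0))
  end do
else
  print("no features matched the query")
end if
```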

shapefiles_3.ncl: Demonstrates how to read a shapefile and draw line segments on a map. These line segments represent stream data in South America.

To run this example, you must download the "HYDRO1k Streams data set for South America as a tar file in shapefile format" (2.4 MB) from:
http://eros.usgs.gov/#/Find_Data/Products_and_Data_Available/gtopo30/hydro/samerica

shapefiles_4.ncl: Demonstrates using gc_inout to mask an area in your data array using a geographical outline.

This particular example reads a shapefile to get an outline of the Mississippi River Basin. You then have the option of masking out all areas inside or outside this outline.

The "mrb.xxx" data files for this example can be found on the example datasets page.
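
The masking step can be sketched as a loop over grid points that tests each one with gc_inout. The code below is an illustration under several assumptions: the file name "mrb.shp" is inferred from "mrb.xxx", the shapefile is assumed to hold a single polygon so that x and y form one outline, and the data grid is synthetic.

```ncl
; Outline of the basin (assumed to be a single polygon).
f       = addfile("mrb.shp", "r")
mrb_lon = f->x
mrb_lat = f->y

; Synthetic rectilinear grid standing in for real data.
nlat = 51
nlon = 81
lat  = fspan(25., 50., nlat)
lon  = fspan(-115., -75., nlon)
data = new((/nlat, nlon/), float)
data = 1.0

; Start with everything missing, then copy in the points inside the outline.
data_mask = new((/nlat, nlon/), float)

; A bounding-box test avoids calling gc_inout for points far from the basin.
min_lat = min(mrb_lat)
max_lat = max(mrb_lat)
min_lon = min(mrb_lon)
max_lon = max(mrb_lon)

do j = 0, nlat - 1
  if (lat(j) .ge. min_lat .and. lat(j) .le. max_lat) then
    do i = 0, nlon - 1
      if (lon(i) .ge. min_lon .and. lon(i) .le. max_lon) then
        if (gc_inout(lat(j), lon(i), mrb_lat, mrb_lon)) then
          data_mask(j, i) = data(j, i)
        end if
      end if
    end do
  end if
end do

print("points kept: " + num(.not. ismissing(data_mask)))
```

To mask the inside of the outline instead, start from a copy of data and set the points for which gc_inout returns True to the fill value.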

shapefiles_5.ncl: Makes use of several shapefiles of differing resolutions and contents to mask data along country borders (Pakistan), and to draw and label selected boundaries and cities. Demonstrates querying the shapefiles' databases via non-spatial attributes to extract and draw specific geometry.

Also provides an example of using table to create a custom map legend.

The shapefiles for this example were obtained from DIVA-GIS. Search for the administrative boundaries of Pakistan and download them. The resulting zip file contains four sets of shapefiles.

shapefiles_6.ncl: This example uses a script very similar to example 3 above for South America to draw the canton outlines for Switzerland. The first frame shows the default map outline for Switzerland (admittedly not very good), and the second frame shows the data from the shapefile.

The point is to show that shapefiles tend to have similar formats, and hence you can take a script and easily modify it to draw the outlines you're interested in.

In this example, the outlines are drawn with polylines, and the places of interest with text strings and polymarkers.

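Note that if you plot a large number of individual line segments or text strings with this code, you may want to use gsn_polyline instead of gsn_add_polyline, gsn_polymarker instead of gsn_add_polymarker, and gsn_text instead of gsn_add_text. The gsn_add_xxx functions attach each primitive to the plot, whereas the other versions draw it immediately, which is generally faster when there are many primitives.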
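
The difference between the two calling styles is sketched below on a default global map. The coordinates and the output name "polyline_demo" are made up for illustration.

```ncl
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"

wks = gsn_open_wks("png", "polyline_demo")

res          = True
res@gsnDraw  = False          ; don't draw the map yet
res@gsnFrame = False          ; don't advance the frame yet
map = gsn_csm_map(wks, res)

lnres             = True
lnres@gsLineColor = "red"

xline = (/-120., -100., -80./)
yline = (/  30.,   45.,  35./)

; Style 1: attach the line to the map; it is drawn whenever the map is drawn.
line_id = gsn_add_polyline(wks, map, xline, yline, lnres)
draw(map)

; Style 2: draw a line immediately on top of the already-drawn map.
lnres@gsLineColor = "blue"
gsn_polyline(wks, map, xline, yline - 10., lnres)

frame(wks)                    ; advance the frame once everything is drawn
```

With the immediate style, the map has to be drawn before the primitives and the frame advanced only afterwards; otherwise the lines end up on the wrong page or underneath the map.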

shapefiles_7.ncl: This script uses several shapefiles to draw river basins, points of interest, and indigenous areas in Australia. The shapefiles were downloaded from several locations. See the comments in the shapefiles_7.ncl script for details.
shapefiles_7_new.ncl: This script is almost identical to the previous "shapefiles_7.ncl", except it uses new functions added to NCL V6.1.0 to attach the shapefile data: gsn_add_shapefile_polygons, gsn_add_shapefile_polylines, and gsn_add_shapefile_polymarkers.
shapefiles_8.ncl: This script uses four shapefiles downloaded from http://www.gadm.org/country to draw various administrative areas for India.
shapefiles_8_new.ncl: This script is almost identical to the previous "shapefiles_8.ncl", except it uses a new function added to NCL V6.1.0 to attach the polylines: gsn_add_shapefile_polylines.
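
For reference, a call to gsn_add_shapefile_polylines might look like the following sketch. It assumes NCL V6.1.0 or later and reuses "states.shp" from Example 1 purely as a convenient file; the output name "shapefile_outline" is hypothetical.

```ncl
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"

wks = gsn_open_wks("png", "shapefile_outline")

res          = True
res@gsnDraw  = False
res@gsnFrame = False
map = gsn_csm_map(wks, res)

lnres             = True
lnres@gsLineColor = "blue"

; Attach every polyline in the shapefile to the map in one call.
; Keep the returned id in a variable for as long as the plot is needed.
poly_id = gsn_add_shapefile_polylines(wks, map, "states.shp", lnres)

draw(map)
frame(wks)
```

This replaces the explicit loops over geometry and segments used in the older versions of these scripts.
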
shapefiles_9.ncl: This uses the shapefile from Example 4. It demonstrates calculating an areally weighted mean time series for an irregularly shaped region. As in Example 4, an array containing only the desired locations inside the shapefile outline is created. All other grid points are set to _FillValue. Specifically, this computes the areal mean time series of monthly precipitation for the Mississippi River Basin. The data are monthly GPCP data.
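
One common way to compute such a mean is to weight each grid point by the cosine of its latitude and pass the masked array to wgt_areaave. Whether shapefiles_9.ncl does exactly this is an assumption; the sketch below uses synthetic data in which some rows are set to _FillValue to stand in for points outside the basin.

```ncl
ntim = 12
nlat = 51
nlon = 81
lat  = fspan(25., 50., nlat)

; Synthetic (time, lat, lon) precipitation; new() attaches a default _FillValue.
prc = new((/ntim, nlat, nlon/), float)
prc = 1.0
prc(:, 0:9, :) = prc@_FillValue     ; pretend these rows lie outside the region

; Cosine-of-latitude weights; cos() expects radians.
rad  = 4.0 * atan(1.0) / 180.0
clat = cos(lat * rad)

; opt = 0: average over whatever non-missing points are available.
prc_avg = wgt_areaave(prc, clat, 1.0, 0)

printVarSummary(prc_avg)            ; one value per time step
```
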
mask_12.ncl: This example shows how to use a shapefile that contains polygon outlines to create a data mask for a variable with 1D coordinate arrays. The mask array is then written to a copy of the input file.

In this case, the shapefile contains coastal outlines, from which a land mask is created. See the function "create_mask_from_shapefile" in the "mask_12.ncl" script. This function only works for data that has 1D coordinate arrays. You will need to modify it to work with curvilinear or unstructured data.

You should be able to use any shapefile that contains polygon data (point and polyline data won't work) to create the desired mask.

The shapefile used in this example was part of a compressed file, "GSHHS_shp_2.2.0.zip", downloaded from:

http://www.ngdc.noaa.gov/mgg/shorelines/data/gshhs/version2.2.0

You need to uncompress it with the "unzip" command. You can use any of the other shapefiles included in this file, but they may be of higher resolution, in which case creating the mask will take longer.

shapefiles_10.ncl: This example shows how to plot polygon data from a shapefile containing geologic units and structural features in Colorado. The shapefile was downloaded from http://mrdata.usgs.gov/geology/state/state.php?state=CO.

The shapefile data came with a suggested lithologic color map: http://mrdata.usgs.gov/catalog/lithclass-color.php, which shows which colors to use for which rock type. This example draws the lithologic color map on a separate frame, using several labelbars.

In order to use the suggested color map, the gsn_add_shapefile_polygons procedure was copied from the $NCARG_ROOT/lib/ncarg/nclscripts/csm directory, and then modified as needed.

shapefiles_11.ncl: This example shows how to use a shapefile that contains an outline of the United States to create a land mask. The land mask is written to a NetCDF file so you can use it later for masking other variables on the same grid.

Two USA shapefiles were tried with this example. The "gz_2010_us_020_00_5m.shp" shapefile is from http://www.census.gov/geo/www/cob/ and the "nationalp010g.shp" shapefile is from http://www.nationalatlas.gov/atlasftp-1m.html#nationp. The census file is smaller, and hence the script runs much faster with it (17 wall-clock seconds versus 200+ for the National Atlas file). Which file you use depends on how fine your original grid is and how precise a mask you need.
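
Writing the mask out and reusing it later can be sketched as follows. The tiny grid, the variable name land_mask, and the file name "land_mask_demo.nc" are all made up for illustration; in practice the mask would come from the shapefile test described above.

```ncl
nlat = 4
nlon = 6
lat  = fspan(30., 45., nlat)
lon  = fspan(-110., -85., nlon)
lat@units = "degrees_north"
lon@units = "degrees_east"

; Synthetic mask: 1 inside the outline, 0 outside.
land_mask = new((/nlat, nlon/), integer)
land_mask = 0
land_mask(1:2, 2:4) = 1
land_mask!0   = "lat"
land_mask!1   = "lon"
land_mask&lat = lat
land_mask&lon = lon
land_mask@long_name = "land mask (1 = inside outline, 0 = outside)"

fname = "land_mask_demo.nc"
system("rm -f " + fname)            ; create mode fails if the file already exists
fout = addfile(fname, "c")
fout->land_mask = land_mask
delete(fout)                        ; close the file

; Later: read the mask back and apply it to another variable on the same grid.
fin  = addfile(fname, "r")
m    = fin->land_mask
data = new((/nlat, nlon/), float)
data = 1.0
data_masked = where(m .eq. 1, data, data@_FillValue)
data_masked@_FillValue = data@_FillValue
print("points kept: " + num(.not. ismissing(data_masked)))
```

Because creating the mask is the slow step, saving it once and reading it back avoids repeating the shapefile test for every variable.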

shapefiles_12.ncl / shapefiles_12b.ncl: This example shows how to modify gsn_add_shapefile_polylines to have it draw only a subset of the features in the given shapefile.

See references to "feature_names", "vname", and "vlist" in the "gsn_add_shapefile_polylines_subset" function at the top of the NCL script.

The shapefile contains Interstate Highways, and only the primary interstate highways (I-5, I-82, and I-90) in the western United States are drawn. (Special thanks to Dave Allured for his improvement of this script to correctly plot all highway segments with complex entries, like "I- 5, US 30".)

The shapefile was downloaded from http://www.nws.noaa.gov/geodata/catalog/transportation/html/interst.htm.