Better than free: Data explorations with public data and software tools

2016 SEA Conference Tutorial


Wednesday, April 6, 2016

This is a help page for students attending this one-day tutorial. The focus will be on:

  • Downloading data from NCAR's Research Data Archive (RDA)
  • Using free tools to examine, read, analyze, and visualize data

Useful links:



Useful RDA BLOG articles



Free Tools

Scripting languages
  • NCL - Tailored for climate sciences; File I/O, computations, 2D visualizations
  • Python - Popular programming language spanning numerous disciplines
GUI-based quick look tools
  • ncview - Quick look at NetCDF data
  • Panoply - Quick look at NetCDF (and other formats)
Python tools
  • ESMPy - Python interface to ESMF regridding utility
  • Integrated Ocean Observing System (IOOS) - Collection of software to support IOOS
  • Iris - Data abstraction library for meteorology and climatology
  • matplotlib - 2D plotting library
  • numpy - Fundamental package for scientific computing with Python
  • pandas - Data analysis tool
  • PyAverager - Parallel tool for computing averages from climate model output
  • PyNGL - 2D plotting library (based on NCL's graphical library)
  • PyNIO - Read/write access to a variety of data formats (based on NCL's file I/O library)
  • PyReshaper - Parallel tool for converting NetCDF files from time-slice (history) format to time-series (single-variable) format
  • xarray - Extension of pandas for multi-dimensional arrays
Command line tools
  • convert - Manipulate graphical images for the web, PowerPoint, presentations, etc. (part of the ImageMagick tool suite)
  • NetCDF Operators (NCO) - Operators to manipulate and analyze (mainly) NetCDF data
  • Climate Data Operators (CDO) - Operators to manipulate and analyze climate and NWP model data
  • wgrib - Manipulate, inventory, and decode GRIB files
  • wgrib2 - Utility for GRIB2 files: "Four drawers of kitchen utensils as well as the microwave and blender"
CF Compliance Checkers
  • cf-checker.pl - A Perl-based CF compliance checker
Python platforms
  • Anaconda - Data science platform for the Python community; provides many prebuilt Python packages (NumPy, SciPy, pandas, Jupyter, etc.) and includes the "conda" package installer
  • miniconda - "Mini" Anaconda (gives you Python and conda)
  • Jupyter Notebook (formerly IPython Notebook) - A web application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text. Works with Python, R, and other languages.



Sample NCL and Python scripts to use

Most of the datasets below were downloaded from the RDA, and we've included their identifier, "dsnnn.n".
You can search for the datasets in this group by typing the "nnn.n" identifier in the RDA's "Go to Dataset" search window.
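
As a small illustration of the "dsnnn.n" convention, the sketch below pulls the "nnn.n" search term out of a dataset label. The regular expression (three digits, a dot, one digit) is an assumption based only on the identifiers listed on this page, and the function name is made up for this example.

```python
from __future__ import print_function  # keeps print() working under Python 2.7
import re

# Assumed pattern: "ds" + three digits + "." + one digit, e.g. "ds604.0".
DATASET_ID = re.compile(r"\bds(\d{3}\.\d)\b")


def rda_search_terms(text):
    """Return the "nnn.n" part of every dataset identifier found in text."""
    return DATASET_ID.findall(text)


labels = [
    "ds604.0 - cfdda_1985013123.v2.nc",
    "ds083.2 - fnl_20070703_06_00.grib1",
    "ruc.grb",  # not from the RDA, so no identifier is expected
]
for label in labels:
    print(label, "->", rda_search_terms(label))
```

The returned strings are what you would type into the "Go to Dataset" window.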

ds604.0 - cfdda_1985013123.v2.nc
  rectilinear, 450 x 900 (lat x lon)
  • cfdda_plot_var.ncl - plots the given variable using NCL
  • cfdda_plot_var_mpl.py - plots the given variable using matplotlib
  • cfdda_plot_var_ngl.py - plots the given variable using PyNGL
ds083.2 - fnl_20070703_06_00.grib1
  rectilinear, 181 x 360 (lat x lon)
  • fnl_grib1_plot_var.ncl - plots the given variable using NCL
  • plot_all_ngl.py - plots all variables in the file using PyNGL (needs create_html.py)
ds083.3 - gdas1.fnl0p25.2016030100.f00.grib2
  rectilinear, 720 x 1440 (lat x lon)
  • gdas1_fnl_grib2_plot_var.ncl - plots the given variable in the file using NCL
  • plot_all_ngl.py - plots all variables in the file using PyNGL (needs create_html.py)
ds728.3 - gpcp_1dd_v1.2_p1d.201309.nc
  rectilinear, 30 x 180 x 360 (time x lat x lon)
  • gpcp_plot_precip.ncl - plots precipitation using NCL (global map)
  • gpcp_plot_precip_subset.ncl - plots precipitation over the United States
  • gpcp_xarray_example.py - uses "xarray" to compute the mean across time, plots with matplotlib
ds728.2
  data read from a THREDDS data server
  • gpcp_opendap_mpl.py - reads a file from an OPeNDAP server and plots precipitation using matplotlib
  • gpcp_opendap_plot_precip.ncl - reads a file from the OPeNDAP server (that Tom demoed) and plots precipitation using NCL
ds510.0 - gdas1.t18z.sfluxgrbf03.grib2
  rectilinear, 880 x 1760 (lat x lon)
  • grib2_plot_var.ncl - plots the given variable using NCL (needs create_html.ncl)
  • grib2_plot_all_mpl.py - plots all 2D variables in a GRIB2 file using matplotlib (needs create_html.py)
  • grib2_plot_var_mpl.py - plots the given variable using matplotlib
  • grib2_plot_var_ngl.py - plots the given variable using PyNGL
ds463.4 - hadisd.1.0.2.2013f.724957-99999.nc
  296096 time steps
  • hadley_timeseries_plot.ncl - creates a timeseries plot of temperature using NCL
  • hadley_timeseries_plot_subset.ncl - same, but with a subset of the time range
ruc.grb
  old file (not from the RDA), curvilinear, 113 x 151 (lat x lon)
  • grib1_plot_var_mpl.py - plots the given 2D variable using matplotlib
  • grib1_plot_var_ngl.py - plots the given 2D variable using PyNGL
  • grib1_plot_all_ngl.py - plots all 2D variables and creates a simple HTML file with sample plots
  • grib1_plot_all_mpl.py - plots all 2D variables and creates a simple HTML file with sample plots (needs create_html.py)
  • grib1_plot_uv_mpl.py - vector plot using matplotlib / basemap
  • grib1_plot_uv_ngl.py - vector plot using PyNGL
  • grib1_plot_uv.ncl - vector plot using NCL
MET9_IR108_cosmode_0909210000.grb2
  sample GRIB2 (not from RDA), rotated grid, curvilinear, 461 x 421 (lat x lon)
  • grib2_plot_rotated_ngl.py - plots data in rotated projection using PyNGL
  • grib2_plot_unrotated_ngl.py - plots non-rotated data using PyNGL
  • grib2_plot_unrotated_mpl.py - plots non-rotated data using matplotlib
Jupyter (IPython) Notebooks
  To run a notebook, type "ipython notebook" or "jupyter notebook" on the UNIX command line. A Jupyter browser window should open with a list of your *.ipynb files. Select one of these files to start working in your notebook.
  • fillplot_demo.ipynb - a sample notebook that creates some XY plots using dummy data. This is a good demo for playing with graphics inside a notebook.
  • pynio_demo_examine_files.ipynb - a notebook for opening various files (NetCDF, GRIB) and examining them using PyNIO
  • pynio_demo_query_file_contents.ipynb - a notebook for opening various files (NetCDF, GRIB) and querying the file contents
  • pynio_demo_read_vars.ipynb - a notebook for reading variables from various files (NetCDF, GRIB)
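
Several of the "plot all" scripts above depend on a create_html helper to produce a simple HTML file with sample plots. The tutorial's own create_html.py is not reproduced here; the sketch below only illustrates the general idea of turning a list of image files into a minimal gallery page. The function name, page layout, and PNG file names are all hypothetical.

```python
from __future__ import print_function  # keeps print() working under Python 2.7

try:
    from html import escape  # Python 3
except ImportError:
    from cgi import escape   # Python 2.7


def build_gallery_html(image_files, title="Sample plots"):
    """Return a minimal HTML page showing each image under its file name."""
    safe_title = escape(title, True)
    lines = [
        "<html>",
        "<head><title>{0}</title></head>".format(safe_title),
        "<body>",
        "<h1>{0}</h1>".format(safe_title),
    ]
    for name in image_files:
        safe_name = escape(name, True)  # also escapes quotes for the attribute
        lines.append("<h2>{0}</h2>".format(safe_name))
        lines.append('<img src="{0}" alt="{0}">'.format(safe_name))
    lines.extend(["</body>", "</html>"])
    return "\n".join(lines)


# Hypothetical image names; a real script would list the plots it just wrote.
page = build_gallery_html(["TMP_sfc.png", "UGRD_10m.png"], title="ruc.grb")
print(page)
```

Writing the returned string to a file such as index.html in the same directory as the images would let you browse the plots in a web browser.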



NCL and Python tools on yellowstone

If you have a yellowstone account:

ssh -Y yellowstone.ucar.edu
execgy
module load gnu/4.8.2
module load ncl
module load python/2.7.7
module load all-python-libs

For the latest PyNIO and PyNGL, set PYTHONPATH. In csh/tcsh:

setenv PYTHONPATH /glade/apps/opt/PyNGL/1.5.0/gnu/4.8.2/PyNGL:/glade/apps/opt/PyNIO/1.5.0/gnu/4.8.3/PyNIO:$PYTHONPATH

or, in bash/sh:

export PYTHONPATH=/glade/apps/opt/PyNGL/1.5.0/gnu/4.8.2/PyNGL:/glade/apps/opt/PyNIO/1.5.0/gnu/4.8.3/PyNIO:$PYTHONPATH
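
After setting PYTHONPATH, it can be useful to confirm that every entry points at a directory that actually exists, since a mistyped path is silently ignored by Python. The following is a minimal sketch using only the standard library; the function names are made up for this example.

```python
from __future__ import print_function  # keeps print() working under Python 2.7
import os


def split_search_path(value):
    """Split a PYTHONPATH-style string into its non-empty entries."""
    return [entry for entry in value.split(os.pathsep) if entry]


def missing_entries(value):
    """Return the entries that are not existing directories."""
    return [e for e in split_search_path(value) if not os.path.isdir(e)]


# Report the status of each entry in the current PYTHONPATH (if any).
current = os.environ.get("PYTHONPATH", "")
for entry in split_search_path(current):
    status = "ok" if os.path.isdir(entry) else "MISSING"
    print("{0:8s}{1}".format(status, entry))
```

Any entry reported as MISSING is worth rechecking against the paths given above.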



Software preparation for tutorial

  • Sign up for an account on NCAR's Research Data Archive (RDA) by clicking on the "Register Now" link at the top of the page. It can take a few hours for the registration to become active.

  • Install Anaconda - This will give you a scientific python environment and the "conda" command for installing other software.

    Go to http://www.continuum.io/downloads and select the option appropriate for your system. Choose the Python 2.7 option (and not Python 3.x).

  • Install NCL, PyNIO, PyNGL, and other tools using "conda".

    This requires that the previous step (installing Anaconda) be complete.

    First open a terminal window where you can type UNIX commands, and type:

    conda create --name ncl_test --channel dbrown --channel khallock ncl pyngl pynio matplotlib basemap ipython jupyter
    

    This creates a conda environment called "ncl_test" for you to use during the tutorial and installs the listed software into it.

    To activate the "ncl_test" environment, type (from bash/sh):

    source activate ncl_test
    

    You can test NCL and matplotlib quickly by typing:

    python -c "from mpl_toolkits.basemap import Basemap"
    ncl -V ; ng4ex xy01n -clean
    

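    The python command should finish without printing anything if basemap was imported correctly. "ncl -V" will echo a version string, something like "r16461", and "ng4ex xy01n -clean" should pop up a window with an XY plot. Click on the window with your left mouse button to make it go away.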

    To test that you can run a jupyter notebook, type:

    jupyter notebook
    

  • Optional: install wgrib and wgrib2.

  • Install Panoply

  • Optional: install ncview

  • Optional: install netcdf4-python

    conda install netcdf4
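
Once the steps above are done, a quick way to see which Python packages are usable in the active environment is to try importing each one. The sketch below assumes the usual import names (Ngl for PyNGL, Nio for PyNIO, netCDF4 for netcdf4-python); the function name is made up for this example, and a False result simply means the import failed in the current environment.

```python
from __future__ import print_function  # keeps print() working under Python 2.7
import importlib


def check_imports(module_names):
    """Try to import each module; map its name to True (ok) or False (failed)."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except Exception:
            # ImportError, or an error raised while the module loads
            # (for example a missing shared library).
            results[name] = False
    return results


# Assumed import names for the packages installed in the steps above.
TUTORIAL_MODULES = ["matplotlib", "mpl_toolkits.basemap", "Ngl", "Nio", "netCDF4"]

for name, ok in sorted(check_imports(TUTORIAL_MODULES).items()):
    print("{0:24s}{1}".format(name, "ok" if ok else "NOT AVAILABLE"))
```

Run it after "source activate ncl_test" so that the check applies to the tutorial environment rather than your default Python.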