[newbie] vectorized way to write data variable to CSV?

From: Tom Roche <Tom_Roche_at_nyahnyahspammersnyahnyah>
Date: Tue Feb 05 2013 - 12:08:03 MST

What's the most performant way to write both dimension and data values
of a datavar to CSV? Why I ask:

I have a large (> 1 GB) netCDF file. I'm only interested in one
datavar in the file, so I used NCL to "prune" it down to one datavar,
and write that "pruned output" to a new netCDF file: see


The pruned datavar is much more tractable: 13 MB, 3.3 Mtuples, see


However I want to further operate on that data in R. While I could
work directly with the netCDF, I find CSV easier to pull into R ...
but my current code for doing that (~100 lines starting @ line 219 of
prune_IOAPI.ncl) is

* very unvectorized: it's a for-loop writing tuples to a 1D vector of
  (text) lines. The tuples are values of the pruned datavar
  sg_datavar_grid(t,l,r,c), and each line has the form

sprinti("%i", r) + "," + \
sprinti("%i", c) + "," + \
sprinti("%i", l) + "," + \
sprinti("%i", t) + "," + \
sprintf("%f", sg_datavar_grid(t,l,r,c))

* not very performant: on an HPC cluster node (admittedly running in
  home, not as a job) writing the 3.3 Mtuples (66 MB) to a string array

lines = new( (/ 3293784 /), string)

  (i.e., no file I/O, just in-memory) requires 19.22 min.
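
For comparison, a loop-free form of the same per-line expression might
look like the sketch below. It is not taken from prune_IOAPI.ncl: the
small random array, its dimension sizes, and the output filename are
stand-ins, and the subscripts it writes are 0-based (add 1 to each if
the CSV should be 1-based). The idea is to let ind_resolve compute the
(t,l,r,c) subscripts of every flat index in one call, so that sprinti
and sprintf operate on whole arrays rather than on one tuple at a time.

```ncl
; Sketch only: a small random array stands in for the real pruned datavar.
; Dimension order (t,l,r,c) follows the post; the sizes are made up.
sg_datavar_grid = random_uniform(0., 1., (/ 2, 3, 4, 5 /))

dims = dimsizes(sg_datavar_grid)      ; (/ nt, nl, nr, nc /)
n    = product(dims)                  ; total number of tuples

; ind_resolve maps each flat (row-major) index to its (t,l,r,c) subscripts,
; returning an n x 4 array; ndtooned flattens the values in the same order.
subs = ind_resolve(ispan(0, n - 1, 1), dims)
vals = ndtooned(sg_datavar_grid)

; Same r,c,l,t,value layout as the loop body, built for all tuples at once.
lines = sprinti("%i", subs(:,2)) + "," + \
        sprinti("%i", subs(:,3)) + "," + \
        sprinti("%i", subs(:,1)) + "," + \
        sprinti("%i", subs(:,0)) + "," + \
        sprintf("%f", vals)

asciiwrite("sg_datavar_grid.csv", lines)   ; hypothetical output filename
```

Whether this is actually faster on the 3.3 Mtuple array is untested
here; it trades the loop for a few temporary arrays of length n.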


So I suspect my code can be improved. Is there a better way to do this?

TIA, Tom Roche <Tom_Roche@pobox.com>
ncl-talk mailing list
Received on Tue Feb 5 12:08:30 2013