Is there any method to write the big 3D (354 station * 14610 day * 23 variable) array out into a file?

From: Wen.J.Qu <wen.j.qu_at_nyahnyahspammersnyahnyah>
Date: Thu Sep 13 2012 - 09:51:03 MDT

Hi, Mary and Dennis,

I am wondering if it is any method to write the big 3D (354 station * 14610 day * 23 variable) array include all the data of my interest out into a file. Because I have worked on the entire array for a while (including find the incontinous days and fill them, as well as organize the data of all the years for all the stations together) and find it is quite convenient to work on the entire array to do further analysis.

Could you please suggest several more lines in the attahced code and help me to do that?

Thanks a lot.

Shawn




Wen.J.Qu
2012-09-13



发件人: Wen.J.Qu
发送时间: 2012-09-12 15:43:35
收件人: Mary Haley
抄送: ncl-talk; shea
主题: Re: Re: Fill the missing days and read data into one array - problem solved

Hi, Mary

I did it to read all the data into one big array, using the code attached.

But I finally realized that it may not be a smart idea to do that, it takes more than one hour to read into the data. So I think what I need is to find the appropriate file and read the data every time I need it.

Fortunately I have learned how to do this by emails from Dennis and you before.

Thanks a lot for your kind help.

Shawn




Wen.J.Qu
2012-09-12



发件人: Mary Haley
发送时间: 2012-09-11 21:05:59
收件人: Wen.J.Qu@gmail.com
抄送: ncl-talk
主题: Re: Fill the missing days and read data into one array - problem solved

You could use the new "reshape" function in 6.1.0-beta to reshape your 2D array to a 3D array. I believe all you need is:

  var2d_dims = dimsizes(var2d)
  var3d_dims = new(3,typeof(var2d_dims))

  var3d_dims(0) = 354
  var3d_dims(1) = var2d_dims(0)/354 ; the assumption is that var2d_dims(0) is evenly divisible by 354
  var3d_dims(2) = var2d_dims(1)

  var3d = reshape(var2d,var3d_dims)

If you don't have V6.1.0-beta, then you can use ndtooned and onedtond:

  var3d = onedtooned(ndtooned(var2d),var3d_dims))

--Mary

On Sep 9, 2012, at 5:28 PM, Wen.J.Qu wrote:

> Hi, Mary
>
> I solved the problem by tranpose the "data" array, which should be 365*23 or 366*23 to correctly use the array_append_record function, but I originally setted the dimension to 23*365 or 23*366 and leaded to the error. An update of the code is also attached.
>
> But I noticed that the current code will get a 2D array with the dimension of (354 station*40 year*365/366 day)*23 varible, which is not the ideal array I want. I will try to construct a 3D array with the dimension of 354 station*(40 year*365/366 day)*23 varible. I would work on this and try to struggle out.
>
> I will let you know my further progress. Thanks a lot.
>
> All my best!
>
> Wen.J.Qu
> 2012-09-09
> 发件人: Wen.J.Qu
> 发送时间: 2012-09-07 17:24:14
> 收件人: haley
> 抄送: ncl-talk
> 主题: Fill the missing days and read data into one array
>
> Hi, Mary,
>
> I am trying to construct code based on your script, to read the data of all the stations for all years into one array. I am firstly reading the data of one station for one year into array "data", then trying to array_append_record to expand the array to the data of one station for all years into array "data_new", thirdly trying to array_append_record to expand the array to the data of all the stations for all years into array "data_all".
>
> In doing this, I also considering that there are some entire months or entires years missing for some stations, so I change the code a little bit to deal with that.
>
> But when I run the code, it gets an error that "array_append_record: non-record dimensions must be the same size". I can not find where the problem is. Would you please help me with that? Thanks a lot in advance.
>
> The code is attached.
>
> All my best!
>
> Shawn
>
> Wen.J.Qu
> 2012-09-07
>
>
>
> ########################################################################################################
>
>
> load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"
>
> ;----------------------------------------------------------------------
> ; Function to create new yyyymmdd array based on existing
> ; yyyymmdd array that is missing some days.
> ;
> ; If the existing array goes from 19720423 to 19750813, then the
> ; new array will cover every day from 19720401 to 19750831.
> ;----------------------------------------------------------------------
> function create_new_yyyymmdd(year,month)
> local ntotal, ny, nm, ey, em, ict, nvals
> begin
> ;
> ; yearmoday has several days missing. We want to generate a new
> ; yearmoday array with no gaps.
> ;
> ;---Get largest size we need for new yearmoda array.
> nvals = dimsizes(year)
> ntotal = (year(nvals-1)-year(0)+1)*366
> yearmoda_new = new(ntotal,string)
>
> ny = year(0)
> nm = month(0)
> ey = year(nvals-1)
> em = month(nvals-1)
>
> ict = 0
> do while((ny.lt.year(nvals-1)).or.(ny.eq.ey.and.nm.le.em))
> dim = days_in_month(ny,nm)
> yearmoda_new(ict:ict+dim-1) = tostring((ny*10000)+(nm*100) + \
> ispan(1,dim,1))
>
> ict = ict + dim ; Increment yearmoda_new counter
>
> if(nm.lt.12) then ; Increment year and month as necessary
> nm = nm + 1
> else
> ny = ny+1
> nm = 1
> end if
> end do
>
> return(yearmoda_new(0:ict-1))
> end
>
> ;----------------------------------------------------------------------
> ; Create new (larger) array given the size, and a list of indexes
> ; to place the new values.
> ;----------------------------------------------------------------------
> function create_new_array(x,nsize,ii)
> begin
> if(isatt(x,"_FillValue")) then
> x_new = new(nsize,typeof(x),x@_FillValue)
> else
> x_new = new(nsize,typeof(x))
> end if
> x_new(ii) = x
> return(x_new)
> end
>
> ;----------------------------------------------------------------------
> ; Main code
> ;----------------------------------------------------------------------
>
> begin
>
> diri = "/work/jwang7/quwj/parallel/data/gsod/" ; data directory
>
> station_list = diri+"/station_list.txt" ; station_list contains station numbers of the station of interest
> STA = asciiread(station_list,-1,"integer")
>
> ;---Find the appropriate ASCII file to read in
>
>
> do sta_number_loop = 450000,600000,1 ; the station numbers of the station of interest are within this range
>
> if(any(STA.eq.sta_number_loop)) then
>
> sta_number = sta_number_loop
>
> ;Debug print
> printVarSummary(sta_number)
> print(sta_number)
>
> do year_loop = 1973,2012,1
>
> Year = year_loop
> fili = systemfunc ("cd "+diri+sta_number+"/" + " ; ls *")
> sta = toint( str_get_cols(fili, 0, 5) )
> YEAR = toint( str_get_cols(fili,13,16) )
>
> ;Debug print
> printVarSummary(Year)
> print(Year)
> printVarSummary(fili)
> print(fili)
> printVarSummary(sta)
> print(sta)
> printVarSummary(YEAR)
> print(YEAR)
>
> Index = ind(sta.eq.sta_number .and. YEAR.eq.Year)
>
>
> if(ismissing(Index)) then
>
> delete(fili)
> delete(sta)
> delete(YEAR)
> delete(Index)
>
> ;Creat a new array for the station that year with missing values
>
> nDayYear = 365
> if (isleapyear(Year)) then
> nDayYear = 366
> end if
>
> Yearmoda = yyyymmdd_time(Year,Year,"string")
> delete(Year)
>
> ;Break yearmoda into individual year, month and day
> year = toint(str_get_cols(Yearmoda,0,3))
> month = toint(str_get_cols(Yearmoda,4,5))
> day = toint(str_get_cols(Yearmoda,6,7))
> year@calendar = "standard"
>
> yearmoda_new = Yearmoda
> stn_new = new(nDayYear,"integer",999999)
> stn_new = sta_number
> wban_new = new(nDayYear,"integer",99999)
> temp_new = new(nDayYear,"float",9999.9)
> dewp_new = new(nDayYear,"float",9999.9)
> rh_new = new(nDayYear,"float",99.99)
> slp_new = new(nDayYear,"float",9999.9)
> stp_new = new(nDayYear,"float",9999.9)
> visib_new = new(nDayYear,"float",999.9)
> wdsp_new = new(nDayYear,"float",999.9)
> maxspd_new = new(nDayYear,"float",999.9)
> gust_new = new(nDayYear,"float",999.9)
> maxtemp_new = new(nDayYear,"float",9999.9)
> mintemp_new = new(nDayYear,"float",9999.9)
> prcp_new = new(nDayYear,"float",99.99)
> sndp_new = new(nDayYear,"float",999.9)
> frshtt_new = new(nDayYear,"string")
> fog_new = toint(str_get_cols(frshtt_new,0,0))
> rain_new = toint(str_get_cols(frshtt_new,1,1))
> snow_new = toint(str_get_cols(frshtt_new,2,2))
> hail_new = toint(str_get_cols(frshtt_new,3,3))
> thunder_new = toint(str_get_cols(frshtt_new,4,4))
> tornado_new = toint(str_get_cols(frshtt_new,5,5))
> yearmoda_new_int = toint(yearmoda_new)
> frshtt_new_int = toint(frshtt_new)
>
> ;Construct a new array include all the varibles
> data = (/stn_new,wban_new,yearmoda_new_int,temp_new,\
> dewp_new,rh_new,slp_new,stp_new,visib_new,\
> wdsp_new,maxspd_new,gust_new,maxtemp_new,\
> mintemp_new,prcp_new,sndp_new,frshtt_new_int,\
> fog_new,rain_new,snow_new,hail_new,thunder_new,\
> tornado_new/)
>
> ;---Release varible
> delete(year)
> delete(month)
> delete(day)
> delete(Yearmoda)
> delete(yearmoda_new)
> delete(yearmoda_new_int)
> delete(stn_new)
> delete(wban_new)
> delete(temp_new)
> delete(dewp_new)
> delete(rh_new)
> delete(slp_new)
> delete(stp_new)
> delete(visib_new)
> delete(wdsp_new)
> delete(maxspd_new)
> delete(gust_new)
> delete(maxtemp_new)
> delete(mintemp_new)
> delete(prcp_new)
> delete(sndp_new)
> delete(frshtt_new)
> delete(frshtt_new_int)
> delete(fog_new)
> delete(rain_new)
> delete(snow_new)
> delete(hail_new)
> delete(thunder_new)
> delete(tornado_new)
>
> else
>
> ;Debug print
> print (Index)
>
> ;Read the file's data
>
> ;Debug print
> printVarSummary(fili(Index))
> print(fili(Index))
>
> filename = diri+sta_number+"/"+fili(Index)
>
> ;Release varibles
> delete(fili)
> delete(sta)
> delete(YEAR)
> delete(Index)
>
> ;Creat a new wholes array for the station that year
>
> nDayYear = 365
> if (isleapyear(Year)) then
> nDayYear = 366
> end if
>
> Yearmoda = yyyymmdd_time(Year,Year,"string")
> delete(Year)
>
> ;Break yearmoda into individual year, month and day
> year = toint(str_get_cols(Yearmoda,0,3))
> month = toint(str_get_cols(Yearmoda,4,5))
> day = toint(str_get_cols(Yearmoda,6,7))
> year@calendar = "standard"
>
> ;---Read data file as array of strings
> lines = asciiread(filename,-1,"string")
> nlines = dimsizes(lines)
>
> ;
> ; Parse strings to get values. Don't use header line [line(0)]
> ; Note that fields 5, 7, 9, 11, 13, and 15 are skipped.
> ; They don't appear to have headers associated with them.
> ;
> stn = toint(str_get_field(lines(1:),1," "))
> wban = toint(str_get_field(lines(1:),2," "))
> yearmoda = str_get_field(lines(1:),3," ")
> temp = tofloat(str_get_field(lines(1:),4," "))
> dewp = tofloat(str_get_field(lines(1:),6," "))
> slp = tofloat(str_get_field(lines(1:),8," "))
> stp = tofloat(str_get_field(lines(1:),10," "))
> visib = tofloat(str_get_field(lines(1:),12," "))
> wdsp = tofloat(str_get_field(lines(1:),14," "))
> maxspd = tofloat(str_get_field(lines(1:),16," "))
> gust = tofloat(str_get_field(lines(1:),17," "))
> maxtemp = tofloat(str_get_field(lines(1:),18," "))
> mintemp = tofloat(str_get_field(lines(1:),19," "))
> prcp = tofloat(str_get_field(lines(1:),20," "))
> sndp = tofloat(str_get_field(lines(1:),21," "))
> frshtt = str_get_field(lines(1:),22," ")
>
> ; Convert temperature from Fahrenheit to Celsius.
> temp = (temp-32)*5/9
> ; Convert dew point temperature from Fahrenheit to Celsius.
> dewp = (dewp-32)*5/9
> ; Calculate relative humidity (%).
> rh = 100*(((112-0.1*temp+dewp)/(112+0.9*temp))^8)
> ; convert visib from mile to kilometer.
> visib = visib*1.60934398
> ; convert wdsp from knot to m/s.
> wdsp = wdsp*0.51444
> ; convert maxspd from knot to m/s.
> maxspd = maxspd*0.51444
> ; convert gust from knot to m/s.
> gust = gust*0.51444
> ; Convert maximum temperature from Fahrenheit to Celsius.
> maxtemp = (maxtemp-32)*5/9
> ; Convert minimum temperature from Fahrenheit to Celsius.
> mintemp = (mintemp-32)*5/9
> ; convert prcp from inch to millmeter.
> prcp = prcp*25.4
> ; convert sndp from inch to millmeter.
> sndp = sndp*25.4
>
> nvals = dimsizes(stn)
>
> ;---Set some missing values where appropriate
> stn@_FillValue = 999999
> wban@_FillValue = 99999
> temp@_FillValue = 9999.9
> dewp@_FillValue = 9999.9
> rh@_FillValue = 99.99
> slp@_FillValue = 9999.9
> stp@_FillValue = 9999.9
> visib@_FillValue = 999.9
> wdsp@_FillValue = 999.9
> maxspd@_FillValue = 999.9
> gust@_FillValue = 999.9
> maxtemp@_FillValue = 9999.9
> mintemp@_FillValue = 9999.9
> prcp@_FillValue = 99.99
> sndp@_FillValue = 999.9
>
> ;---Debug prints
> print("stn min/max = " + min(stn) + "/" + max(stn))
> print("wban min/max = " + min(wban) + "/" + max(wban))
> print("temp min/max = " + min(temp) + "/" + max(temp))
> print("dewp min/max = " + min(dewp) + "/" + max(dewp))
> print("rh min/max = " + min(rh) + "/" + max(rh))
> print("slp min/max = " + min(slp) + "/" + max(slp))
> print("stp min/max = " + min(stp) + "/" + max(stp))
> print("visib min/max = " + min(visib) + "/" + max(visib))
> print("wdsp min/max = " + min(wdsp) + "/" + max(wdsp))
> print("maxspd min/max = " + min(maxspd) + "/" + max(maxspd))
> print("gust min/max = " + min(gust) + "/" + max(gust))
> print("maxtemp min/max = " + min(maxtemp) + "/" + max(maxtemp))
> print("mintemp min/max = " + min(mintemp) + "/" + max(mintemp))
> print("prcp min/max = " + min(prcp) + "/" + max(prcp))
> print("sndp min/max = " + min(sndp) + "/" + max(sndp))
> printVarSummary(frshtt)
>
> ;---Create new array based on year and month arrays
> yearmoda_new = create_new_yyyymmdd(year,month)
> ntime = dimsizes(yearmoda_new)
>
> print("--------------------------------------------------")
> print("# of days in new arrays will be = " + ntime)
> print("--------------------------------------------------")
>
> ;--Get indices where date value is not missing.
> ii = new(nvals,integer)
> do i=0,nvals-1
> ii(i) = ind(yearmoda(i).eq.yearmoda_new)
> end do
>
> ;---Use indexes to place real values in location of new arrays
> stn_new = create_new_array(stn,ntime,ii)
> wban_new = create_new_array(wban,ntime,ii)
> temp_new = create_new_array(temp,ntime,ii)
> dewp_new = create_new_array(dewp,ntime,ii)
> rh_new = create_new_array(rh,ntime,ii)
> slp_new = create_new_array(slp,ntime,ii)
> stp_new = create_new_array(stp,ntime,ii)
> visib_new = create_new_array(visib,ntime,ii)
> wdsp_new = create_new_array(wdsp,ntime,ii)
> maxspd_new = create_new_array(maxspd,ntime,ii)
> gust_new = create_new_array(gust,ntime,ii)
> maxtemp_new = create_new_array(maxtemp,ntime,ii)
> mintemp_new = create_new_array(mintemp,ntime,ii)
> prcp_new = create_new_array(prcp,ntime,ii)
> sndp_new = create_new_array(sndp,ntime,ii)
> frshtt_new = create_new_array(frshtt,ntime,ii)
> fog_new = toint(str_get_cols(frshtt_new,0,0))
> rain_new = toint(str_get_cols(frshtt_new,1,1))
> snow_new = toint(str_get_cols(frshtt_new,2,2))
> hail_new = toint(str_get_cols(frshtt_new,3,3))
> thunder_new = toint(str_get_cols(frshtt_new,4,4))
> tornado_new = toint(str_get_cols(frshtt_new,5,5))
> printVarSummary(yearmoda_new)
> yearmoda_new_int = toint(yearmoda_new)
> printVarSummary(yearmoda_new_int)
> frshtt_new_int = toint(frshtt_new)
>
> ;Construct a new array include all the varibles
> ;Debug print
> printVarSummary(stn_new)
> printVarSummary(wban_new)
> printVarSummary(yearmoda_new_int)
> printVarSummary(temp_new)
> printVarSummary(dewp_new)
> printVarSummary(rh_new)
> printVarSummary(slp_new)
> printVarSummary(stp_new)
> printVarSummary(visib_new)
> printVarSummary(wdsp_new)
> printVarSummary(maxspd_new)
> printVarSummary(gust_new)
> printVarSummary(maxtemp_new)
> printVarSummary(mintemp_new)
> printVarSummary(prcp_new)
> printVarSummary(sndp_new)
> printVarSummary(frshtt_new_int)
> printVarSummary(fog_new)
> printVarSummary(rain_new)
> printVarSummary(snow_new)
> printVarSummary(hail_new)
> printVarSummary(thunder_new)
> printVarSummary(tornado_new)
> data = (/stn_new,wban_new,yearmoda_new_int,temp_new,\
> dewp_new,rh_new,slp_new,stp_new,visib_new,\
> wdsp_new,maxspd_new,gust_new,maxtemp_new,\
> mintemp_new,prcp_new,sndp_new,frshtt_new_int,\
> fog_new,rain_new,snow_new,hail_new,thunder_new,\
> tornado_new/)
>
> ;---Debug prints
> print("stn_new min/max = " + min(stn_new) + "/" + max(stn_new))
> print("wban_new min/max = " + min(wban_new) + "/" + max(wban_new))
> print("temp_new min/max = " + min(temp_new) + "/" + max(temp_new))
> print("dewp_new min/max = " + min(dewp_new) + "/" + max(dewp_new))
> print("rh_new min/max = " + min(rh_new) + "/" + max(rh_new))
> print("slp_new min/max = " + min(slp_new) + "/" + max(slp_new))
> print("stp_new min/max = " + min(stp_new) + "/" + max(stp_new))
> print("visib_new min/max = " + min(visib_new) + "/" + max(visib_new))
> print("wdsp_new min/max = " + min(wdsp_new) + "/" + max(wdsp_new))
> print("maxspd_new min/max = " + min(maxspd_new) + "/" + max(maxspd_new))
> print("gust_new min/max = " + min(gust_new) + "/" + max(gust_new))
> print("maxtemp_new min/max = " + min(maxtemp_new) + "/" + max(maxtemp_new))
> print("mintemp_new min/max = " + min(mintemp_new) + "/" + max(mintemp_new))
> print("prcp_new min/max = " + min(prcp_new) + "/" + max(prcp_new))
> print("sndp_new min/max = " + min(sndp_new) + "/" + max(sndp_new))
> printVarSummary(frshtt_new)
> printVarSummary(data)
>
> ;---Release varible
> delete(i)
> delete(ii)
> delete(year)
> delete(month)
> delete(day)
> delete(lines)
> delete(nlines)
> delete(stn)
> delete(wban)
> delete(yearmoda)
> delete(Yearmoda)
> delete(temp)
> delete(dewp)
> delete(rh)
> delete(slp)
> delete(stp)
> delete(visib)
> delete(wdsp)
> delete(maxspd)
> delete(gust)
> delete(maxtemp)
> delete(mintemp)
> delete(prcp)
> delete(sndp)
> delete(frshtt)
> delete(yearmoda_new)
> delete(yearmoda_new_int)
> delete(stn_new)
> delete(wban_new)
> delete(temp_new)
> delete(dewp_new)
> delete(rh_new)
> delete(slp_new)
> delete(stp_new)
> delete(visib_new)
> delete(wdsp_new)
> delete(maxspd_new)
> delete(gust_new)
> delete(maxtemp_new)
> delete(mintemp_new)
> delete(prcp_new)
> delete(sndp_new)
> delete(frshtt_new)
> delete(frshtt_new_int)
> delete(fog_new)
> delete(rain_new)
> delete(snow_new)
> delete(hail_new)
> delete(thunder_new)
> delete(tornado_new)
>
> end if
>
> ;Construct a new array for one station all the years
> data_1 = data
> ;Release varible
> delete(data)
>
> if(year_loop.le.1973) then
>
> data_new = data_1
> ;Release varible
> delete(data_1)
>
> else
>
> data_2 = data_new
> ;Release varible "data_new"
> delete(data_new)
> data_new=array_append_record(data_1,data_2,0)
> ;Release varibles
> delete(data_1)
> delete(data_2)
>
> end if
>
> ;Debug print
> printVarSummary(data_new)
>
> end do
>
> ;Construct an entire array for all the stations all the years
> data_all_1 = data_new
> ;Release varible
> delete(data_new)
>
> if(sta_number_loop.le.450070) then
>
> data_all = data_all_1
> ;Release varible
> delete(data_all_1)
>
> else
>
> data_all_2 = data_all
> ;Release varible "data_all"
> delete(data_all)
> data_all=array_append_record(data_all_1,data_all_2,0)
> ;Release varibles
> delete(data_all_1)
> delete(data_all_2)
>
> end if
>
> ;Debug print
> printVarSummary(data_all)
>
> end if
>
> end do
>
> end
> <fill_incontinous_read.ncl >

_______________________________________________
ncl-talk mailing list
List instructions, subscriber options, unsubscribe:
http://mailman.ucar.edu/mailman/listinfo/ncl-talk

Received on Thu Sep 13 09:51:29 2012

This archive was generated by hypermail 2.1.8 : Fri Sep 21 2012 - 16:22:30 MDT