Re: How to check and fill the missing lines with the missing values?

From: Mary Haley <haley_at_nyahnyahspammersnyahnyah>
Date: Fri Aug 24 2012 - 18:30:25 MDT

Shawn,

See the attached file.

I pretty much use str_get_field (and not str_get_cols) to read in all your data. I do use "str_get_cols" to parse the yyyymmdd array into yyyy and mm.

This function also creates a new yyyymmdd array that doesn't contain any missing days, and then creates 15 new arrays (for your 15 other fields) that are the same size, with the appropriate indexes filled in.
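
The attachment is not reproduced in this archive. As a rough illustration of the approach described above (not the attached function itself), a minimal sketch might look like the following. It assumes the records are sorted by date with no duplicate days, uses a hypothetical file name, fills only the TEMP field, and builds the continuous date axis with cd_inv_calendar/cd_calendar, which may differ from what the attachment does.

```ncl
; Hypothetical file name: one header line, then one record per day.
  fname = "gsod_example.txt"
  data  = asciiread(fname, -1, "string")
  recs  = data(1:)                         ; skip the header line

; Field 3 is YEARMODA; split the 8-character string into yyyy, mm, dd.
  ymd_str = str_get_field(recs, 3, " ")
  yyyy    = toint(str_get_cols(ymd_str, 0, 3))
  mm      = toint(str_get_cols(ymd_str, 4, 5))
  dd      = toint(str_get_cols(ymd_str, 6, 7))

; Field 4 is TEMP; 9999.9 is the missing-value flag in these files.
  temp            = tofloat(str_get_field(recs, 4, " "))
  temp@_FillValue = 9999.9

; Convert each date to a day count so that gaps become visible.
  units = "days since 1900-01-01 00:00:00"
  zero  = new(dimsizes(yyyy), integer)
  zero  = 0
  t     = cd_inv_calendar(yyyy, mm, dd, zero, zero, zero, units, 0)
  it    = round(t, 3)                      ; integer day numbers

; Continuous day axis from the first to the last record.
  nrec   = dimsizes(it)
  nfull  = it(nrec-1) - it(0) + 1
  t_full = todouble(ispan(it(0), it(nrec-1), 1))
  t_full@units = units
  ymd_full = cd_calendar(t_full, -2)       ; integer yyyymmdd, no gaps

; New array, all missing, then drop each record into its slot.
  idx            = it - it(0)
  temp_full      = new(nfull, float, 9999.9)
  temp_full(idx) = temp

  print(ymd_full + "  " + temp_full)
```

The same idx array can be reused to fill the other fields: create one new array of length nfull per field and assign into it the same way.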

-Mary

On Aug 24, 2012, at 10:12 AM, Wen.J.Qu wrote:

> Hi, Mary,
>
> Sorry that I did not make the question clear. What I mean is that the time series is not continuous: there are some days without records. I want to find these days and add missing values for the variables on these days.
>
> Attached is an example of the file, which contains data for one month, January. The file should have 31 lines (days) of records, but because some days have no records, we get just 12 days (lines). Could you please give me some suggestions about how to find these missing days and insert lines for them filled with missing values?
>
> Thanks a lot.
>
> Shawn
>
> Wen.J.Qu
> 2012-08-24
> From: Mary Haley
> Sent: 2012-08-24 10:14:40
> To: Wen.J.Qu@gmail.com
> Cc:
> Subject: Re: [ncl-talk] How to check and fill the missing lines with the missing values?
>
> Hi,
>
> I'm taking this offline for a bit.
>
> The example you showed below doesn't have any omitted fields, so I'm not sure what you mean by looking for missing values.
>
> Or, do you mean that every field has a value, but sometimes the value is 99999 or 9999.9, which indicates a missing value?
>
> Can you send me the whole file, if it's not too big?
>
> --Mary
>
> On Aug 24, 2012, at 8:57 AM, Wen.J.Qu wrote:
>
>> Hi, Mary
>>
>> Thanks a lot for your help. Yes, I am reading a lot of ascii files. Below is an example of one file. There is no "..." in the file; the missing days are simply omitted, without blank lines.
>>
>> I want to find these missing days and fill the lines with missing values for the variables. Could you please give me some suggestions about this? Thanks a lot.
>>
>> STN--- WBAN YEARMODA TEMP DEWP SLP STP VISIB WDSP MXSPD GUST MAX MIN PRCP SNDP FRSHTT
>> 106160 99999 19730101 27.1 24 13.4 24 9999.9 0 965.9 4 6.2 24 2.7 24 6.0 999.9 35.6* 19.4* 0.00I 999.9 000000
>> 106160 99999 19730102 27.9 24 16.4 24 9999.9 0 9999.9 0 6.0 24 2.2 24 8.9 999.9 35.6* 23.0* 0.00I 999.9 000000
>> 106160 99999 19730107 30.4 24 28.4 24 9999.9 0 974.4 4 2.4 24 4.5 24 8.0 999.9 37.4* 26.6* 99.99 999.9 110000
>> 106160 99999 19730108 28.5 24 27.4 24 9999.9 0 973.6 7 1.9 24 4.8 24 8.0 999.9 30.2* 28.4* 99.99 999.9 110000
>> 106160 99999 19730109 30.4 24 29.4 24 9999.9 0 971.1 7 0.4 24 1.9 24 6.0 999.9 32.0* 28.4* 99.99 999.9 111000
>> 106160 99999 19730117 29.7 24 27.9 24 9999.9 0 946.9 7 1.8 24 6.7 24 13.0 999.9 30.2* 28.4* 0.00I 999.9 100000
>> 106160 99999 19730118 28.4 24 26.4 24 9999.9 0 9999.9 0 2.5 24 3.7 24 7.0 13.0 9999.9 9999.9 99.99 999.9 111000
>> 106160 99999 19730119 27.7 24 25.3 24 9999.9 0 9999.9 0 1.4 24 1.9 24 3.9 999.9 28.4* 26.6* 99.99 999.9 111000
>> 106160 99999 19730120 30.9 24 29.4 24 9999.9 0 947.4 7 2.1 24 6.0 24 12.0 999.9 35.6* 28.4* 99.99 999.9 111000
>> 106160 99999 19730201 31.0 24 27.8 24 9999.9 0 9999.9 0 5.6 24 1.8 24 4.1 999.9 35.6* 28.4* 0.00I 999.9 100000
>> 106160 99999 19730202 30.9 24 28.9 24 9999.9 0 967.3 6 2.6 24 3.6 24 8.0 999.9 32.0* 28.4* 0.00I 999.9 100000
>>
>>
>> Following is the script I used to read the file.
>>
>> ;Read data into a big 1D string array
>> fname = "data/gsod/1973/726055-99999-1973.op"
>> data = asciiread(fname,-1,"string")
>>
>> ; Count the number of fields, just to show it can be done.
>> nfields = str_fields_count(data(0)," ")
>> print("number of fields = " + nfields)
>>
>> ;
>> ; Skip first row of "data" because it's just a header line.
>> ;
>> ; Use a space (" ") as a delimiter in str_get_field. The first
>> ; field is field=1 (unlike str_get_cols, in which the first column
>> ; is column=0).
>> ;
>> stn = stringtoint(str_get_field(data(1::), 1," "))
>> wban = stringtoint(str_get_field(data(1::), 2," "))
>>
>> yearmoda = stringtoint(str_get_field(data(1::), 3," "))
>> year = stringtoint(str_get_cols(data(1::),14,17))
>> month = stringtoint(str_get_cols(data(1::),18,19))
>> day = stringtoint(str_get_cols(data(1::),20,21))
>>
>> temp = stringtofloat(str_get_field(data(1::), 4," "))
>> ; Convert temperature from Fahrenheit to Celsius.
>> temp = (temp-32)*5/9
>> dewp = stringtofloat(str_get_field(data(1::), 6," "))
>> ; Convert dew point temperature from Fahrenheit to Celsius.
>> dewp = (dewp-32)*5/9
>>
>> ; Calculate relative humidity (%).
>> rh = 100*(((112-0.1*temp+dewp)/(112+0.9*temp))^8)
>>
>> slp = stringtofloat(str_get_field(data(1::), 8," "))
>> stp = stringtofloat(str_get_field(data(1::), 10," "))
>>
>> visib = stringtofloat(str_get_field(data(1::), 12," "))
>>
>> wdsp = stringtofloat(str_get_field(data(1::), 14," "))
>> maxspd = stringtofloat(str_get_field(data(1::), 16," "))
>> gust = stringtofloat(str_get_field(data(1::), 17," "))
>>
>> maxtemp = stringtofloat(str_get_field(data(1::), 18," "))
>> ; Convert maximum temperature from Fahrenheit to Celsius.
>> maxtemp = (maxtemp-32)*5/9
>> mintemp = stringtofloat(str_get_field(data(1::), 19," "))
>> ; Convert minimum temperature from Fahrenheit to Celsius.
>> mintemp = (mintemp-32)*5/9
>>
>> prcp = stringtofloat(str_get_field(data(1::), 20," "))
>> sndp = stringtofloat(str_get_field(data(1::), 21," "))
>>
>> frshtt = stringtoint(str_get_field(data(1::), 22," "))
>> fog = stringtoint(str_get_cols(data(1::),132,132))
>> rain = stringtoint(str_get_cols(data(1::),133,133))
>> snow = stringtoint(str_get_cols(data(1::),134,134))
>> hail = stringtoint(str_get_cols(data(1::),135,135))
>> thunder = stringtoint(str_get_cols(data(1::),136,136))
>> tornado = stringtoint(str_get_cols(data(1::),137,137))
>>
>> print(rh)
>> print(maxtemp)
>> print(frshtt)
>> print(yearmoda)
>> print(tornado)
>>
>>
>> Wen.J.Qu
>> 2012-08-24
>> From: Mary Haley
>> Sent: 2012-08-23 14:04:55
>> To: Wen.J.Qu@gmail.com
>> Cc:
>> Subject: Re: [ncl-talk] How to check and fill the missing lines with the missing values?
>>
>> Shawn,
>>
>> I need more information. Are these lines of data in an ascii file?
>>
>> If so, does the file actually look like that, with the "…" characters, or are the lines just blank, or something else?
>>
>> If these lines of data are in an ascii file, how are you reading in the file?
>>
>> If the lines are just blank, then you can read the file in as strings and check for blank strings using "str_is_blank".
>>
>> http://www.ncl.ucar.edu/Document/Functions/Built-in/str_is_blank.shtml
>>
>> --Mary
>>
>> On Aug 23, 2012, at 11:09 AM, Wen.J.Qu wrote:
>>
>> > Hello,
>> >
>> > I am dealing with a daily time series of multiple variables. My problem is that some days (lines) are missing from the data, as shown below:
>> >
>> > 19980101 ...
>> > 19980102 ...
>> > 19980103 ...
>> > 19980109 ...
>> > 19980201 ...
>> > 19980218 ...
>> > ... ...
>> >
>> > How can I check and fill these missing lines (days) with the missing values?
>> >
>> > Thanks a lot.
>> >
>> >
>> > Shawn
>> >
>> > Wen.J.Qu
>> > 2012-08-23
>> > _______________________________________________
>> > ncl-talk mailing list
>> > List instructions, subscriber options, unsubscribe:
>> > http://mailman.ucar.edu/mailman/listinfo/ncl-talk
>
> <gsod_example>

Received on Fri Aug 24 18:30:42 2012