Re: NCL default missing values

From: David Brown <dbrown_at_nyahnyahspammersnyahnyah>
Date: Fri, 18 Apr 2008 16:43:13 -0600

Well, the 64-bit integer type will allow for much larger values than
the 32-bit integers we currently use (> 9e18 for the signed type).
Note that on a 64-bit machine you can already get such values using
the long type (although of course this reduces portability between
the different classes of machines). As for overflow checking, we have
considered it, but we are concerned about the performance implications
for large datasets. Perhaps we can develop it as an option.
  -dave
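
To make the ranges above concrete, here is a minimal sketch in plain Python (not NCL). It assumes the clamping behaviour shown in the session quoted below, where 4000000000 silently became 2147483647, and adds an opt-in check along the lines of the "option" mentioned above. The names to_int32 and check_overflow are hypothetical and are not part of NCL.

```python
from datetime import datetime, timedelta, timezone

# Limits of the signed 32-bit and 64-bit integer types.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def to_int32(value, check_overflow=False):
    """Hypothetical helper: fit value into a signed 32-bit range.

    Out-of-range values are clamped to the nearest limit. With
    check_overflow=True an OverflowError is raised instead.
    """
    if INT32_MIN <= value <= INT32_MAX:
        return value
    if check_overflow:
        raise OverflowError(f"{value} does not fit in a signed 32-bit integer")
    return INT32_MAX if value > INT32_MAX else INT32_MIN


# Last instant representable as signed 32-bit seconds since 1970-01-01 UTC.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
last_32bit_time = epoch + timedelta(seconds=INT32_MAX)

print(INT64_MAX)                    # 9223372036854775807 (> 9e18)
print(to_int32(2000000000))         # 2000000000
print(to_int32(4000000000))         # 2147483647 (clamped, no report)
print(last_32bit_time.isoformat())  # 2038-01-19T03:14:07+00:00
```

The check costs one comparison per value, which is why making it optional is a plausible compromise for large datasets.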

On Apr 18, 2008, at 4:15 PM, Jonathan Vigh wrote:

> Hi Dave (and NCLers),
> Somewhat related to this, are there any plans for NCL to check
> for underflow or overflow? And is there currently any way to
> represent integers larger than 2147483647 (~2.1 x 10^9)? This of
> course is also not just an NCL issue - it will cause the Unix
> version of the Y2K bug around the year 2037 since Unix keeps times
> in seconds since 00:00:00 on 01-01-1970. I ran into this issue
> because I am trying to work with some flight level hurricane
> aircraft data with a variety of not-so-nicely formatted date/time
> strings and it seemed easiest to convert all the times to an offset
> in seconds from a common basetime. I figured I'd go with the Unix
> approach which seems pretty standard. But since I'm still fairly
> age-challenged, it's conceivable I could still be around and
> working in 2037, so out of curiosity, I was wondering what a good
> fix would be (I figure it never hurts to be ahead of the curve :).
> I'm on a 32-bit machine right now, so I don't think the long type
> helps me.
>
> I just did the following test:
>
> ncl 8> int=2000000000
> ncl 9> print(int)
>
> Variable: int
> Type: integer
> Total Size: 4 bytes
> 1 values
> Number of Dimensions: 1
> Dimensions and sizes: [1]
> Coordinates:
> Number Of Attributes: 1
> _FillValue : -999
> (0) 2000000000 <- everything okay, no overflow
>
> ncl 3> int=4000000000
> ncl 4> print(int)
>
> Variable: int
> Type: integer
> Total Size: 4 bytes
> 1 values
> Number of Dimensions: 1
> Dimensions and sizes: [1]
> Coordinates:
> Number Of Attributes: 1
> _FillValue : -999
> (0) 2147483647 <- as expected, overflow (unreported)
>
> So your question about missing values was rather timely. If there
> are no plans for NCL to check for underflow and overflow in the
> near future (maybe by 2037? . . .), it might be a good idea to use
> the maximum representable values for the default fill values. That
> would help users like me to be more familiar with the limits and
> recognize when an underflow/overflow has occurred. If we forget
> what the limit is, finding out would be as easy as
> print(var@_FillValue).
>
> Just a thought,
> Jonathan
>
> David Brown wrote:
>
>>
>> Hi NCL users,
>>
>> We are tentatively planning to modify NCL's default fill values in
>> order to put them further outside the range of "normal" data sets
>> and computations. This will apply to variables defined in NCL as
>> well as to variables defined in GRIB files as presented by NCL. The
>> motivating factor for this change is that we recently encountered
>> GRIB data where the data range includes the value used for the
>> float type fill value (-999), leading to a situation where a few
>> valid values are treated as missing. Of course this has always been
>> a possibility, but it has not been encountered in practice, or at
>> least it has not jumped out as a problem in our tests or been
>> reported as a bug by any users until now. HDF and NetCDF file
>> variables will not be affected because the NCL representation of
>> the variable only contains a _FillValue attribute if it is defined
>> in the file.
>>
>> We only want to do this once, so we would like to get input on the
>> best values to use, as well as feedback on possible problems for
>> existing code.
>>
>> Our plan is to change the default fill values for float and
>> double values to the value 1.0e20.
>>
>> The byte, character, and string missing values (255,
>> inttochar(0), and "missing") will not change.
>>
>> Note that the long type changes size between 32 and 64 bit
>> machines. On 64 bit machines the long type can hold much bigger
>> values, but I think we want to ensure that the default long fill
>> value will always be the same.
>> Therefore the default long fill value and the default integer
>> fill value will probably be equal.
>>
>> For int, long and short we could follow the example of NetCDF and use
>>
>> long and int:
>> -2147483647 (maximum acceptable INT_MIN according to IEEE Std
>> 1003.1, 2004 edition)
>> short:
>> -32767 (maximum acceptable SHRT_MIN)
>>
>> but it could certainly be argued that something with all 9s,
>> such as +/-999999999 (for long and integer) and +/-9999 (for short),
>> would be easier to type, remember, and recognize when visually
>> scanning the contents of a variable.
>>
>> Other suggestions and arguments pro or con any alternatives are
>> welcome.
>>
>> FYI, we are planning soon to add more integer types including
>> especially a 64-bit integer type, but also specifically unsigned
>> versions of all the types.
>> Our decisions concerning fill values for the existing types will
>> be extrapolated to come up with fill values for these new types.
>>
>> -dave
>>
>> _______________________________________________
>> ncl-talk mailing list
>> ncl-talk_at_ucar.edu
>> http://mailman.ucar.edu/mailman/listinfo/ncl-talk
>
>
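
The GRIB problem described in the quoted announcement can be illustrated with a small sketch in plain Python (not NCL). The data values are made up for the example and mask_missing is a hypothetical helper. It only shows why a fill value inside the data range hides valid points, and that the integer candidates under discussion all fit in a signed 32-bit type.

```python
def mask_missing(values, fill_value):
    """Hypothetical helper: replace entries equal to fill_value with None."""
    return [None if v == fill_value else v for v in values]


# Made-up field in which -999.0 is a genuine measurement, not a gap.
data = [-1001.5, -999.0, -998.2, 12.0]

print(mask_missing(data, -999.0))  # [-1001.5, None, -998.2, 12.0]
print(mask_missing(data, 1.0e20))  # [-1001.5, -999.0, -998.2, 12.0]

# The integer candidates from the thread, checked against 32-bit limits.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
candidates = [-2147483647, -999999999, 999999999]
all_fit = all(INT32_MIN <= c <= INT32_MAX for c in candidates)
print(all_fit)  # True
```

With -999 as the fill value one valid point is lost, while 1.0e20 lies far outside the data and masks nothing.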

Received on Fri Apr 18 2008 - 16:43:13 MDT