Re: Where function inserting missing values. Why?

From: Dennis Shea <shea_at_nyahnyahspammersnyahnyah>
Date: Tue Oct 20 2009 - 16:20:04 MDT

=====
I am attaching a code which has *nothing* to do with your specific problem.

However, it does have some things you might find of interest.
The multiple files are from a gridded satellite dataset over a small region.
In some cases all the values in the "Measurements" dimension are missing.

The person had small memory available so the it was not possible
to keep everything in memory. There are places where there
could be no data.

Note how the "where" function is used. I explicitly used a
variable [kdata] to key the computations.

Also, a 'trick' to allow array syntax when division be zero may occur
is to set count variable to _FillValue.

D

Bridget Thrasher wrote:
> It is only missing values now because of the way you wrote your test
> where statement. If you look back at my original post, you will see
> that it is not missing values until the final where statement. The
> arrays data_all and mon_avg_20c are both complete at all grid points
> and times. The arrays dvar and mon_avg_obs contain missing values in
> cells over oceans.
>
> I originally did have the first 2 where statements combined. However,
> your solution will not create the correct output for all land points
> because it will use the incorrect data sets to calculate the value for
> land cells that have a monthly average obs value of 0 instead of
> setting those cells to 0, which is what I'm trying to do in the final
> where statement that is causing the problem.
>
> -Bridget
>
> On Tue, Oct 20, 2009 at 2:48 PM, Wei Huang <huangwei@ucar.edu
> <mailto:huangwei@ucar.edu>> wrote:
>
> Bridget,
>
> Here, clearly that you missing values of factors, matches mon_avg_obs.
>
> So, you may want to change the whole paragraph:
>
> mon_avg_obs(m,:,:) =
> where(mon_avg_obs(m,:,:).eq.0,-2222,mon_avg_obs(m,:,:))
> factors(a*12+m,:,:) = data_all(a*12+m,:,:)/mon_avg_20c(m,:,:)
> print("1. "+num(ismissing(factors(a*12+m,:,:))))
>
> factors(a*12+m,:,:) = where(.not.ismissing(mon_avg_obs(m,:,:)),
> dvar(a*12+m,:,:)/mon_avg_obs(m,:,:),factors(a*12+m,:,:))
> print("2. "+num(ismissing(factors(a*12+m,:,:))))
>
> printVarSummary(mon_avg_obs)
> printMinMax(mon_avg_obs,True)
> printVarSummary(factors(a*12+m,:,:))
> printMinMax(factors(a*12+m,:,:),True)
> print("debug:
> "+num(.not.ismissing(mon_avg_obs(m,:,:)).and.(mon_avg_obs(m,:,:).eq.-2222)))
>
> factors(a*12+m,:,:) =
> where(.not.ismissing(mon_avg_obs(m,:,:)).and.(mon_avg_obs(m,:,:).eq.-2222),0.,factors(a*12+m,:,:))
> print("3. "+num(ismissing(factors(a*12+m,:,:))))
>
>
> to:
>
> factors(a*12+m,:,:) = where(ismissing(mon_avg_obs(m,:,:))
> .or. mon_avg_obs(m,:,:) .le. 0., \
>
> data_all(a*12+m,:,:)/mon_avg_20c(m,:,:), \
>
> dvar(a*12+m,:,:)/mon_avg_obs(m,:,:))
>
> Then check with:
>
> print("3. "+num(ismissing(factors(a*12+m,:,:))))
> printVarSummary(factors(a*12+m,:,:))
> printMinMax(factors(a*12+m,:,:),True)
>
>
> Wei Huang
> huangwei@ucar.edu <mailto:huangwei@ucar.edu>
> VETS/CISL
> National Center for Atmospheric Research
> P.O. Box 3000 (1850 Table Mesa Dr.)
> Boulder, CO 80307-3000 USA
> (303) 497-8924
>
>
>
>
>
>
>
> --
> Bridget Thrasher, PhD
> Postdoctoral Researcher
> Climate Central
> www.climatecentral.org <http://www.climatecentral.org>
>

-- 
======================================================
Dennis J. Shea                  tel: 303-497-1361    |
P.O. Box 3000                   fax: 303-497-1333    |
Climate Analysis Section                             |
Climate & Global Dynamics Div.                       |
National Center for Atmospheric Research             |
Boulder, CO  80307                                   |
USA                        email: shea 'at' ucar.edu |
======================================================

;Hi Dennis,
;so this is mainly what I have till now. I can't guarantee the code to work this way, 'cause I copy and pasted it out of another piece of code.
;

load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"

function parseFileName ( fNam:string )
; parse the file names and extract data information
; 1 2
; europe_0.05deg_19980703_rad_nopav.nc
local nNam, output, tmp_c
begin
   nNam = dimsizes( fNam )

   output = new( 4,integer,"No_FillValue")
   tmp_c = stringtochar(fNam)

   output(0) = stringtointeger((/tmp_c(15:18)/)) ; YYYY
   output(1) = stringtointeger((/tmp_c(19:20)/)) ; MM
   output(2) = stringtointeger((/tmp_c(21:22)/)) ; DD
   output(3) = stringtointeger((/tmp_c(15:22)/)) ; YYYYMMDD

   return (output)
end

; Main script

  varNam = (/"r055","r067","r087","r16","r37","BT11","BT12"/)
  nvar = dimsizes(varNam)
  
  diri1 = "./"
  filKey= "europe_0.05deg_199807??_rad_nopav.nc"
  fili1 = systemfunc("cd "+diri1 +" ; ls "+filKey)
  nfil1 = dimsizes(fili1)
  
  f = addfile (diri1+fili1(0), "r")
        ; print(f)
  lat = f->lat
  lon = f->lon
  nlat = dimsizes(lat)
  nlon = dimsizes(lon)
                                 ; double for summming
  array = new( (/nvar,nlat,nlon/), "double" , -999.)
  array2= new( (/nvar,nlat,nlon/), "double" , -999.)
  knt = new( (/nvar,nlat,nlon/), "double" , -999 )
  
  array = 0.0d
  array2= 0.0d
  knt = 0.0d

  wcStrt= systemfunc("date") ; used to time to loopa below
  
  filo = "TEST.nc"
  diro = "./"
  system("/bin/rm -f "+diro+filo)
  fo = addfile(diro+filo, "c")
  
  do nf=0,nfil1-1
    
    f = addfile(diri1+fili1(nf),"r")
    date = parseFileName( fili1(nf) )
    ymd = date(3) ; yyyymmdd
    
    do nv=0,nvar-1
        data = f->$varNam(nv)$ ; (lat,lon,Measurements)=>(0,1,2)
        kdata = dim_num_n(.not.ismissing(data), 2) ; # good "Measurements"

        array(nv,:,:) = array(nv,:,:) + where(kdata.gt.0, dim_sum_n(data ,2), 0d)
        array2(nv,:,:) = array2(nv,:,:) + where(kdata.gt.0, dim_sum_n(data^2,2), 0d)
        knt(nv,:,:) = knt(nv,:,:) + where(kdata.gt.0, kdata, 0 )
                       
        print("nf="+nf+" "+ymd+" "+varNam(nv)+" min="+min(data)+" max="+max(data) )
      ;;print(" min(array) ="+min(array(nv,:,:)) +" max(array) ="+max(array(nv,:,:)) )
      ;;print(" min(array2)="+min(array2(nv,:,:)) +" max(array2)="+max(array2(nv,:,:)))
      ;;print(" min(knt)="+min(knt(nv,:,:)) +" max(knt)="+max(knt(nv,:,:)))

    end do ; nv end
  end do ; nf end
  
                       ; trick to allow array division without 0.0
  knt = where(knt.eq.0.0d, knt@_FillValue, knt)

                       ; one pass computational technique
                       ; dataStd, dataAvg, dataCoef are temporary arrays
  dataStd = sqrt( (array2-(array*array/knt))/knt)
  dataAvg = array/knt
  dataCoef= dataStd/dataAvg
  
  dataAvg!0 = "variable"
  dataAvg!1 = "lat"
  dataAvg!2 = "lon"
  dataAvg&lat = lat
  dataAvg&lon = lon
  copy_VarCoords(dataAvg, dataStd)
  copy_VarCoords(dataAvg, dataCoef)

  printVarSummary(dataAvg)

  wallClockElapseTime(wcStrt, "Processing", 0)

;==========================================================
; Write netCDF
;==========================================================
  do nv=0,nvar-1
    ;if (isfilevaratt(f, varNam(nv), "units") ) then
    ; units = f->$varNam(nv)$@units
    ;end if
  
     xmean = dataAvg(nv,:,:) ; clarity only!!!
     xName = varNam(nv)+"_mean"
     xmean@long_name = xName
    ;xmean@units = units
    ;fo->$xName$ = xmean ; double
     fo->$xName$ = dble2flt(xmean) ; reduce to float [not necessary]
  
     xstd = dataStd(nv,:,:) ; clarity only!!!
     xName = varNam(nv)+"_std"
     xstd@long_name = xName
    ;xstd@units = units
    ;fo->$xName$ = xstd ; double
     fo->$xName$ = dble2flt(xstd) ; reduce to float [not necessary]
  
     xcoef = dataCoef(nv,:,:) ; clarity only!!!
     xName = varNam(nv)+"_var_coef"
     xcoef@long_name = xName
    ;fo->$xName$ = xcoef ; double
     fo->$xName$ = dble2flt(xcoef) ; reduce to float [not necessary]
  
     if (isvar("units")) then
         delete( units )
     end if
  end do ; end nv/nvar
Received on Tue Oct 20 16:20:00 2009

This archive was generated by hypermail 2.1.8 : Thu Oct 22 2009 - 13:09:30 MDT