Re: esccr oddities

From: Dennis Shea (shea AT ucar.edu)
Date: Mon Feb 21 2005 - 16:51:48 MST


    [snip]
    > I'm using time series that have 108 points in them and am only
    > looking at lags out to two or three so I don't anticipate any strange
    > statistical problems.
    [snip]
    > Could a few outliers in the data which are
    > near but not equal to a very large fill value cause this behavior? (i.e.
    [snip]

    Under unusual circumstances, it is possible to get a lag
    correlation less than -1 or greater than +1.
    The reason is the normalization used.

    When a cross-correlation at lag 0 is computed, the standard deviations
    [say, xsd and ysd] are used to normalize the xy-covariance. This ensures
    that the cross-correlation lies in the range [-1, +1].

    When computing lag correlations, the number of points changes for each
    lag [npts at lag 0, npts-1 at lag 1, etc.]. Strictly, this would require
    recalculating the standard deviations for each lag. Computationally, that
    is no problem; it just takes a bit more time. However, for the
    statistical reasons outlined in the Chatfield reference cited in the
    "Description" section, the overall standard deviations are used in the
    esccr function. Because the lagged covariance is then computed from fewer
    points than the standard deviations, the ratio is no longer guaranteed
    to stay within [-1, +1]. What follows is an excerpt from the Fortran
    code used by NCL's esccr.

          xsdysd = 1./(xsd*ysd)

          do lag=0,mxlag
             xyn = 0.
             xy1 = 0.
             do n=1,npts-lag
                if (x(n).ne.xmsg .and. y(n+lag).ne.ymsg) then
                    xy1 = xy1 + ((y(n+lag)-ymean)*(x(n)-xmean))
                    xyn = xyn + 1.
                endif
             enddo
             if (xyn.ge.2.) then
                 ccv(lag) = xy1/(xyn-1.)
                 ccr(lag) = ccv(lag)*xsdysd
             else
                 ccv(lag) = xmsg
                 ccr(lag) = xmsg
             endif
          enddo
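
    To make the effect of this normalization concrete, here is a minimal
    Python sketch that mirrors the loop above. It is not NCL's code: the
    names esccr_sketch and _mean_sd are hypothetical, the -999.0 missing
    value is a placeholder, and it assumes that the means and standard
    deviations are computed over all non-missing points with an N-1
    divisor, which the excerpt does not show.

```python
import math


def _mean_sd(v, msg):
    """Mean and sample standard deviation (N-1 divisor) of the non-missing values.

    Assumption: the excerpt does not show how xmean/xsd are computed.
    """
    good = [a for a in v if a != msg]
    mean = sum(good) / len(good)
    var = sum((a - mean) ** 2 for a in good) / (len(good) - 1)
    return mean, math.sqrt(var)


def esccr_sketch(x, y, mxlag, xmsg=-999.0, ymsg=-999.0):
    """Lagged cross-correlations normalized by the overall standard deviations."""
    xmean, xsd = _mean_sd(x, xmsg)
    ymean, ysd = _mean_sd(y, ymsg)
    xsdysd = 1.0 / (xsd * ysd)
    npts = len(x)
    ccr = []
    for lag in range(mxlag + 1):
        xy1 = 0.0  # running sum of cross products at this lag
        xyn = 0    # number of valid (x(n), y(n+lag)) pairs
        for n in range(npts - lag):
            if x[n] != xmsg and y[n + lag] != ymsg:
                xy1 += (y[n + lag] - ymean) * (x[n] - xmean)
                xyn += 1
        if xyn >= 2:
            ccr.append(xy1 / (xyn - 1) * xsdysd)
        else:
            ccr.append(xmsg)
    return ccr


# y is x shifted by one point; the sample dropped at lag 1 sits exactly at the mean.
x = [1.0, -2.0, 1.0, 0.0]
y = [0.0, 1.0, -2.0, 1.0]
ccr = esccr_sketch(x, y, 1)
print(ccr)  # the lag-1 value is (N-1)/(N-2) = 1.5, i.e. greater than +1
```

    In this constructed case the three lag-1 pairs carry all of the
    variance, so the lagged covariance is divided by 2 while the variances
    were divided by 3. With complete data the excess is bounded by
    (npts-1)/(npts-lag-1), so it shrinks as the series gets longer.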

    The situation you refer to, specifically "a few outliers",
    is *the* type of situation that will cause the anomalous
    behavior. Although I am not a statistician, I think that using
    standard deviations implicitly assumes that the underlying
    distributions are normal. I speculate that your
    outliers may invalidate that assumption.

    My advice is to eliminate the outliers prior to computing the
    cross correlations.
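
    One possible way to do this, sketched in Python under stated
    assumptions: flag values that lie far from the median and replace them
    with the missing value so that the correlation loop skips them. The
    function name mask_outliers, the -999.0 missing value, and the
    threshold k=5 are all hypothetical choices for illustration; the
    median absolute deviation is used here only because, unlike the
    standard deviation, it is not inflated by the outliers themselves.

```python
import statistics


def mask_outliers(v, msg=-999.0, k=5.0):
    """Return a copy of v with far-from-median values replaced by msg.

    A value is flagged when |value - median| > k * MAD, where MAD is the
    median absolute deviation of the non-missing values. Both k and the
    use of MAD are illustrative assumptions, not a recommendation from NCL.
    """
    good = [a for a in v if a != msg]
    med = statistics.median(good)
    mad = statistics.median([abs(a - med) for a in good])
    if mad == 0.0:
        # No spread to measure against; leave the data untouched.
        return list(v)
    return [msg if (a != msg and abs(a - med) > k * mad) else a for a in v]


series = [1.0, 2.0, 3.0, 2.0, 1.0, 1.0e20]
cleaned = mask_outliers(series)
print(cleaned)  # the 1.0e20 entry is replaced by the missing value
```

    The right threshold depends on the data, so it is worth inspecting
    which points get flagged before computing the correlations.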

    Note: it has nothing to do with how close or distant the values
          are to the _FillValue.

    ===========================
    > But, occasionally I notice trouble with fill
    > values in files generated by different applications or originally stored as
    > doubles but coerced to floats.

    When you encounter a _FillValue problem, please let us know.

    To my knowledge, doubles cannot be coerced to floats implicitly.
    A conversion function [e.g., doubletofloat] must be invoked.

    ================================
    D