escorc

Computes the (Pearson) sample linear cross-correlations at lag 0 only.

Prototype

	function escorc (
		x  : numeric,  
		y  : numeric   
	)

	return_val  :  numeric

Arguments

x

An array of any numeric type or size. The rightmost dimension is usually time.

y

An array of any numeric type or size. The rightmost dimension is usually time. The size of the rightmost dimension must be the same as that of x.

Description

Computes sample linear cross-correlations (Pearson) at lag 0 only. If lagged correlations are required, use esccr. Missing values are allowed. This function can also be used to compute a "one-point correlation map", in which the series at one point is cross-correlated with the series at all other points (see Example 4 below).

Algorithm (N is the sample size):

     cor = [1./(N-1)] * SUM [(X(t)-Xave)*(Y(t)-Yave)] / (Xstd*Ystd)
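
As a rough illustration of the formula above, the following pure-Python sketch computes the lag-0 correlation of two equal-length series. It is not the NCL implementation: the function name `pearson_lag0` is hypothetical, and the convention that a missing value is marked with `None` and dropped pairwise is an assumption made for this example.

```python
import math

def pearson_lag0(x, y):
    """Lag-0 Pearson correlation of two equal-length sequences.

    Assumption for this sketch: a missing value is represented by None,
    and a pair is skipped when either member is missing.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two valid pairs")
    xave = sum(a for a, _ in pairs) / n
    yave = sum(b for _, b in pairs) / n
    # Sample standard deviations (divisor N-1), as in the formula above.
    xstd = math.sqrt(sum((a - xave) ** 2 for a, _ in pairs) / (n - 1))
    ystd = math.sqrt(sum((b - yave) ** 2 for _, b in pairs) / (n - 1))
    # Sample covariance, also with divisor N-1.
    cov = sum((a - xave) * (b - yave) for a, b in pairs) / (n - 1)
    return cov / (xstd * ystd)

# Data taken from Example 1 of this page.
x = [0.20, 1.88, -0.76, 0.42, 0.32, -0.56, 1.55, -1.21, -0.66, -0.96, -0.21]
y = [0.18, 0.54, -0.49, 0.92, 0.22, 0.75, 0.66, -2.65, -0.51, 0.47, -0.09]
r = pearson_lag0(x, y)
```

With the Example 1 data, the sketch gives r of about 0.56, in line with the value reported there.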

The dimension size(s) of the returned array c are a function of the dimension sizes of the x and y arrays. The return type is double if x or y is double, and float otherwise. The following illustrates the dimensioning:

        x(N), y(N)          c (scalar)
        x(N), y(K,M,N)      c(K,M)
      x(I,N), y(K,M,N)      c(I,K,M)
    x(J,I,N), y(L,K,M,N)    c(J,I,L,K,M)

Special case when all dimensions of x and y are identical:

    x(J,I,N), y(J,I,N)      c(J,I)
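
The dimensioning rules above can be restated as a small shape calculation. This is only a sketch in Python: `escorc_result_shape` is a hypothetical helper written for this illustration, not part of NCL, and it encodes nothing beyond the table and the special case listed above.

```python
def escorc_result_shape(x_shape, y_shape):
    """Result shape implied by the dimensioning table (hypothetical helper)."""
    x_shape, y_shape = tuple(x_shape), tuple(y_shape)
    if not x_shape or not y_shape:
        raise ValueError("x and y must have at least one dimension")
    if x_shape[-1] != y_shape[-1]:
        raise ValueError("rightmost dimensions of x and y must match")
    if x_shape == y_shape:
        # Special case: identical shapes are correlated element by element.
        return x_shape[:-1]
    # General case: leading dimensions of x, followed by those of y.
    return x_shape[:-1] + y_shape[:-1]
```

An empty tuple denotes a scalar result, as for x(N), y(N).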

The Pearson linear correlation coefficient (r) for n pairs of independent observations can be tested against the null hypothesis (i.e., no correlation) using the statistic

    t = r*sqrt[ (n-2)/(1-r^2) ]

This statistic has a Student-t distribution with n-2 degrees of freedom. See Example 1.
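
To make the test concrete, here is a minimal Python sketch that evaluates the statistic and a two-sided p-value. In NCL the probability would normally come from student_t, as in Example 1. The Simpson's-rule integration of the Student-t density used here is a stand-in chosen to keep the sketch self-contained, and the function names are hypothetical.

```python
import math

def t_statistic(r, n):
    """t = r*sqrt((n-2)/(1-r^2)) for n pairs of observations."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value for Student's t with df degrees of freedom.

    Integrates the t density from 0 to |t| with Simpson's rule
    (steps must be even); this is an illustrative substitute for a
    library routine, not NCL's algorithm.
    """
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    b = abs(t)
    h = b / steps
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3          # P(0 < T < |t|)
    return 1.0 - 2.0 * area       # probability mass in both tails

t = t_statistic(0.559956, 11)     # r and n from Example 1
p = two_sided_p(t, 11 - 2)
```

For the Example 1 values (r=0.559956, n=11), this yields t of roughly 2.03 and p of roughly 0.073, so the null hypothesis is not rejected at the 5% level.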

The confidence interval for r may also be estimated. Because the sampling distribution of Pearson's r is not normal, r is first converted to Fisher's z-statistic, the confidence interval is computed in z space, and an inverse transform is used to return to r space (-1 to +1). This approach is also demonstrated in Example 1.

Specifically, the confidence interval for the Pearson correlation may be obtained via use of:

the Fisher z-transformation: z = 0.5*log((1+r)/(1-r))
the standard error of the z-transformation: z_se = sqrt(1.0/(N-3))
the inverse of the Fisher transformation: ri = (exp(2*z)-1)/(exp(2*z)+1)
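
The three steps above can be chained as in the following Python sketch. The function name `fisher_ci` and the default critical value of 1.96 (a 95% interval) are choices made for this illustration only.

```python
import math

def fisher_ci(r, n, zcrit=1.96):
    """Confidence interval for a Pearson r via Fisher's z (sketch).

    r     : sample correlation, strictly between -1 and 1
    n     : number of pairs (must exceed 3)
    zcrit : normal critical value, e.g. 1.96 for 95%, 2.58 for 99%
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = 0.5 * math.log((1 + r) / (1 - r))      # Fisher z-transformation
    se = math.sqrt(1.0 / (n - 3))              # standard error of z
    zlow, zhi = z - zcrit * se, z + zcrit * se

    def inverse(v):                            # back to r space (-1 to +1)
        return (math.exp(2 * v) - 1) / (math.exp(2 * v) + 1)

    return inverse(zlow), inverse(zhi)

rlow, rhi = fisher_ci(0.559956, 11)            # r and n from Example 1
```

With the Example 1 inputs, the interval is approximately (-0.060, 0.868), which matches the rlow and rhi values printed there.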

----------------------------------------------------------------------
Use escorc_n if the dimension to do the calculation on is not the rightmost dimension and reordering is not desired. This function can be significantly faster than escorc.
----------------------------------------------------------------------

Examples

Example 1

The following will calculate the cross-correlation for two one-dimensional arrays x(N) and y(N).

        r = escorc(x,y)   ; r is a scalar

The following is an example that illustrates calculating the cross-correlation(s) and associated confidence limits.


     ;; http://www.unt.edu/UNT/departments/CC/Benchmarks/sprsum97/resamp.htm:

        x    = (/ 0.20, 1.88, -0.76, 0.42, 0.32, -0.56, 1.55, -1.21, -0.66, -0.96, -0.21 /)
        y    = (/ 0.18, 0.54, -0.49, 0.92, 0.22,  0.75, 0.66, -2.65, -0.51,  0.47, -0.09 /)
        r    = escorc(x,y)                ; Pearson correlation
                                          ; r=0.559956
    ;---Compute correlation confidence interval

        n    = dimsizes(x)                ; n=11
        df   = n-2
                                          ; Fisher z-transformation
        z    = 0.5*log((1+r)/(1-r))       ; z-statistic
        se   = 1.0/sqrt(n-3)              ; standard error of z-statistic

                                          ; low  and hi z values
        zlow = z - 1.96*se                ; 95%  (2.58 for 99%)
        zhi  = z + 1.96*se                 
                                          ; inverse z-transform; return to r space (-1 to +1)
        rlow = (exp(2*zlow)-1)/(exp(2*zlow)+1)
        rhi  = (exp(2*zhi )-1)/(exp(2*zhi )+1)

        print("r="+r)                     ;  r=0.559956                
        print("z="+z+"  se="+se)          ;  z=0.63277  se=0.353553 
        print("zlow="+zlow+"  zhi="+zhi)  ;  zlow=-0.0601951  zhi=1.32573
        print("rlow="+rlow+"  rhi="+rhi)  ;  rlow=-0.0601225  rhi=0.868203

Since the 95% confidence interval for r includes 0.0, the calculated r is not significant at the 95% level.

An alternative for testing significance is:

        t    = r*sqrt((n-2)/(1-r^2))      
        p    = student_t(t, df)
        psig = 0.05                       ; test significance level                     
        print("t="+t+"  p="+p)            ; t=2.02755  p=0.0732238
        if (p.le.psig) then
            print("r="+r+" is significant at the 95% level")
        else
            print("r="+r+" is NOT significant at the 95% level")
        end if

Example 2

The following will calculate the cross-correlation for one three-dimensional array y(lat,lon,time) and one one-dimensional array x(time). Both significance and confidence intervals may be estimated using the approach outlined in Example 1.

     ccr = escorc(x,y)      ; ccr(lat,lon)

Example 3

Consider x(neval,time) and y(lat,lon,time)

     ccr = escorc(x,y)      ; ccr(neval,lat,lon)

Example 4

Consider y(lat,lon,time) and the series at a single grid point, y(nl,ml,:), where the indices nl and ml are specified by the user. The result is a "one-point correlation pattern": the series at the chosen point is correlated with the series at every point of the grid. Both significance and confidence intervals may be estimated using the approach outlined in Example 1.

     nl  = 32 ; for example
     ml  = 64
     ccr = escorc(y(nl,ml,:),y)   ; ccr(lat,lon)
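
For readers who want to see the idea spelled out, the following Python sketch builds a one-point correlation map on a tiny made-up grid stored as nested lists indexed [lat][lon][time]. The data and the helper names `corr` and `one_point_map` are invented for this illustration and are not part of NCL.

```python
import math

def corr(a, b):
    """Lag-0 Pearson correlation; the 1/(N-1) factors cancel in the ratio."""
    n = len(a)
    abar, bbar = sum(a) / n, sum(b) / n
    sab = sum((u - abar) * (v - bbar) for u, v in zip(a, b))
    saa = sum((u - abar) ** 2 for u in a)
    sbb = sum((v - bbar) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def one_point_map(y, nl, ml):
    """Correlate the series at grid point (nl, ml) with every grid point."""
    ref = y[nl][ml]
    return [[corr(ref, series) for series in row] for row in y]

# Made-up 2 x 2 grid with 4 time steps per point.
y = [
    [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]],
    [[1.0, 3.0, 2.0, 4.0], [2.0, 4.0, 6.0, 8.0]],
]
ccr = one_point_map(y, 0, 0)   # ccr[lat][lon]
```

The map always holds 1.0 at the reference point itself, which is a convenient sanity check on the indexing.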

Example 5

Consider w(time,lat,lon) and q(time,lat,lon), where 'time', 'lat' and 'lon' are named dimensions. To compute the temporal correlation at each latitude and longitude, dimension reordering must be used to make 'time' the rightmost dimension.

     ccr = escorc(w(lat|:,lon|:,time|:),q(lat|:,lon|:,time|:))   ; ccr(lat,lon)

Note: with NCL V6.2.1 or later, you can use escorc_n to avoid having to reorder the array first:

     ccr = escorc_n(w,q,0,0)   ; ccr(lat,lon)
