escorc_n

Computes the (Pearson) sample linear cross-correlations at lag 0 only, across the specified dimensions.

Available in version 6.2.1 and later.

Prototype

	function escorc_n (
		x          : numeric,  
		y          : numeric,  
		dims_x [*] : integer,  
		dims_y [*] : integer   
	)

	return_val  :  numeric

Arguments

x

An array of any numeric type or size. The rightmost dimension is usually time.

y

An array of any numeric type or size. The rightmost dimension is usually time. The size of the dimension being correlated (dims_y) must equal the size of the corresponding dimension of x (dims_x).

dims_x

A scalar integer indicating which dimension of x to do the calculation on. Dimension numbering starts at 0.

dims_y

A scalar integer indicating which dimension of y to do the calculation on. Dimension numbering starts at 0.

Description

Computes sample linear cross-correlations (Pearson) at lag 0 only. If lagged correlations are required, use esccr. Missing values are allowed. This function can also be used to produce a "one-point correlation map", where one point is cross-correlated with all other points (see Example 5 below).

Algorithm (N is the sample size):

     cor = [1./(N-1)] * SUM [(X(t)-Xave)*(Y(t)-Yave)] / (Xstd*Ystd)
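
As a cross-check outside NCL, the algorithm above can be sketched in plain Python. This is only an illustrative sketch, not the NCL implementation: the helper name pearson_lag0 is hypothetical, Xstd and Ystd are assumed to be sample standard deviations (divisor N-1), and missing values are not handled. The sample data are the x and y arrays from Example 1.

```python
import math

def pearson_lag0(x, y):
    """Lag-0 Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    xave = sum(x) / n
    yave = sum(y) / n
    # Assumption: Xstd and Ystd are sample standard deviations (divisor N-1).
    xstd = math.sqrt(sum((a - xave) ** 2 for a in x) / (n - 1))
    ystd = math.sqrt(sum((b - yave) ** 2 for b in y) / (n - 1))
    # cor = [1/(N-1)] * SUM[(X(t)-Xave)*(Y(t)-Yave)] / (Xstd*Ystd)
    cov = sum((a - xave) * (b - yave) for a, b in zip(x, y)) / (n - 1)
    return cov / (xstd * ystd)

# Sample data copied from Example 1 on this page.
x = [0.20, 1.88, -0.76, 0.42, 0.32, -0.56, 1.55, -1.21, -0.66, -0.96, -0.21]
y = [0.18, 0.54, -0.49, 0.92, 0.22, 0.75, 0.66, -2.65, -0.51, 0.47, -0.09]
r = pearson_lag0(x, y)
print("r =", r)
```

With these data the result should agree with the r of about 0.56 quoted in Example 1. A constant input series has zero standard deviation, and this sketch would then raise ZeroDivisionError.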

The dimension size(s) of the returned array c are a function of the dimension sizes of the x and y arrays. Type double is returned if x or y is double, and float otherwise. The following illustrates dimensioning:

        x(N), y(N)          c
        x(N), y(K,M,N)      c(K,M)
      x(I,N), y(K,M,N)      c(I,K,M)
    x(J,I,N), y(L,K,M,N)    c(J,I,L,K,M)

Special case when all dimensions of x and y are identical:

    x(J,I,N), y(J,I,N)      c(J,I)
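
The dimensioning rules illustrated above can be summarized in a small Python sketch. The function result_shape is hypothetical and only encodes the cases listed on this page. It assumes the special case applies when x and y have identical shapes and the same dimension is selected in both; it does not describe how NCL is implemented internally.

```python
def result_shape(x_shape, y_shape, dim_x, dim_y):
    """Shape of c for given x and y shapes (hypothetical helper)."""
    x_shape, y_shape = tuple(x_shape), tuple(y_shape)
    if x_shape[dim_x] != y_shape[dim_y]:
        raise ValueError("the correlated dimensions must have the same size")
    # Drop the dimension being correlated from each shape.
    x_rest = tuple(s for i, s in enumerate(x_shape) if i != dim_x)
    y_rest = tuple(s for i, s in enumerate(y_shape) if i != dim_y)
    # Assumed special case: identical shapes, same dimension selected.
    if x_shape == y_shape and dim_x == dim_y:
        return x_rest
    # General case: remaining x dimensions followed by remaining y dimensions.
    return x_rest + y_rest

# x(I,N), y(K,M,N) -> c(I,K,M), with I=4, K=2, M=3, N=5
shape_general = result_shape((4, 5), (2, 3, 5), 1, 2)
# x(J,I,N), y(J,I,N) -> c(J,I), with J=7, I=4, N=5
shape_special = result_shape((7, 4, 5), (7, 4, 5), 2, 2)
print(shape_general, shape_special)
```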

The Pearson linear correlation coefficient (r) for n pairs of independent observations can be tested against the null hypothesis (i.e., no correlation) using the statistic

    t = r*sqrt[ (n-2)/(1-r^2) ]

This statistic has a Student-t distribution with n-2 degrees of freedom. See Example 1.

The confidence interval for r may also be estimated. However, since the sampling distribution of Pearson's r is not normally distributed, the Pearson r is converted to Fisher's z-statistic and the confidence interval is computed using Fisher's z. An inverse transform is used to return to r space (-1 to +1). This approach is also demonstrated in Example 1.

Specifically, the confidence interval for the Pearson correlation may be obtained via use of:

the Fisher z-transformation: z = 0.5*log((1+r)/(1-r))
the standard error of the z-transformation: z_se = sqrt(1.0/(N-3))
the inverse of the Fisher transform: ri = (exp(2*z)-1)/(exp(2*z)+1)
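
The three steps above, together with the t statistic given earlier, can be sketched in plain Python as a check on the arithmetic. The helper names fisher_ci and t_statistic are hypothetical. The critical value 1.96 (95%) and the inputs r=0.559956, n=11 are taken from Example 1. Computing a p-value from t needs a Student-t distribution function, which is not included here.

```python
import math

def fisher_ci(r, n, zcrit=1.96):
    """Confidence interval for Pearson r via Fisher's z (hypothetical helper)."""
    if n <= 3 or not -1.0 < r < 1.0:
        raise ValueError("need n > 3 and -1 < r < 1")
    z = 0.5 * math.log((1 + r) / (1 - r))      # Fisher z-transformation
    z_se = math.sqrt(1.0 / (n - 3))            # standard error of z
    zlow = z - zcrit * z_se
    zhi = z + zcrit * z_se

    def inverse(v):
        # Inverse Fisher transform: back to r space (-1 to +1)
        return (math.exp(2 * v) - 1) / (math.exp(2 * v) + 1)

    return inverse(zlow), inverse(zhi)

def t_statistic(r, n):
    """t = r*sqrt[(n-2)/(1-r^2)], with n-2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Values taken from Example 1 on this page.
rlow, rhi = fisher_ci(0.559956, 11)
t = t_statistic(0.559956, 11)
print("rlow =", rlow, " rhi =", rhi, " t =", t)
```

If the interval (rlow, rhi) contains 0.0, the correlation is not significant at the chosen level, which is the reasoning used in Example 1.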

Examples

Example 1

The following will calculate the cross-correlation for two one-dimensional arrays x(N) and y(N).

        r = escorc_n(x,y,0,0)   ; r is a scalar

The following is an example that illustrates calculating the cross-correlation(s) and associated confidence limits.


     ;; http://www.unt.edu/UNT/departments/CC/Benchmarks/sprsum97/resamp.htm:

        x    = (/ 0.20, 1.88, -0.76, 0.42, 0.32, -0.56, 1.55, -1.21, -0.66, -0.96, -0.21 /)
        y    = (/ 0.18, 0.54, -0.49, 0.92, 0.22,  0.75, 0.66, -2.65, -0.51,  0.47, -0.09 /)
        r    = escorc_n(x,y,0,0)          ; Pearson correlation
                                          ; r=0.559956
    ;---Compute correlation confidence interval

        n    = dimsizes(x)                ; n=11
        df   = n-2
                                          ; Fisher z-transformation
        z    = 0.5*log((1+r)/(1-r))       ; z-statistic
        se   = 1.0/sqrt(n-3)              ; standard error of z-statistic

                                          ; low  and hi z values
        zlow = z - 1.96*se                ; 95%  (2.58 for 99%)
        zhi  = z + 1.96*se
                                          ; inverse z-transform; return to r space (-1 to +1)
        rlow = (exp(2*zlow)-1)/(exp(2*zlow)+1)
        rhi  = (exp(2*zhi )-1)/(exp(2*zhi )+1)

        print("r="+r)                     ;  r=0.559956
        print("z="+z+"  se="+se)          ;  z=0.63277  se=0.353553
        print("zlow="+zlow+"  zhi="+zhi)  ;  zlow=-0.0601951  zhi=1.32573
        print("rlow="+rlow+"  rhi="+rhi)  ;  rlow=-0.0601225  rhi=0.868203

Since the r confidence interval includes 0.0, the calculated r is not significant.

An alternative for testing significance is:

        t    = r*sqrt((n-2)/(1-r^2))
        p    = student_t(t, df)
        psig = 0.05                       ; test significance level
        print("t="+t+"  p="+p)            ; t=2.02755  p=0.0732238
        if (p.le.psig) then
            print("r="+r+" is significant at the 95% level")
        else
            print("r="+r+" is NOT significant at the 95% level")
        end if

Example 2

The following will calculate the cross-correlation for one three-dimensional array y(lat,lon,time) and one one-dimensional array x(time).

     ccr = escorc_n(x,y,0,2)      ; ccr(nlat,mlon)

Example 3

The following will calculate the cross-correlations for x3(time,lat,lon) and y3(time,lat,lon) and x4(time,lev,lat,lon) and y4(time,lev,lat,lon).

     ccr3 = escorc_n(x3,y3,0,0)      ; ccr3(nlat,mlon)
     ccr4 = escorc_n(x4,y4,0,0)      ; ccr4(klev,nlat,mlon)

Example 4

Consider x(neval,time) and y(lat,lon,time)

     ccr = escorc_n(x,y,1,2)      ; ccr(neval,nlat,mlon)

Example 5

Consider ya(time,lat,lon) and yb(lat,lon,time), where nl and ml are scalar integers (grid indices) specified by the user. The result is a "one-point correlation pattern": a specific point is correlated with all other points. NOTE: NCL reduces ya(:,nl,ml) and yb(nl,ml,:) to one-dimensional arrays. Hence, the dimension number for time in the first argument is 0.

     nl   = 32 ; for example
     ml   = 64
     ccra = escorc_n(ya(:,nl,ml),ya,0,0)   ; ccra(lat,lon)
     ccrb = escorc_n(yb(nl,ml,:),yb,0,2)   ; ccrb(lat,lon)
