NCL Home > Documentation > Functions > Empirical orthogonal functions

eofunc

Compute empirical orthogonal functions (EOFs, aka: Principal Component Analysis).

Prototype

	function eofunc (
		data    : numeric,  
		neval   : integer,  
		optEOF  : logical   
	)

	return_val  :  numeric

Arguments

data

A multi-dimensioned array in which the rightmost dimension is the number of observations. Generally, this is the time dimension.

neval

A scalar integer that specifies the number of eigenvalues and eigenvectors to be returned. This is usually less than or equal to the minimum number of observations or number of variables.

optEOF

A logical variable to which various optional arguments may be assigned as attributes. These optional arguments alter the default behavior of the function. Must be set to True prior to setting the attributes which are assigned using the @ operator:

Return value

A multi-dimensional array containing normalized EOFs. The returned array will be of the same size as data with the rightmost dimension removed and an additional leftmost dimension of the same size as neval added. Double if data is double, float otherwise.

The return variable will have associated with it the following attributes:

These attributes can be accessed using the @ operator:
print(return_val@pcvar)
print(return_val@pcrit)

Description

Computes Empirical Orthogonal Functions (EOFs) via a covariance matrix or, optionally, via a correlation matrix. This is also known as Principal Component Analysis or Eigen Analysis. The EOFs are calculated using LAPACK's "dspevx" routine. Missing values are ignored when computing the covariance or correlation matrix. The returned values are normalized such that the sum of squares for each EOF pattern equals one. To denormalize the returned EOFs multiply by the square root of the associated eigenvalue.

This function differs from eofcov and eofcor in that it may transpose the input data array prior to computing the EOFs. If data is transposed, a linear transformation is applied to the EOFs of the transposed array prior to returning. The reason for using this approach is computational efficiency.

NOTE: The version of eofunc in Version 4.2.0.a032 had an error. The exact source of the error was never determined. Prior to release in a032, the eofunc function had been tested by five different people and all agreed it was correct. After release of a032, some users reported small differences with previous results. One user reported a major problem. As a result, the eofunc a032 documentation was modified to include a notice printed in red that recommended that users use the older eofcov or eofcor functions. The new eofunc included in a033 uses different code to calculate EOFs. The bug was fixed in version a033.

Comments on weighting observations

Generally, when performing and EOF analysis on observations over the globe or a portion of the globe, the values are weighted prior to calculating. This is usually required to account for the convergence of the meridions (area weighting) which lessens the impact of high-latitude grid points that represent a small area of the globe. Most frequently, the square root of the cosine of the latitude is used to compute the area weight. The square root is used to create a covariance matrix that reflects the area of each matrix element. If weighted in this manner, the resulting covariance values will include quantities calculated via:

[x*sqrt(cos(lat(x)))]*[y*sqrt(cos(lat(y)))] = x*y*sqrt(cos(lat(x))*sqrt(cos(lat(y))
Note that the covariance of a grid point with itself yields standard cosine weighting:
[x*sqrt(cos(lat(x)))]*[x*sqrt(cos(lat(x)))] = x^2 * cos(lat(x)).
Note on standard EOF analysis

Conventional EOF analysis yields patterns and time series which are both orthogonal. The derived patterns are a function of the domain. The results may produce patterns that are similar to physical modes within the the system. However, the procedure is strictly mathematical (not statistical) and is not based upon physics.

Use eofunc_Wrap if retention of metadata is desired.

See Also

eofunc_ts, eofunc_varimax

Examples

In the following, the attribute pcvar can be output via:

  print(ev@pcvar)             ; 1D vector of length "neval"

This attribute could also be used in graphics. For example, it is it could be used in a title.

  title = "%=" + ev@pcvar(1)

sprintf can be used to format the title more precisely:

  title = "%=" + sprintf("%5.2f", ev@pcvar(1) )
Example 1

Let x be two dimensional with dimensions variables (size = nvar) and time:

  neval  = 3                         ; calculate 3 EOFs out of 7 
  ev     = eofunc(x,neval,False)   ; ev(neval,nvar)

                                     ; print the percent-variance explained
  print("% var="+ ev@pcvar)
  
  option      = True
  option@jopt = 1                    ; use correlation matrix
  ev_cor = eofunc(x,neval,option)  ; ev_cor(neval,nvar)

                                     ; print the percent-variance explained
  print("% var="+ ev_cor@pcvar)
Example 2

Let x be three-dimensional with dimensions of time, lat, lon. Reorder x so that time is the rightmost dimension:

  y!0    = "time"                  ; name dimensions if not already done 
  y!1    = "lat"                   ; must be named to reorder
  y!2    = "lon"                   

  neval  = nvar                                  ; calculate all EOFs 
  ev     = eofunc(y(lat|:,lon|:,time|:),neval,False)   
  ; ev(neval,nlat,nlon)

                                   ; denormalize the EOFs [units dame as data]
  do ne=0,neval-1
     ev(ne,:,:) = ev(ne,:,:)*sqrt( ev@eval(ne) )
  end do
Example 3

Let z be four-dimensional with dimensions lev, lat, lon, and time:

  neval  = 3                       ; calculate 3 EOFs out of klev*nlat*mlon 
  ev     = eofunc(z,neval,False)      
; ev will be dimensioned neval, level, lat, lon
Example 4

Calculate the EOFs at every other point rather. Use of a temporary array is NOT necessary but it avoids having to reorder the array twice in this example:

  neval  = 5                          ; calculate 5 EOFs out of nlat*mlon 
  zTemp  = z(lat|::2,lon|::2,time|:)  ; reorder and use temporary array
  ev     = eofunc(zTemp,neval,False)   ; ev(neval,nlat/2,mlon/2)
Example 5

Let z be four-dimensional with dimensions level, lat, lon, time. Calculate the EOFs at one specified level:

  kl     = 3                               ; specify level
  neval  = 8                               ; calculate 8 EOFs out of nlat*mlon 
  ev     = eofunc(z(kl,:,:,:),neval,False)  
; ev will be dimensioned neval, lat, lon 
Example 6

Let z be four-dimensional with dimensions time, lev, lat, lon. Reorder x so that time is the rightmost dimension and calculate on one specified level:

  kl     = 3                             ; specify level
  neval  = 8                             ; calculate 8 EOFs out of nlat*mlon 
  zTemp  = z(lev|kl,lat|:,lon|:,time|:)   
  ev     = eofunc(zTemp,neval,False)      
; ev will be dimensioned neval, lat, lon
Example 7

Area weight the data prior to calculation. Let p be four-dimensional with dimensions lat, lon, and time. The array lat contains the latitudes.

; calculate the weights using the square root of the cosine of the latitude and
; also convert degrees to radians
  wgt = sqrt(cos(lat*0.01745329)) 
  
; reorder data so time is fastest varying                                      
  pt  = p(lat|:,lon|:,time|:)         ; (lat,lon,time)
  ptw = pt                            ; create an array with metadata

; weight each point prior to calculation. 
; conform is used to make wgt the same size as pt
  ptw = pt*conform(pt, wgt, 0)        
                                      
  evec= eofunc(ptw,neval,80.)   

                                   ; denormalize the EOFs
                                   ; print the % variance explained
  do ne=0,neval-1
     evec(ne,:,:) = evec(ne,:,:)*sqrt( evec@eval(ne) )  ; units dame as data

     print("% var="+ evec@pcvar(ne) )
  end do