Re: eofunc & eofunc_varimax from Paulo de Melo on 2006-07-13 (ncl-talk 2006 archive)

From: Paulo de Melo <pmelo_at_nyahnyahspammersnyahnyah>
Date: Fri, 14 Jul 2006 03:45:48 +0100

Hello Dennis,

please accept my delayed thanks. I performed some tests that I would like to
share with you.

To test the eigenvalues of (the covariance matrix of) X(S,T), where S
indicates spatial dimensions and T the time dimension, computed with eofunc
and eofcov, I used svd_lapack to compute the singular values of X/sqrt(nT-1).
The squared singular values should equal the eigenvalues. The result was:

- eofcov eigenvalues are equal to svd_lapack squared singular values

- eofunc eigenvalues are NOT equal to svd_lapack squared singular values

- the sum of all eofcov eigenvalues is equal to the sum of all svd_lapack
squared singular values (and is equal to the sum of all diagonal elements of
the covariance matrix of the data, so it is correct)

- the sum of all eofunc eigenvalues is NOT equal to the sum of all
svd_lapack squared singular values

- (however) the ratio of an eigenvalue to the sum of all eigenvalues yields
the same results for eofcov and eofunc (maybe that's why the pcvar of eofunc
is equal to the pcvar of eofcov)
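
The check described in the list above can be reproduced outside NCL. What
follows is only a minimal NumPy sketch, not the NCL code used for these tests:
the random matrix, its sizes, and all variable names are assumptions made for
illustration, and numpy.linalg stands in for svd_lapack and for the
eigenvalue solver.

```python
import numpy as np

rng = np.random.default_rng(0)
nS, nT = 6, 4                           # hypothetical sizes with nT < nS
X = rng.standard_normal((nS, nT))       # X(S,T): rows are space, columns are time
X = X - X.mean(axis=1, keepdims=True)   # remove the time mean at each point

# Spatial covariance matrix (nS x nS) and its eigenvalues, largest first.
C_s = X @ X.T / (nT - 1)
eig_s = np.sort(np.linalg.eigvalsh(C_s))[::-1]

# Singular values of X / sqrt(nT - 1); NumPy returns them in descending order.
sv = np.linalg.svd(X / np.sqrt(nT - 1), compute_uv=False)
sv2 = sv ** 2

# The min(nS, nT) leading eigenvalues should match the squared singular values,
# and both sums should equal the trace of the covariance matrix.
n = min(nS, nT)
eig_match = bool(np.allclose(eig_s[:n], sv2))
sum_match = bool(np.isclose(eig_s.sum(), np.trace(C_s))
                 and np.isclose(sv2.sum(), np.trace(C_s)))
print(eig_match, sum_match)
```

Comparing against the trace is a useful independent check, because the trace
does not depend on which solver produced the eigenvalues.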

You told me that, since nT < nS, eofunc used the (nT x nT) temporal covariance
matrix instead of the (nS x nS) spatial covariance matrix. However, as you
said, by the Eckart-Young theorem, the eigenvalues are unique: the nT
eigenvalues of the (nT x nT) cov. mat. are equal to the nT non-zero
eigenvalues of the (nS x nS) cov. matrix. That's why I think the eofunc
eigenvalues should be equal to the eofcov eigenvalues. MATLAB's eig
function, which computes the eigenvectors and eigenvalues of a matrix,
yields the same eigenvalues for both covariance matrices. Also, svd_lapack
yields the same singular values for X/sqrt(nT-1) and for
transpose(X)/sqrt(nT-1). Because of this, I don't understand the behavior of
eofunc: the eigenvalues and their total sum (which should represent the total
variance of the system) do not match those of eofcov.
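
The equality of the two eigenvalue sets can be sketched numerically as well.
This is a hedged NumPy illustration, not a statement about what eofunc does
internally: it assumes both matrices are built from the same time-centred X
and both are scaled by 1/(nT-1). The sizes and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
nS, nT = 8, 5                           # hypothetical sizes with nT < nS
X = rng.standard_normal((nS, nT))       # X(S,T)
X = X - X.mean(axis=1, keepdims=True)   # remove the time mean at each point

# Assumption: the same 1/(nT-1) scaling is applied to both products.
C_s = X @ X.T / (nT - 1)                # (nS x nS) matrix
C_t = X.T @ X / (nT - 1)                # (nT x nT) matrix

eig_s = np.sort(np.linalg.eigvalsh(C_s))[::-1]
eig_t = np.sort(np.linalg.eigvalsh(C_t))[::-1]

# The nT leading eigenvalues agree, the remaining nS - nT are numerically
# zero, so the totals agree too.
same_leading = bool(np.allclose(eig_s[:nT], eig_t))
extra_are_zero = bool(np.allclose(eig_s[nT:], 0.0))
same_total = bool(np.isclose(eig_s.sum(), eig_t.sum()))
print(same_leading, extra_are_zero, same_total)
```

If the two products were scaled differently, every eigenvalue and the total
would change by the same constant factor while each eigenvalue's share of the
total stayed the same. The sketch does not show whether that is what happens
in eofunc.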

All this led me to the conclusion that the values given by the eval attribute
of eofunc and their total sum have no (known) meaning when treated
separately. Only their ratio has. Please forgive me if this is an
absurd conclusion, but I am unable to find another explanation. Could you
please clarify this for me?

What about the pcvar_varimax attribute of eofunc_varimax? Maybe there is a
documentation error, but the attribute is still returned by the function. What
does it mean?

Best regards,

Paulo

On Tuesday 11 July 2006 05:29, Dennis Shea wrote:
> Hello,
>
> > I'm having problems with the EVAL attribute of the EOFUNC function. The
> > eigenvalues are different from those computed with the EOFCOV
> > function, which are correct. The % explained variance given by the PCVAR
> > attribute, which had a problem in NCL 4.2.0.a032, is correct. Note that
> > I'm now running NCL a033 which fixed the @pcvar bug.
>
> [1] There was more than a @pcvar bug.
> FYI: We were never able to find the bug with a032 eofunc.
>
> [2] It is quite possible that the eigenvalues calculated by
> eofcov and eofunc could be different.
>
> X(S,T) where S indicates Spatial dimensions and T the time dimension.
>
> eofcov always computed the spatial covariance matrix: [SxS]
> Using an LAPACK routine it computes S eigenvalues and returns
> the user specified number of eigenvalues. Note: this means that
> the system variance is spread over all S eigenvalues.
>
> Historically, people have derived EOFs directly from [SxS].
> However, as the spatial grids became larger, the time
> to derive the EOFs became much longer.
>
>
> eofunc uses a different approach. There is a theorem [Eckart-Young]
> that states that the number of unique eigenvalues is the
> min(S,T). eofunc solves the smaller covariance matrix
> [SxS] or [TxT]. Generally, T<S so, I speculate, eofunc
> solved the [TxT] matrix and then did a transformation
> to get the spatial EOFs. Note ... this means the
> variance is spread over T rather than S eigenvalues.
>
> note1: eofunc is usually much faster than eofcov
> because (often) it uses a much smaller covariance matrix.
>
> note2: NCL uses symmetric storage mode for the covariance
> matrix to save space.
>
> [3] eofcov eigenvalues "which are correct".
> Based on [2] both could be correct but different! :-)
> eofcov solves SxS, eofunc likely solved TxT
>
> The thing that 'saves-the-day' is that for geophysical data
> the eigenvalue spectrum is "red". Often they drop off quickly.
>
> > I'm also getting errors with the PCVAR_VARIMAX attribute of the
> > EOFUNC_VARIMAX function: the variances have absurd results and do not sum
> > to 100% (when I compute all the EOFs). Interestingly, when I rescale the
> > EOFs, prior to the rotation operation, they do sum to 100% but they are
> > not in decreasing order. (The scaling used is the one that results in EOF
> > loadings equal to the correlations between the data and the PC, that is,
> > each EOF loading is multiplied by the square root of the eigenvalue and,
> > if the covariance matrix was used, divided by the standard deviation of
> > the data.)
> >
> > The EOFUNC_VARIMAX documentation about the % explained variance is
> > contradictory: in the RETURN VALUE part it is said that it is "returned
> > as an attribute of the returned value called pcvar_varimax" and in the
> > DESCRIPTION part it is said that it "is not returned".
>
> I think there is a documentation error.
>
> The varimax functions have never returned % variance.
> I will check tomorrow.
>
> ---
>
> D
>
> _______________________________________________
> ncl-talk mailing list
> ncl-talk_at_ucar.edu
> http://mailman.ucar.edu/mailman/listinfo/ncl-talk
Received on Thu Jul 13 2006 - 20:45:48 MDT

This archive was generated by hypermail 2.2.0 : Mon Jul 17 2006 - 11:02:00 MDT