A is a matrix (m rows, n columns); each row is an object and each column is a feature (a dimension). Typically, I compute the PCA from the covariance matrix, that is A'A, where A' is the transpose of A.
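For reference, the usual route I mean looks roughly like this in NumPy (a minimal sketch with my own names, not code from the book):

import numpy as np

def pca_covariance(A):
    mean_A = A.mean(axis=0)
    Ac = A - mean_A                          # center each feature
    e, V = np.linalg.eigh(np.dot(Ac.T, Ac))  # eigendecompose the n x n matrix A'A
    order = np.argsort(e)[::-1]              # largest variance first
    return V[:, order], e[order], mean_A     # principal directions as columns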
Today I read a book that presents a useful trick for computing PCA: if n >> m, we can instead compute the eigenvectors of the matrix AA', which can save a lot of memory. Here is the code from the book:
from numpy import *  # needed for dot, linalg.eigh and sqrt used below

def pca(X):
    """
    Principal Component Analysis
    input: X, matrix with training data stored as flattened arrays in rows
    return: projection matrix (with important dimensions first), variance
            and mean.
    """
    # get dimensions
    num_data, dim = X.shape
    # center data
    mean_X = X.mean(axis=0)
    X = X - mean_X
    # PCA - compact trick used
    M = dot(X, X.T)         # covariance matrix, AA', not the A'A like usual
    e, EV = linalg.eigh(M)  # compute eigenvalues and eigenvectors
    tmp = dot(X.T, EV).T    # this is the compact trick
    V = tmp[::-1]           # reverse since last eigenvectors are the ones we want
    S = sqrt(e)[::-1]       # reverse since eigenvalues are in increasing order
    for i in range(V.shape[1]):
        V[:, i] /= S        # What for?
    # return the projection matrix, the variance and the mean
    return V, S, mean_X
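For what it's worth, here is a quick numerical check (my own sketch, not from the book) that the compact trick recovers the same leading direction as the usual A'A route:

import numpy as np

# If AA'v = lambda*v, then A'(AA')v = lambda*(A'v), so A'v is an
# eigenvector of A'A with the same non-zero eigenvalue.
np.random.seed(0)
A = np.random.randn(5, 50)        # m = 5 objects, n = 50 features, so n >> m
A = A - A.mean(axis=0)

_, EV_small = np.linalg.eigh(np.dot(A, A.T))   # m x m problem (the trick)
_, EV_big = np.linalg.eigh(np.dot(A.T, A))     # n x n problem (the usual way)

mapped = np.dot(A.T, EV_small)                 # lift the small eigenvectors to R^n
u_trick = mapped[:, -1] / np.linalg.norm(mapped[:, -1])
u_usual = EV_big[:, -1]
print(abs(np.dot(u_trick, u_usual)))           # ~1.0: same leading direction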
Now I understand the algebra behind this useful trick, but something confuses me: the for-loop. Why divide V by S? Is it to normalize V to unit length?