A is a matrix (m rows, n columns); each row is an object and each column is a feature (a dimension). Typically, I compute the PCA from the covariance matrix, that is A'A, where A' is the transpose of A.
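For reference, the usual route I mean looks roughly like this in NumPy (a minimal sketch with my own names, not code from the book):

import numpy as np

def pca_covariance(A):
    mean_A = A.mean(axis=0)
    Ac = A - mean_A                          # center each feature
    e, V = np.linalg.eigh(np.dot(Ac.T, Ac))  # eigendecompose the n x n matrix A'A
    order = np.argsort(e)[::-1]              # largest variance first
    return V[:, order], e[order], mean_A     # principal directions as columns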
Today I read a book that presents a useful trick for computing PCA: if n >> m, we can instead compute the eigenvectors of the matrix AA', which can save a lot of memory. Here is the code from the book:
from numpy import *  # needed for dot, linalg.eigh and sqrt used below

def pca(X):
    """
    Principal Component Analysis
    input: X, matrix with training data stored as flattened arrays in rows
    return: projection matrix (with important dimensions first), variance
            and mean.
    """
    # get dimensions
    num_data, dim = X.shape
    # center data
    mean_X = X.mean(axis=0)
    X = X - mean_X
    # PCA - compact trick used
    M = dot(X, X.T)         # covariance matrix, AA', not the A'A like usual
    e, EV = linalg.eigh(M)  # compute eigenvalues and eigenvectors
    tmp = dot(X.T, EV).T    # this is the compact trick
    V = tmp[::-1]           # reverse since last eigenvectors are the ones we want
    S = sqrt(e)[::-1]       # reverse since eigenvalues are in increasing order
    for i in range(V.shape[1]):
        V[:, i] /= S        # What for?
    # return the projection matrix, the variance and the mean
    return V, S, mean_X
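For what it's worth, here is a quick numerical check (my own sketch, not from the book) that the compact trick recovers the same leading direction as the usual A'A route:

import numpy as np

# If AA'v = lambda*v, then A'(AA')v = lambda*(A'v), so A'v is an
# eigenvector of A'A with the same non-zero eigenvalue.
np.random.seed(0)
A = np.random.randn(5, 50)        # m = 5 objects, n = 50 features, so n >> m
A = A - A.mean(axis=0)

_, EV_small = np.linalg.eigh(np.dot(A, A.T))   # m x m problem (the trick)
_, EV_big = np.linalg.eigh(np.dot(A.T, A))     # n x n problem (the usual way)

mapped = np.dot(A.T, EV_small)                 # lift the small eigenvectors to R^n
u_trick = mapped[:, -1] / np.linalg.norm(mapped[:, -1])
u_usual = EV_big[:, -1]
print(abs(np.dot(u_trick, u_usual)))           # ~1.0: same leading direction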
Now I understand the algebra behind this useful trick, but something confuses me: the for-loop. Why divide V by S? Is it to normalize V to unit length?