I've been working on implementing a Singular Value Decomposition (SVD) algorithm from scratch in Python, without using the np.linalg.svd function. My goal is to understand the underlying mechanics of SVD. However, I'm encountering some discrepancies in my results compared to np.linalg.svd, specifically regarding the signs of the singular vectors. Interestingly enough, my approach works when A is 4x3 but not when it is 3x3.
My Goal:
My goal is to create a function that uses a stable, reliable mathematical approach to compute an inverse or pseudoinverse for solving a least-squares problem. One limitation: I am not allowed to use any existing math functions, so I will later replace np.linalg.norm and the other NumPy calls with my own implementations.
I tried to implement the Moore-Penrose inverse directly, but there were too many special cases and issues depending on the matrix properties. According to some papers, going through the SVD offers an easy way out.
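To make the goal concrete, here is a minimal sketch of the least-squares solve I am aiming for, still using np.linalg.svd as a placeholder for my own routine (the pinv_via_svd name and the rcond cutoff are just working choices of mine):

import numpy as np

def pinv_via_svd(A, rcond=1e-12):
    # pseudoinverse A+ = V diag(1/sigma) U^T, with tiny singular values zeroed
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_inv = np.zeros_like(s)
    large = s > rcond * s[0]  # s[0] is the largest singular value
    s_inv[large] = 1.0 / s[large]
    return Vt.T @ (s_inv[:, None] * U.T)

# least-squares solution of A x = b
A = np.array([[7.0, 6.0, 5.0], [6.0, 5.0, 4.0], [5.0, 4.0, 3.0]])
b = np.array([10.0, -3.0, 7.0])
x = pinv_via_svd(A) @ b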
My Approach:
I've implemented a custom QR decomposition (my_qr) and an eigenvalue/eigenvector routine (eig_custom), which are then used to compute the SVD. Here's a simplified version of my code:
import numpy as np
def my_qr(A, standard=False):
    m, n = A.shape
    Q = np.eye(m)
    R = np.copy(A)
    if m == n:
        end = n - 1
    else:
        end = n
    for i in range(0, end):
        H = np.eye(m)
        a = R[i:, i]  # partial column vector
        norm_a = np.linalg.norm(a)
        if a[0] < 0.0:
            norm_a = -norm_a  # choose the sign that avoids cancellation
        v = a / (a[0] + norm_a)
        v[0] = 1.0
        h = np.eye(len(a))  # Householder reflection
        h -= (2 / np.dot(v, v)) * np.matmul(v[:, None], v[None, :])
        H[i:, i:] = h
        Q = np.matmul(Q, H)
        R = np.matmul(H, R)
    if standard:
        # flip signs so that R has a non-negative diagonal
        S = np.zeros((n, n))
        for i in range(n):
            if R[i][i] < 0.0:
                S[i][i] = -1.0
            else:
                S[i][i] = +1.0
        Q = np.matmul(Q, S)
        R = np.matmul(S, R)
    return Q, R
def is_upper_tri(A, tol):
    n = len(A)
    for i in range(0, n):
        for j in range(0, i):
            if np.abs(A[i][j]) > tol:
                return False
    return True

def eig_custom(A, num_iterations: int = 100):
    # plain (unshifted) QR iteration: Ak converges towards an upper-triangular
    # matrix whose diagonal holds the eigenvalues
    Ak = np.copy(A)
    n = A.shape[0]
    eigenvectors = np.eye(n)
    for cnt in range(num_iterations):
        Q, R = my_qr(Ak)
        Ak = R @ Q
        eigenvectors = eigenvectors @ Q
        if is_upper_tri(Ak, 1.0e-16):
            break
    eigenvalues = np.diag(Ak)
    return eigenvalues, eigenvectors
def calculU(M):
    B = np.dot(M, M.T)
    eigenvalues, eigenvectors = eig_custom(B)
    ncols = np.argsort(eigenvalues)[::-1]
    return eigenvectors[:, ncols]

def calculVt(M):
    B = np.dot(M.T, M)
    eigenvalues, eigenvectors = eig_custom(B)
    ncols = np.argsort(eigenvalues)[::-1]
    return eigenvectors[:, ncols].T

def calculSigma(M):
    if np.size(np.dot(M, M.T)) > np.size(np.dot(M.T, M)):
        newM = np.dot(M.T, M)
    else:
        newM = np.dot(M, M.T)
    eigenvalues, eigenvectors = eig_custom(newM)
    eigenvalues = np.sqrt(np.abs(eigenvalues))
    # sorting in descending order, as the svd function does
    return eigenvalues[::-1]
def svd(A, num_iterations=1000):
    # compute eigenvectors and eigenvalues of A A^T and A^T A
    eigvals_AAt, eigvecs_AAt = eig_custom(A @ A.T, num_iterations)
    eigvals_AAt_ref, eigvecs_AAt_ref = np.linalg.eig(A @ A.T)  # reference values, for debugging only
    eigvals_AtA, eigvecs_AtA = eig_custom(A.T @ A, num_iterations)
    eigvals_AtA_ref, eigvecs_AtA_ref = np.linalg.eig(A.T @ A)  # reference values, for debugging only
    singular_values = np.sqrt(np.abs(eigvals_AtA))
    sorted_indices = np.argsort(singular_values)[::-1]
    singular_values = singular_values[sorted_indices]
    U = eigvecs_AAt[:, sorted_indices]
    V = eigvecs_AtA[:, sorted_indices]
    return U, singular_values, V.T
def main():
    mat = []
    for i in range(3):
        mat.append(np.array([7.0, 6.0, 5.0]) - i)
    A = np.array(mat, dtype=float)
    #print(A)
    b = np.array([10, -3, 7])
    x = A @ b
    print(f"Reference Matrix:\n{A}")
    print(f"\n{'-'*50}\nReference SVD decomp:")
    U_ref, S_ref, V_ref = np.linalg.svd(A.copy(), full_matrices=False)
    print(f"\n\nU:\n{U_ref}\nS:\n{S_ref}\nV:{V_ref}\n\n")
    print("Verification [U @ Sigma @ V]:")
    print(U_ref @ np.diag(S_ref) @ V_ref)
    print(f"\n{'-'*50}\nMy SVD decomp:")
    U, S, V = svd(A.copy())
    print(f"\nU:\n{U}\n\nS:\n{S}\n\nV:{V}\n\n")
    print("Verification [U @ Sigma @ V]:")
    print(U @ np.diag(S) @ V)
    print(f"\n{'-'*50}\nDifferent custom SVD decomp:")
    U, S, V = calculU(A), calculSigma(A), calculVt(A)
    print(f"\nU:\n{U}\n\nS:\n{S}\n\nV:{V}\n\n")
    newSigma = np.zeros(A.T.shape)
    newSigma[:S.size, :S.size] = np.diag(S[::-1])
    A_res = U @ newSigma.T @ V
    print("Verification [U @ Sigma @ V]:")
    print(A_res)

if __name__ == "__main__":
    main()
The complete code, along with the custom QR decomposition and eigenvalue computation, can be found attached.
Issue Encountered:
While my custom SVD implementation produces singular values close to those generated by np.linalg.svd, the singular vectors (the U and V matrices) often differ in sign. As a result, the matrix reconstructed from my custom SVD differs from the original, while np.linalg.svd reconstructs it perfectly. Here is the full output for the 3x3 case:
Reference Matrix:
[[7. 6. 5.]
[6. 5. 4.]
[5. 4. 3.]]
Reference SVD decomp:
U:
[[-0.6813193 0.60756674 0.40824829]
[-0.57017342 -0.09075023 -0.81649658]
[-0.45902754 -0.7890672 0.40824829]]
S:
[1.53898669e+01 3.89866919e-01 8.98295192e-16]
V:[[-0.6813193 -0.57017342 -0.45902754]
[-0.60756674 0.09075023 0.7890672 ]
[ 0.40824829 -0.81649658 0.40824829]]
Verification [U @ Sigma @ V]:
[[7. 6. 5.]
[6. 5. 4.]
[5. 4. 3.]]
My SVD decomp:
U:
[[ 0.6813193 -0.60756674 0.40824829]
[ 0.57017342 0.09075023 -0.81649658]
[ 0.45902754 0.7890672 0.40824829]]
S:
[1.53898669e+01 3.89866919e-01 2.26925653e-08]
V:[[ 0.6813193 0.57017342 0.45902754]
[-0.60756674 0.09075023 0.7890672 ]
[ 0.40824829 -0.81649658 0.40824829]]
Verification [U @ Sigma @ V]:
[[7.28782888 5.95700795 4.62618703]
[5.95700795 5.00642159 4.0558352 ]
[4.62618703 4.0558352 3.48548338]]
Different custom SVD decomp:
U:
[[ 0.6813193 -0.60756674 0.40824829]
[ 0.57017342 0.09075023 -0.81649658]
[ 0.45902754 0.7890672 0.40824829]]
S:
[2.26925653e-08 3.89866919e-01 1.53898669e+01]
V:[[ 0.6813193 0.57017342 0.45902754]
[-0.60756674 0.09075023 0.7890672 ]
[ 0.40824829 -0.81649658 0.40824829]]
Verification [U @ Sigma @ V]:
[[7.28782888 5.95700795 4.62618703]
[5.95700795 5.00642159 4.0558352 ]
[4.62618703 4.0558352 3.48548338]]
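Looking at this output, my factors agree with NumPy's except for the signs of individual columns of U and rows of V (and the value reported for the numerically zero singular value). As far as I understand, eigenvectors are only determined up to sign, so computing U and V from two independent eigendecompositions gives no guarantee that the signs pair up. One workaround I am experimenting with is to compute only V (from A.T @ A) and then derive each column of U as u_i = A v_i / sigma_i, which forces consistent signs wherever sigma_i is nonzero. A minimal sketch reusing eig_custom from above (svd_sign_consistent is just my working name):

def svd_sign_consistent(A, tol=1e-10):
    # eigendecomposition of A^T A gives V and the squared singular values
    eigvals, V = eig_custom(A.T @ A)
    order = np.argsort(eigvals)[::-1]
    s = np.sqrt(np.abs(eigvals[order]))
    V = V[:, order]
    # derive U from A and V so the signs pair up: u_i = A v_i / sigma_i
    U = np.zeros((A.shape[0], len(s)))
    for i in range(len(s)):
        if s[i] > tol:
            U[:, i] = (A @ V[:, i]) / s[i]
    return U, s, V.T

For a rank-deficient A (like the 3x3 example above, whose smallest singular value is numerically zero), the columns of U belonging to zero singular values are left as zeros and would still have to be completed to an orthonormal basis, so this is a sketch rather than a finished routine.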
Questions:
- Could the discrepancies in the signs of the singular vectors be the reason for the imperfect reconstruction of the original matrix?
- Is there a specific step in the QR decomposition or the eigenvalue computation where I should pay extra attention to the signs?
- Are there any known best practices or adjustments to ensure the signs align more closely with those produced by standard libraries like NumPy? (See the sketch after this list.)
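For the last question: one convention used elsewhere (for example, scikit-learn's svd_flip helper) makes the sign choice deterministic by flipping each column of U so that its largest-magnitude entry is positive, and flipping the matching row of V^T the same way so the product is unchanged. A minimal sketch of that idea (flip_signs is my name for it):

def flip_signs(U, Vt):
    # flip each (column of U, row of Vt) pair so that the largest-magnitude
    # entry of each column of U is positive; U @ np.diag(s) @ Vt is unchanged
    max_rows = np.argmax(np.abs(U), axis=0)
    signs = np.sign(U[max_rows, np.arange(U.shape[1])])
    signs[signs == 0] = 1.0
    return U * signs, Vt * signs[:, None]

Note that this only standardizes the sign choice; it does not by itself repair an inconsistent pairing between independently computed U and V.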
Any insights or suggestions on how to resolve these discrepancies and improve the accuracy of my custom SVD implementation would be greatly appreciated!
Thank you!
Comments:

For the traditional SVD, we have $A = U\Sigma V^T$. So, for your verification, you should be calculating U @ Sigma @ V.T instead of U @ Sigma @ V; it is U @ Sigma @ V.T that matches the original matrix. – Ben Grossmann Jan 15 '24 at 00:27

Also, if you take a look at the results, they are the same except for the signs. – Momo Jan 15 '24 at 11:36

np.linalg.eig(A @ A.T)[1]
array([[-0.6813193 , -0.60756674,  0.40824829],
       [-0.57017342,  0.09075023, -0.81649658],
       [-0.45902754,  0.7890672 ,  0.40824829]])

np.linalg.svd(A.copy(), full_matrices=False)[0]
array([[-0.6813193 ,  0.60756674,  0.40824829],
       [-0.57017342, -0.09075023, -0.81649658],
       [-0.45902754, -0.7890672 ,  0.40824829]])

– Momo Jan 15 '24 at 12:02