
For whatever reason, my brain is pondering the (re-)invention of a Gram-Schmidt alternative for orthonormalization of a subspace basis. If we start with

$$v_1 = \begin{pmatrix}7\\0\\0\end{pmatrix}, \quad v_2 = \begin{pmatrix}2\\0\\3\end{pmatrix}, \quad v_3 = \begin{pmatrix}4\\5\\6\end{pmatrix}$$

and begin applying Gram-Schmidt to find the orthonormal vectors

$$q_1 = \begin{pmatrix}1\\0\\0\end{pmatrix}, \quad q_2 = \begin{pmatrix}0\\0\\1\end{pmatrix}$$

then we can use the definitions of the dot product and vector length to write three equations, $q_1 \cdot q_3 = 0$, $q_2 \cdot q_3 = 0$, and $\|q_3\| = 1$, and solve for every element of $q_3$, which turns out to be

$$q_3 = \begin{pmatrix}0\\1\\0\end{pmatrix}$$

without needing to project onto the plane spanned by $q_1$ and $q_2$. However, this does not appear sufficient when working with proper subspaces. Consider the similar problem where

$$v_1 = \begin{pmatrix}-1\\0\\1\end{pmatrix}, \quad v_2 = \begin{pmatrix}-1\\1\\0\end{pmatrix}$$

We can normalize $v_1$ to find that

$$q_1 = \begin{pmatrix} -\sqrt{2}/2 \\ 0 \\ \sqrt{2}/2 \end{pmatrix}$$

but can only follow up with two equations using the above method. The result is a circle of unit vectors orthogonal to $q_1$, two of which lie in the plane spanned by $v_1$ and $v_2$. Projecting onto the plane would be the Gram-Schmidt thing to do. However, as I picture this circle, it seems that the closest vector on the circle to any possible $v_2$ in the plane, where $v_2$ is linearly independent of $v_1$, is always one of the two intersections between the circle and the plane. So if we could somehow find the equation of the circle as a function of $x$, $y$, and $z$, and treat the coordinates of $v_2$ as a data point, a least-squares approximation should produce an appropriate $q_2$ by finding the vector on the circle closest to $v_2$. Is this idea on the right track toward an alternative way of finding the final orthogonal vector of a subspace basis (in which case we might then consider how to extend it to replace the earlier steps of Gram-Schmidt as well), and if so, how can we solve for the equation of the circle?
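For reference, here is a minimal numpy check of the first example (the variable names and the SVD route are my own choices): the two orthogonality equations say that $q_3$ spans the nullspace of the matrix with rows $q_1$ and $q_2$, and the length condition only fixes the scale.

```python
import numpy as np

# First example: q1 and q2 are already orthonormal.
q1 = np.array([1.0, 0.0, 0.0])
q2 = np.array([0.0, 0.0, 1.0])

# q1 . q3 = 0 and q2 . q3 = 0 say q3 spans the nullspace of the
# 2x3 matrix with rows q1 and q2; the SVD returns an orthonormal
# basis, so ||q3|| = 1 comes for free (up to sign).
A = np.vstack([q1, q2])
_, _, vt = np.linalg.svd(A)
q3 = vt[-1]

print(q3)  # [0. 1. 0.], matching the q3 above
```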

user10478
    It appears that repeated roots cause difficulties. Reminds me of the fact that a repeated root in the characteristic polynomial for $a_{n+1}=ua_n+va_{n-1}$ causes the solutions to be $r^n$ and $nr^n$ instead of $r^n$ and $s^n$. – marty cohen Feb 21 '18 at 18:52
  • Your post does end with a question mark, but asking "Am I on the right track?" would be easier to answer if your goal were a bit clearer. Presented with only one example, it seems less an alternative than a restriction to three vectors in three dimensions. – hardmath Feb 21 '18 at 19:14
  • Sorry, but I haven't understood your idea. – user Feb 21 '18 at 21:07

1 Answer


No. You just found a slightly simpler (though not necessarily shorter or easier) last step for Gram-Schmidt, one that only works if your subspace is the entire space.

The reason it seems more complicated in the second example is that you actually did a projection in the first example to get the second orthonormal vector; it just happened to look trivial. The dot products in the first example seem simpler than the last step of Gram-Schmidt, but they amount to finding the nullspace of an $(n-1) \times n$ matrix for a vector in $\mathbb{R}^n$.

Also notice that you can only get away with the dot products when you are projecting onto a line, since then the direction can only be one of two things, and the length will be set to one anyway. This is the problem you ran into in the second example: you could only narrow the possible answers down to a plane, not a line. You can find that subspace (in this case, the plane containing your unit circle) by making the known vectors the rows of a matrix and solving for the nullspace, which is their orthogonal complement.
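Here is a minimal sketch of that nullspace computation on your second example (assuming numpy; the SVD is just one standard way to get the orthogonal complement):

```python
import numpy as np

# Second example: q1 is the normalized v1.  The unit circle of
# candidates lies in the nullspace of the 1x3 matrix [q1].
q1 = np.array([-1.0, 0.0, 1.0]) / np.sqrt(2.0)

_, _, vt = np.linalg.svd(q1[None, :])
plane = vt[1:]     # two orthonormal rows spanning that plane

print(plane @ q1)  # ~[0, 0]: every unit vector in this plane is on the circle
```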

The problem is that finding the final unit vector by least squares is literally the same idea as Gram-Schmidt:

In finding the least-squares solution, you essentially have a subspace (the column space of $A$) and a vector $b$ not in that subspace. The idea is that once you have an error vector that is perpendicular to the column space of $A$, you can't decrease the error further by adding another vector from that column space, since the hypotenuse of a right triangle is its longest side (hypotenuse = new error; legs = the perpendicular error and the vector from the column space you added).

So we say that the approximation is $Ax$ (which, by definition of a subspace, can be any vector in the column space of $A$ - try expanding it), and the error vector is $Ax - b$. To say that the error is perpendicular to the column space of $A$, requiring $A^T(Ax - b) = 0$ is enough, since it means that each dot product between a column of $A$ and the error vector is zero.
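A minimal numeric illustration of the normal equations, with a toy $A$ and $b$ of my own choosing:

```python
import numpy as np

# A full-column-rank matrix and a vector b outside its column space.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

# Normal equations: A^T A x = A^T b.
x = np.linalg.solve(A.T @ A, A.T @ b)
error = A @ x - b

print(A.T @ error)  # ~[0, 0]: the error is perpendicular to every column of A
```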

When projecting onto a subspace, the idea is that the error vector between your projection and the vector being projected is perpendicular to the subspace you're projecting onto, since then adding any other vector from that subspace would only increase the error. It's the same thing as a least-squares regression, with the subspace being the one you're projecting onto and $b$ the vector being projected.
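And the same machinery on your second example: projecting $v_2$ onto the line through $v_1$ is a one-column least-squares problem, and subtracting that projection recovers the Gram-Schmidt step (a sketch, assuming numpy):

```python
import numpy as np

v1 = np.array([-1.0, 0.0, 1.0])
v2 = np.array([-1.0, 1.0, 0.0])

A = v1[:, None]                         # subspace = span{v1}, as one column
x = np.linalg.solve(A.T @ A, A.T @ v2)  # least-squares coefficient
p = A @ x                               # projection of v2 onto span{v1}

q2 = (v2 - p) / np.linalg.norm(v2 - p)  # Gram-Schmidt's second unit vector
print(q2)                               # (-1, 2, -1)/sqrt(6)
```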

Introduction to Linear Algebra by Gilbert Strang, Chapter 4

atreju