
In connection with this post (see "Concept Three"), and one of its answers, I reviewed the usual setup of orthogonal projections: namely, we project a vector $\bf b$ onto the line spanned by a vector $\bf a$, and setting the dot product of the difference $\bf b -\lambda a$ with $\bf a$ to zero yields the projection matrix $\frac{\bf aa^\top}{\bf a^\top a}$. The denominator is a scalar, and it equals $1$ if $\vert \bf a \vert=1$. Moving on to matrices, we have $\bf A(A^\top A)^{-1}A^\top$. Now $\bf (A^\top A)^{-1}$ is no longer a scalar, but it kindly goes away if the columns of $\bf A$ are orthonormal vectors spanning the subspace we are projecting onto, leaving the beautiful and simple $\bf AA^\top$ form.
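A quick NumPy sketch of the two formulas above (my own illustration, not part of the linked post; the matrices are arbitrary random examples):

```python
import numpy as np

rng = np.random.default_rng(0)

# Rank-one case: project onto the line spanned by a.
a = rng.standard_normal(3)
P1 = np.outer(a, a) / (a @ a)          # a a^T / (a^T a)

# Matrix case: project onto the column space of A.
A = rng.standard_normal((5, 2))
P2 = A @ np.linalg.inv(A.T @ A) @ A.T  # A (A^T A)^{-1} A^T

# Both are idempotent (P @ P == P), as projections must be.
assert np.allclose(P1 @ P1, P1)
assert np.allclose(P2 @ P2, P2)

# With orthonormal columns (via QR), the middle factor drops out: P = Q Q^T.
Q, _ = np.linalg.qr(A)
assert np.allclose(Q @ Q.T, P2)        # same projector, simpler form
```

The last assertion shows concretely how $\bf (A^\top A)^{-1}$ "goes away": $Q$ spans the same column space as $A$, so $\bf QQ^\top$ is the same projector.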

But can we drop the "orthogonal" from (orthogonal) projection, for example, or introduce some other caveat, and still freely proclaim that any matrix of the form $\bf XX^\top$ is a projection matrix?

I see, for instance, statements along the lines of "... provided $\bf A$ is invertible ...", with $\left(\bf AA^\top\right)^2= \bf A\left(A^\top A\right) A^\top$ getting in the way of idempotence unless the columns of $\bf A$ are orthonormal. But does this close the case?
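The idempotence obstruction is easy to see numerically (my own check, with an arbitrary random matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 2))        # full column rank, columns not orthonormal

M = A @ A.T
# (A A^T)^2 = A (A^T A) A^T, which differs from A A^T unless A^T A = I.
assert np.allclose(M @ M, A @ (A.T @ A) @ A.T)
assert not np.allclose(M @ M, M)       # not idempotent for generic A

Q, _ = np.linalg.qr(A)                 # orthonormal columns
P = Q @ Q.T
assert np.allclose(P @ P, P)           # idempotent: a genuine projection
```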

1 Answer


If you assume $X$ has full column rank (i.e., $X^TX$ is invertible), then the condition that $XX^T$ is a projection matrix implies that the columns of $X$ are orthonormal.

Since the column spaces of $XX^T$ and $X$ are the same, if $XX^T$ is a projection matrix it must project onto the column space of $X$, so in particular we must have $(XX^T)X = X$. Multiply both sides by $X^T$ to get $X^TXX^TX = X^TX$, and then multiply both sides by $(X^TX)^{-1}$ to get $X^TX=I$, so the columns of $X$ are orthonormal.

Finally, note that if $XX^T$ is a projection matrix, it must be an orthogonal projector, since $XX^T$ is symmetric.
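The chain of claims in this answer can be checked numerically; here is a sketch of my own, using a random matrix with orthonormalized columns:

```python
import numpy as np

rng = np.random.default_rng(2)
X, _ = np.linalg.qr(rng.standard_normal((5, 3)))  # orthonormal columns

P = X @ X.T
assert np.allclose(P, P.T)               # symmetric -> orthogonal projector
assert np.allclose(P @ P, P)             # idempotent
assert np.allclose(P @ X, X)             # fixes the column space of X
assert np.allclose(X.T @ X, np.eye(3))   # X^T X = I, as derived above
```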

  • Do you have a reference, or a quick reminder, as to why "the column space $XX^T$ and $X$ are the same"? And can we say, then, that $AA^T$ is only a projection matrix provided that the column space of $A$ is composed of orthonormal vectors? – Antoni Parellada Sep 09 '16 at 04:07
  • 1
    This is a standard result. Let $n$ be the number of columns in $X^T$ and $XX^T$. Now let $u$ be an element of the null space of $XX^T$; then $XX^Tu=0$, from which it follows that $u^TXX^Tu=0$, i.e., $\|X^Tu\|^2 = 0$, so $u$ lies in the null space of $X^T$. Clearly every $u$ in the null space of $X^T$ lies in the null space of $XX^T$, so the null spaces of $X^T$ and $XX^T$ have the same dimension, say, $r$. From the rank-nullity theorem, $\texttt{rank}(XX^T) + r = n$ and $\texttt{rank}(X^T) + r = n$, which implies $\texttt{rank}(XX^T) = \texttt{rank}(X^T).$ – Arin Chaudhuri Sep 09 '16 at 04:17
  • $\text{rank}(XX^T) = \text{rank}(X^T)$, but the original issue was $\text{col space}(XX^T) = \text{col space}(X)$. How do you transition? And also in your last comment, do you mean "independent" and "orthonormal"? – Antoni Parellada Sep 09 '16 at 04:26
  • 1
    What you can say is this: if $A$ is a matrix with independent columns, then $AA^T$ is a projection matrix iff the columns of $A$ are orthonormal. – Arin Chaudhuri Sep 09 '16 at 04:26
  • $\texttt{rank}(X^T) = \texttt{rank}(X)$, and clearly the column space of $XX^T$ is contained in the column space of $X$; this inclusion, along with the equality of dimensions, leads to the result. – Arin Chaudhuri Sep 09 '16 at 04:27
  • Why is it clear? Equal rank implies the same number of pivots, or of independent columns, but not the same column vectors... What am I missing? – Antoni Parellada Sep 09 '16 at 04:28
  • 1
    A point in the column space of $X$ is of the form $Xu$ for some $u$, and a point in the column space of $XX^T$ is of the form $XX^Tv$ for some $v$, i.e., $X(X^Tv)$, which is again of the form $Xu$. – Arin Chaudhuri Sep 09 '16 at 04:31
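The rank and column-space claims from the comment thread can also be sanity-checked numerically; a sketch of my own (random example matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((6, 3))

# rank(X X^T) == rank(X^T) == rank(X), per the rank-nullity argument above.
assert np.linalg.matrix_rank(X @ X.T) == np.linalg.matrix_rank(X)

# Each column of X X^T is X @ (something), so col(X X^T) is contained in col(X);
# equal dimensions then force the two spaces to coincide. Check that every
# column of X X^T is solved exactly by X u (least-squares residuals ~ 0).
resid = np.linalg.lstsq(X, X @ X.T, rcond=None)[1]
assert np.all(resid < 1e-10)
```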