Why does $\operatorname{null}(A) = \operatorname{null}(A^TA)$, intuitively?

Question

It's easy to show that the nullspace of $A$ and the nullspace of $A^TA$ are the same.

But intuitively what does that mean? Or maybe the better question to ask first is, intuitively how does $A^TA$ relate to $A$?

See this related question for the image. This explains, maybe better than "intuitively", the relation of $A$ to $AA^T$. — Dietrich Burde, Dec 07 '16 at 20:08
Sorry, I'm talking about real matrices. Though I think it should still hold for complex matrices if we interpret $^T$ as the conjugate transpose. — Bobbie D, Dec 07 '16 at 20:11
I find the question "intuitively how does $A^TA$ relate to $A$" a bit vague. What do you have in mind? — Dietrich Burde, Dec 07 '16 at 20:14
@DietrichBurde I don't know exactly. Maybe there's some simple geometric relationship that I'm unaware of or something. If there's no such clear relationship between the two then feel free to just focus on the first question. — Bobbie D, Dec 07 '16 at 20:15

Ben Grossmann · Accepted Answer · 2016-12-07T20:23:29.547

The most intuition-friendly method is to compare the nullspaces directly. Remember that if $Ax = b$ is a (possibly overdetermined) system of equations, then $A^TAx = A^Tb$ describes the least-squares solution to that system of equations. That is, any solution to $A^TAx = A^Tb$ minimizes $\|Ax - b\|$.
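As a quick numerical sanity check of the least-squares claim, here is a minimal sketch assuming NumPy is available. The matrix `A` and vector `b` are made-up illustration data, not from the question. It solves the normal equations directly and compares the result with NumPy's own least-squares routine.

```python
import numpy as np

# Hypothetical overdetermined system: 4 equations, 2 unknowns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0, 4.0])

# Solve the normal equations A^T A x = A^T b directly.
# (This works here because A has full column rank, so A^T A is invertible.)
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Reference least-squares solution computed by NumPy.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

# The two solutions should agree up to rounding.
agree = np.allclose(x_normal, x_lstsq)

# The residual A x - b should be orthogonal to the columns of A.
residual_orthogonal = np.allclose(A.T @ (A @ x_normal - b), 0.0)
print(agree, residual_orthogonal)
```

The orthogonality check restates the normal equations: $A^T(Ax - b) = 0$.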

Along those lines: if $x$ is in the nullspace of $A^TA$, then it is a solution to $A^TAx = A^T0$, which means it is a least-squares solution to $Ax = 0$. So, $x$ minimizes $\|Ax - 0\| = \|Ax\|$. However, the minimum of $\|Ax\|$ is clearly $0$ (it is attained at $x = 0$), so it must be that $Ax = 0$. So, if $x$ is in the nullspace of $A^TA$, then it must be in the nullspace of $A$.

There are some nice descriptions of $A^TA$ itself. In particular, polar decomposition tells us that $A = U\sqrt{A^TA}$ for some matrix $U$ satisfying $U^TU = I$. Since $\sqrt{A^TA}$ is a positive semidefinite matrix, it is a "pure squish/stretch" of space along perpendicular axes. Since $U^TU = I$, $U$ is a mapping into (possibly higher dimensional) space that rotates or reflects, but doesn't distort.

What we get out of all this is that $A^TA$ encodes all of the stretching/squishing that $A$ does. If a vector $x$ is in the kernel of $A$, then the corresponding axis is "squished to $0$", which means that $A^TAx$ must also be zero. Conversely, if $A^TAx = 0$, then $A$ squishes $x$ to zero, which means that $Ax = 0$.
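The polar-decomposition picture can be checked numerically. The following is a rough sketch, assuming NumPy and a tall matrix (at least as many rows as columns). It builds $U$ and $\sqrt{A^TA}$ from a thin SVD, which is one standard construction and not necessarily the one intended above. The rank-deficient `A` and the vector `x` are hypothetical examples.

```python
import numpy as np

# Hypothetical rank-deficient matrix: the second column is twice the first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
x = np.array([2.0, -1.0])  # chosen so that A @ x = 0

# Thin SVD: A = W @ diag(s) @ Vt, where W has orthonormal columns.
W, s, Vt = np.linalg.svd(A, full_matrices=False)

P = Vt.T @ np.diag(s) @ Vt  # symmetric PSD square root of A^T A
U = W @ Vt                  # the "rotate/reflect" factor

is_polar = np.allclose(U @ P, A)                # A = U sqrt(A^T A)
is_isometry = np.allclose(U.T @ U, np.eye(2))   # U^T U = I
is_sqrt = np.allclose(P @ P, A.T @ A)           # P^2 = A^T A
# x is squished to zero by both the stretch factor P and by A itself.
kills_x = bool(np.allclose(P @ x, 0.0) and np.allclose(A @ x, 0.0))
print(is_polar, is_isometry, is_sqrt, kills_x)
```

Because $U$ does not distort lengths, $\|Ax\| = \|\sqrt{A^TA}\,x\|$, so the two maps send exactly the same vectors to zero.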

+1. I think our answers complement each other pretty nicely. — Ian, Dec 07 '16 at 20:26
Thank you! This was an incredible perspective on OLS. I wonder if there is any formal "squish/stretch" category, or Wikipedia entry... — Antoni Parellada, Dec 07 '16 at 22:43
@Ian I agree. The "4 subspaces" perspective is important to emphasize — Ben Grossmann, Dec 07 '16 at 23:49

Ian · Answer 2 · 2016-12-07T20:35:24.603

There are a bunch of different answers to this question. My preference is based on the "four fundamental subspaces" which are the focus of chapter 2 of Gilbert Strang's linear algebra book. Here the relevant relationship is that the null space of $A^T$ is the orthogonal complement of the column space of $A$. This is because the equation $Ax=0$ can be understood as "the vector $x$ is orthogonal to each row of $A$", and the rows of $A$ are the columns of $A^T$; applying the same reasoning with $A^T$ in place of $A$ gives the stated relationship.

Once you understand that, it's straightforward: to have $A^T A x=0$ you have to have $Ax$ in the null space of $A^T$. But $Ax$ is in the column space of $A$, and the only vector in common between the column space of $A$ and the null space of $A^T$ is the zero vector. So you must have $Ax=0$.
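To spell out why the column space of $A$ and the null space of $A^T$ share only the zero vector, here is a short derivation. The symbol $y$ is introduced only for this illustration. Suppose $y = Ax$ for some $x$, and also $A^Ty = 0$. Then

$$\|y\|^2 = y^Ty = (Ax)^Ty = x^T(A^Ty) = x^T 0 = 0,$$

so $y = 0$. In words: $y$ lies in a subspace and in its orthogonal complement, so it is orthogonal to itself.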

A shorter but maybe less enlightening answer:

- $Ax=0 \Rightarrow A^T Ax=A^T 0 = 0$.
- $A^T A x = 0 \Rightarrow x^T A^T A x = \| Ax \|^2 = 0 \Rightarrow \| Ax \|=0 \Rightarrow Ax=0$.

That first bullet is trivial (it would work if you replaced $A^T$ with any $B$), but that second bullet has some geometric content to it.
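The identity $x^T A^T A x = \|Ax\|^2$ behind the second implication can be verified exactly on a small example. Below is a minimal pure-Python sketch. The matrix `A`, the test vectors, and the helper names (`matvec`, `transpose`, `dot`, `quadratic_form`, `squared_norm_of_image`) are all made up for illustration.

```python
# Hypothetical 3x2 integer matrix, stored as a list of rows.
A = [[1, 2],
     [2, 4],
     [3, 6]]

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def quadratic_form(M, v):
    """Return v^T (M^T M) v, computed as v . (M^T (M v))."""
    return dot(v, matvec(transpose(M), matvec(M, v)))

def squared_norm_of_image(M, v):
    """Return ||M v||^2."""
    Mv = matvec(M, v)
    return dot(Mv, Mv)

x_null = [2, -1]   # in the null space of A (second column is twice the first)
x_other = [1, 1]   # not in the null space of A

for v in (x_null, x_other):
    print(v, quadratic_form(A, v), squared_norm_of_image(A, v))
```

Both columns of the printout agree for every vector, and they vanish exactly when $Ax = 0$, which is the geometric content of the second implication.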

"null space of $A^T$ is the orthogonal complement of the null space of $A$." Do you mean the column space of $A$? — Bobbie D, Dec 07 '16 at 20:29
@BobbieD Yes, thanks for the correction. Thankfully it seems the following sentence clarified my intended meaning. — Ian, Dec 07 '16 at 20:30
