Apparently, for a projection matrix $P := X(X^TX)^{-1}X^T$, $$\text{rank}(P)= \text{rank}(X)$$ How can this property be proved?
-
Please provide additional context, which ideally explains why the question is relevant to you and our community. Some forms of context include: background and motivation, relevant definitions, source, possible strategies, your current progress, why the question is interesting or important, etc. – Physical Mathematics Sep 07 '20 at 20:22
-
Why is $X^TX$ invertible? If $X$ has rank $r$, then $X^TX$ has rank at most $r$, which can't be maximal unless the domain of $X$ is $r$-dimensional, i.e. $X$ has full column rank. – Physical Mathematics Sep 07 '20 at 20:25
-
@Keefer: For any $A\in\mathbb{R}^{m\times n}$ we have $\text{rank}(A)=\text{rank}(A^TA)$. See here: https://math.stackexchange.com/questions/349738/prove-operatornamerankata-operatornameranka-for-any-a-in-m-m-times-n – Matthew H. Sep 07 '20 at 20:45
-
$X^TX$ is invertible when restricted to $\operatorname{im}(X^T)$. – Levent Sep 07 '20 at 20:50
-
You forgot to mention that $X$ is tall and has full column rank. – Rodrigo de Azevedo Sep 07 '20 at 23:59
2 Answers
One issue is that the field is not specified, but since this is the hat matrix from statistics we may infer the field is $\mathbb{R}$, and we should also infer that $X$ is injective (all columns linearly independent), so that $X^TX$ is invertible.
A nice way of proving this: observe that $P^2=P$ implies $P$ is diagonalizable with all eigenvalues being $0$ or $1$, since its minimal polynomial divides $x^2-x=x(x-1)$, which has distinct roots.
Alternatively, if OP wants to avoid minimal polynomials, we can observe that $P=P^T$, so $P$ is real symmetric and hence diagonalizable, and for an eigenvector $\mathbf x$ with eigenvalue $\lambda$,
$\lambda^2 \mathbf x = P^2\mathbf x = P\mathbf x = \lambda \mathbf x\implies \lambda^2-\lambda = 0$ i.e. $\lambda \in\{0,1\}$
Thus we have $$ \begin{align} \text{rank}\big(P\big) &=\text{trace}\big(P\big) \\ &= \text{trace}\big(X (X^TX)^{-1}X^T\big) \\ &=\text{trace}\big(X^TX (X^TX)^{-1}\big) \\ &=\text{trace}\big(I_r\big) \\ &= r \\ &=\text{rank}\big(I_r\big) \\ &=\text{rank}\big(X^TX (X^TX)^{-1}\big) \\ &=\text{rank}\big(X^TX \big) \\ &= \text{rank}\big(X \big) \\ \end{align} $$
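As a quick numerical sanity check of the chain above (a sketch, not a proof), here is a short NumPy snippet using an arbitrary tall matrix $X$ with full column rank; the matrix sizes are chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 8, 3
X = rng.standard_normal((n, r))       # tall matrix; generically full column rank

# the projection (hat) matrix P = X (X^T X)^{-1} X^T
P = X @ np.linalg.inv(X.T @ X) @ X.T

assert np.allclose(P @ P, P)          # idempotent: P^2 = P
assert np.allclose(P, P.T)            # symmetric: P = P^T

# eigenvalues of P are 0 (with multiplicity n - r) and 1 (with multiplicity r)
eigvals = np.linalg.eigvalsh(P)
assert np.allclose(np.sort(eigvals), [0] * (n - r) + [1] * r)

assert np.isclose(np.trace(P), r)     # trace(P) = r
assert np.linalg.matrix_rank(P) == np.linalg.matrix_rank(X) == r
```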

-
Ah yes, this proof makes use of the fact that the eigenvalues of $P$ are only $0$ or $1$ and, further, that the trace of a matrix equals the sum of its eigenvalues. A couple of questions: (1) Why does $P^2 = P$ imply $P$ is diagonalizable? How does this show that $P$ has linearly independent eigenvectors, a necessary and sufficient condition for the spectral decomposition? (2) Also, why is $\text{rank}(X^TX) = \text{rank}(X)$? I've used this property before, but is there a simple justification for it? – 24n8 Sep 07 '20 at 21:17
-
Any idempotent $P$ is diagonalizable over any field because it is annihilated by $x^2-x=x(x-1)$, which has distinct roots. But if you don't know what a minimal polynomial is, then the expedient move is to use the fact that in your case $P=P^T$, so being real symmetric its eigenvectors form a basis. As for $\text{rank}\big(X^TX\big) = \text{rank}\big(X\big)$ over the reals – for a crude and easy proof, use the SVD or polar decomposition. You can actually prove $\text{rank}(X) = r \implies \text{rank}\big(X(X^TX)^{-1}X^T\big) = r$ 'only' using the SVD (exercise), though using idempotence is nicer. – user8675309 Sep 07 '20 at 21:22
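The SVD route for $\text{rank}(A^TA)=\text{rank}(A)$ is also easy to check numerically. A minimal NumPy sketch (the matrix $A$ here is arbitrary and deliberately made rank-deficient for illustration): the nonzero singular values of $A^TA$ are exactly the squares of the nonzero singular values of $A$, so the two ranks coincide.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))
A[:, 3] = A[:, 0] + A[:, 1]          # force a dependent column, so rank(A) = 3

# singular values of A and of A^T A
s_A = np.linalg.svd(A, compute_uv=False)
s_AtA = np.linalg.svd(A.T @ A, compute_uv=False)

# singular values of A^T A are the squares of those of A
assert np.allclose(np.sort(s_AtA), np.sort(s_A**2))

# hence the number of nonzero singular values -- the rank -- agrees
assert np.linalg.matrix_rank(A.T @ A) == np.linalg.matrix_rank(A) == 3
```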
-
Does your proof assume $X$ has full column rank? $X^TX$ wouldn't be invertible otherwise. – 24n8 Sep 07 '20 at 23:58
-
I explicitly said "we should also infer that $X$ is injective (all columns linearly independent)" to address this... – user8675309 Sep 08 '20 at 02:42
-
Given a linear map $f:V\rightarrow W$, we can compose $f$ with a quotient map $q:W\rightarrow \operatorname{im}(f)\subseteq W$. Then it is clear that $q$ has the same rank as $f$, since $\operatorname{im}(f)=\operatorname{im}(q)$. There are many maps $q$ with the property that $\operatorname{im}(f)=\operatorname{im}(q)$. Can you show that $q=X\big((X^TX)\mid_{\operatorname{im}(X^T)}\big)^{-1}X^T$ satisfies this property with $f=X$?
