
How can I prove $\operatorname{rank}A^TA=\operatorname{rank}A$ for any $A\in M_{m \times n}$?

This is an exercise in my textbook associated with orthogonal projections and the Gram-Schmidt process, but I am unsure how they are relevant.

jaynp

4 Answers


Let $\mathbf{x} \in N(A)$ where $N(A)$ is the null space of $A$.

So, $$\begin{align} A\mathbf{x} &=\mathbf{0} \\\implies A^TA\mathbf{x} &=\mathbf{0} \\\implies \mathbf{x} &\in N(A^TA) \end{align}$$ Hence $N(A) \subseteq N(A^TA)$.

Again let $\mathbf{x} \in N(A^TA)$

So, $$\begin{align} A^TA\mathbf{x} &=\mathbf{0} \\\implies \mathbf{x}^TA^TA\mathbf{x} &=0 \\\implies (A\mathbf{x})^T(A\mathbf{x})&=0 \\\implies \|A\mathbf{x}\|^2&=0 \\\implies A\mathbf{x}&=\mathbf{0}\\\implies \mathbf{x} &\in N(A) \end{align}$$ Hence $N(A^TA) \subseteq N(A)$. (The step $\|A\mathbf{x}\|^2 = 0 \implies A\mathbf{x} = \mathbf{0}$ is where we use that the entries are real.)

Therefore $$\begin{align} N(A^TA) &= N(A)\\ \implies \dim(N(A^TA)) &= \dim(N(A))\\ \implies \text{rank}(A^TA) &= \text{rank}(A),\end{align}$$ where the last step uses the rank–nullity theorem and the fact that $A$ and $A^TA$ both have $n$ columns.
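
As a quick numerical sanity check of this conclusion, here is a minimal NumPy sketch (the matrix sizes and seed are arbitrary choices):

```python
import numpy as np

# Sanity check: rank(A^T A) == rank(A) for a few random real matrices,
# including rank-deficient ones.
rng = np.random.default_rng(0)

for m, n in [(5, 3), (3, 5), (6, 6)]:
    A = rng.standard_normal((m, n))
    A[:, -1] = A[:, 0]                       # force a dependent column
    r_A = np.linalg.matrix_rank(A)
    r_AtA = np.linalg.matrix_rank(A.T @ A)
    assert r_A == r_AtA                      # the two ranks agree
    print((m, n), r_A, r_AtA)
```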

Empiricist
A.D

Let $r$ be the rank of $A \in \mathbb{R}^{m \times n}$. We then have the (thin) SVD of $A$, $$A_{m \times n} = U_{m \times r} \Sigma_{r \times r} V^T_{r \times n},$$ where $U$ and $V$ have orthonormal columns. Since $U^TU = I_r$, this gives $$A^TA = V_{n \times r} \Sigma_{r \times r}^2 V^T_{r \times n},$$ which is precisely an SVD of $A^TA$. From this it is clear that $A^TA$ also has rank $r$. In fact, the singular values of $A^TA$ are the squares of the singular values of $A$.
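
As a numerical illustration, here is a minimal NumPy sketch with an arbitrary low-rank test matrix: the singular values of $A^TA$ come out as the squares of those of $A$, so the number of nonzero ones, i.e. the rank, is the same.

```python
import numpy as np

rng = np.random.default_rng(1)

# A 5x4 matrix of rank 2, built as a product of thin factors.
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))

s_A = np.linalg.svd(A, compute_uv=False)          # singular values of A
s_AtA = np.linalg.svd(A.T @ A, compute_uv=False)  # singular values of A^T A

assert np.allclose(s_A**2, s_AtA)                 # squares, entrywise
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(A.T @ A))  # 2 2
```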

  • Note that Strang's textbook actually uses the fact that $A^TA$ has $r$ nonzero eigenvalues, i.e. $\operatorname{rank}(A^TA)=\operatorname{rank}(A)$, to decide the size of $\Sigma_{r \times r}$ and to prove the SVD. To avoid a circular argument here, a different proof of the SVD would be required. – Weishi Z Jan 06 '21 at 12:03
  • This proof is not circular: the SVD of $A$ can be proved without using any information about $A^TA$. – Kuo Mar 15 '24 at 16:06

Since elementary operations do not change the rank of a matrix, we have $\text{rank}(A^TA) = \text{rank}(E^TA^TAE)$, where $E$ is a product of elementary matrices chosen so that $AE = [A_1, A_2]$, with $A_1$ a matrix of full column rank satisfying $\text{rank}(A_1) = \text{rank}(A)$.

Thus we can find a matrix $P$ such that $A_1P = A_2$ (every column of $A_2$ lies in the column space of $A_1$), and so $AE = [A_1, A_1P] = A_1[I, P]$.

Thus $\text{rank}(E^TA^TAE) = \text{rank}\big((A_1[I, P])^T(A_1[I, P])\big)$. In this last expression $A_1$ has full column rank (so, over the reals, $A_1^TA_1$ is invertible) and $[I, P]$ has full row rank, hence the rank equals $\text{rank}(A_1) = \text{rank}(A)$. Therefore, over the real field, $\text{rank}(A^TA) = \text{rank}(A)$, completing the proof.
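
To make the last step concrete, here is a minimal NumPy sketch (the blocks $A_1$ and $P$ and their sizes are arbitrary choices): it builds a matrix already in the form $A_1[I, P]$ and checks that $\operatorname{rank}(A^TA) = \operatorname{rank}(A_1) = \operatorname{rank}(A)$.

```python
import numpy as np

rng = np.random.default_rng(2)

# A1: 6x2 with full column rank (random Gaussian columns are independent
# with probability 1); P: an arbitrary 2x3 block.
A1 = rng.standard_normal((6, 2))
P = rng.standard_normal((2, 3))

# A matrix of the form A1 [I, P], which is what AE reduces to above.
A = A1 @ np.hstack([np.eye(2), P])

print(np.linalg.matrix_rank(A1))        # 2
print(np.linalg.matrix_rank(A))         # 2
print(np.linalg.matrix_rank(A.T @ A))   # 2, equal to rank(A)
```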

user26857
  • I cannot decipher what is said here, but it must be wrong since it never uses that the matrices are over $\Bbb R$ (or more generally an ordered field) rather than for instance over $\Bbb C$ where the result is not true. – Marc van Leeuwen Apr 27 '16 at 14:38
  • The last theorem actually implicitly uses that they are over the real field. Thank you for pointing that out. I have added that prerequisite into my answer. – Xiangru Lian Apr 30 '16 at 04:47
  • An alternative simple way to see it: Matrix $A^T$ may be reduced to its reduced row-echelon form, $R$, by $PA^T=R$, where $P$ is the product of a sequence of elementary matrices. So, $A^T=P^{-1}R$ and hence $$\mathrm{rank}(A^TA)=\mathrm{rank}(P^{-1}RR^T(P^{-1})^T)=\mathrm{rank}(RR^T).$$ The result then follows easily from this, since clearly $\mathrm{rank}(RR^T)=\mathrm{rank}(R)=\mathrm{rank}(A).$ – syeh_106 Jan 14 '17 at 02:07
  • @syeh_106 Why is it that $\operatorname{rank}(RR^T)=\operatorname{rank}(R)$? I'm sorry if this is too basic. – JPYamamoto Aug 17 '20 at 19:08
  • @JPYamamoto If $R^T$ is full column rank, this is clearly true: $RR^Tx=0 \Rightarrow x^TRR^Tx=\Vert R^Tx\Vert^2=0 \Rightarrow R^Tx = 0 \Rightarrow x = 0$, i.e. $RR^T$ is still full column rank. Otherwise, $R^T= [R_1^T, 0]$, where $R_1^T$ is full column rank, and it's easily verified that $\mathrm{rank}(RR^T)=\mathrm{rank}(R_1R_1^T)=\mathrm{rank}(R_1)=\mathrm{rank}(R).$ – syeh_106 Aug 19 '20 at 04:33

The question mentions the Gram-Schmidt process, so here's an answer using it.

Pick an orthonormal basis $\{B v_1, \dots, B v_r\}$ of $\operatorname{im} B$, where $r = \operatorname{rank} B$, using the Gram-Schmidt process (applying it to the columns of $B$ and discarding zero vectors; each output is a linear combination of columns of $B$, hence of the form $B v_i$). We claim $\{B^T B v_1, \dots, B^T B v_r\}$ is a basis of $\operatorname{im} B^T B$. It clearly spans $\operatorname{im} B^T B$ (any $B^T B x$ has $Bx \in \operatorname{span}\{Bv_i\}$), so we just need linear independence.

Suppose $\sum_i a_i B^T B v_i = 0$. Then for any $k$, $0 = \langle \sum_i a_i B^T B v_i, v_k \rangle = \sum_i a_i \langle B^T B v_i, v_k \rangle = \sum_i a_i \langle B v_i, B v_k \rangle = a_k$, since the $B v_i$ are orthonormal. Hence $a_k = 0$ for all $k$, so $\operatorname{rank}(B^TB) = \dim \operatorname{im} B^T B = r = \operatorname{rank} B$.
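
As a numerical companion, here is a minimal NumPy sketch; it uses the SVD of $B$ (rather than an explicit Gram-Schmidt run) to produce vectors $v_i$ with $\{Bv_i\}$ orthonormal, then checks that the vectors $B^TBv_i$ are linearly independent.

```python
import numpy as np

rng = np.random.default_rng(3)

# A 5x4 matrix B of rank 2.
B = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))
r = np.linalg.matrix_rank(B)

# From B = U diag(S) V^T, the vectors v_i = (i-th right singular vector)/S[i]
# satisfy B v_i = u_i, so {B v_1, ..., B v_r} is an orthonormal basis of im B.
U, S, Vt = np.linalg.svd(B)
V_cols = Vt[:r].T / S[:r]               # columns are v_1, ..., v_r

BV = B @ V_cols
assert np.allclose(BV.T @ BV, np.eye(r))            # {B v_i} is orthonormal

# The claimed basis of im(B^T B): the vectors B^T B v_i are independent,
# so rank(B^T B) = r = rank(B).
BtBV = B.T @ B @ V_cols
assert np.linalg.matrix_rank(BtBV) == r
print(r, np.linalg.matrix_rank(B.T @ B))            # both equal 2
```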

sdcvvc