Prove that the eigenvectors of a square real matrix A are orthogonal if and only if ${A^T}A=A{A^T}$

Question

Note that this is more general than the usual orthogonal matrix has orthogonal eigenvectors or symmetric matrix has orthogonal eigenvectors in that $A$ need not be orthogonal or symmetric, just square. Also this proof requires both directions ($\Leftrightarrow$).

I tried using the fact that a symmetric matrix has orthogonal eigenvectors by showing that ${A^T}A$ is a symmetric matrix and then showing that the eigenvectors of the original matrix A ($v_1$ and $v_2$) or some multiplication thereof with a matrix are eigenvectors of ${A^T}A$. To show the reverse direction is then easy.

We know that an eigenvalue of a square matrix is also an eigenvalue of its transpose and that an eigenvalue of a product of square matrices (A.B) is also an eigenvalue of the reverse product (B.A). The eigenvalues of real symmetric matrices are all real. Maybe this is not the right path, feel free to suggest another method.

This statement is mentioned page 37 of Gilbert Strang's book, Linear Algebra and Learning from Data.

After setting up the complete answer: it is also different from this in that the proof requires both directions (and here we only consider the real matrix $A$).

"that an eigenvalue of a product of square matrices is also an eigenvalue of the commutative product" -- hmm? What is "the commutative product"? — Torsten Schoeneberg, Dec 14 '22 at 17:13
If $\lambda$ is an eigenvalue of A.B then $\lambda$ is also an eigenvalue of B.A where B and A are square matrices — crogg01, Dec 14 '22 at 17:14
If $A$ has no real eigenvectors, then do we say "the eigenvectors of $A$ are orthogonal"? Or do we consider complex eigenvectors when we say that? — GEdgar, Dec 14 '22 at 17:17
Your statement (or Strang's) is slightly vague. You are assuming that $A$ is diagonalizable, i.e., that there is a basis (indeed, an orthonormal basis) consisting of eigenvectors. — Ted Shifrin, Dec 14 '22 at 17:17
The assumption likely made in the book is that A is full rank, it is for sure square and real. it can however have complex eigenvalues. — crogg01, Dec 14 '22 at 17:22
If it has complex eigenvalues, then it will have complex (non-real) eigenvectors. What does orthogonality now mean? Strang's style is very alluring, but sometimes his poetic vagueness can be annoying. — Ted Shifrin, Dec 14 '22 at 17:27
In the book he defines it as the dot product of complex conjugate of the first vector times the second, i.e. ${\bar{x}^T}.y = 0$ where both x and y are vectors in the complex plane — crogg01, Dec 14 '22 at 17:35

Balaji sb · Answer 1 · 2022-12-16T04:35:58.767

1

Let $V_j$ be eigenspace corresponding to $\lambda_j$ for the matrix A. We now have for $v_j \in V_j$, $A^TA v_j = \lambda_j A^T v_j = A A^T v_j$. Hence $A^T v_j \in V_j$. Hence $A^T V_j \subseteq V_j$. Now restrict $A,A^T$ to $V_j$. $B_j = A|_{V_j}$ and $B_j' = A^T |_{V_j}$. Now, Let $W_j$ be the matrix with orthogonal basis for $V_j$ (assuming an orthogonal basis exists). We have $W_i^T A W_i = \lambda_i I = \lambda_i I^T = (W_i^T A W_i)^T = W_i^T A^T W_i = B'_i$. Now $\lambda_i W_j^T W_i = W_j^T (A W_i) = (W_j^T A) W_i = \lambda_j W_j^T W_i$. Hence $W_j^T W_i =0$. Hence the eigenspaces $W_i$ and $W_j$ are orthogonal.

Note the assumption that orthogonal basis $W_j$ exists for $V_j$. This is the only assumption. This is true at least for real eigenvalues. So technically I proved that eigenspaces corresponding to real eigenvalues are orthogonal.

edited Dec 16 '22 at 04:35

answered Dec 16 '22 at 03:47

Balaji sb

4,357

Can you explain the transition ${\lambda_i}{W_j^T}W_i=({W_j^T}A)W_i$? If the answer is that $W_j$ is the the orthogonal basis for $V_j$, how do you know that basis is an eigenvector associated with $\lambda_j$ ($({W_j^T}A)W_i = {\lambda_j}{W_j^T}W_i$)? – crogg01 Dec 16 '22 at 04:30
$\lambda_i W_j^T W_i = W_j^T (AW_i) = (W_j^T A) W_i $. $W_i$ is assumed to be an orthogonal basis of $V_i$ which exists at least for real eigenvalues. – Balaji sb Dec 16 '22 at 04:34

score 1 · Answer 2 · answered Dec 17 '22 at 04:32

We have a Jordan Normal form.

$$A = SJS^{-1}$$

$$A^TA = (S^{-T}J^TS^T) (SJS^{-1})$$

$$AA^T = (SJS^{-1}) (S^{-T}J^TS^T)$$

The eigenvectors being orthogonal means $$SS^T = I$$

This reduces the expressions above to:

$$A^TA = (S^{-T}J^TJS^{-1})$$

$$AA^T = (SJJ^TS^T)$$

Now $J^TJ = JJ^T$ will hold for all Jordan matrices.

But $SS^T = I$ is equivalent to $S = S^{-T}$ and $S^T = S^{-1}$

and we see that we can pairwisely finish the edge factors by identification.

Now this only shows that if $SS^T= I$ holds, then $AA^T=A^TA$.

We also need to show the other way around.

score 1 · Accepted Answer · answered Dec 17 '22 at 16:44

Decided to write up a detailed answer, that does not rely on any special theorems, and only relies on basic linear algebra properties, and some simple properties of the characteristic polynomial. The goal is to prove that $A$ is unitarily diagonalizale iff $A^*A = AA^*$. I've not seen many proofs like this, so here it is.

Let $A^*A = AA^*$ for a complex matrix $A\in M_n$. Such a matrix is called normal, and the properties discussed here generalize naturally to real normal matrices, in which case we replace the conjugate transpose $^*$ by $^T$. Then:

$\|Ax\| = \|A^*x\|$.

Proof: $\|Ax\|^2 = x^*A^*Ax = x^*AA^*x = \|A^*x\|^2$.

$N(A) = N(A^*)$

Proof: If $x\in N(A)$, then $0 = \|Ax\| = \|A^*x\|\implies x\in N(A^*)$.

$R(A) \bot N(A)$ and $R(A)=R(A^*)$.

Proof: If $y \in R(A), x\in N(A)$, then $x^*y = x^*A\xi = 0$, since $x\in N(A^*)$. $R(A)=R(A^*)$ is the corollary.

Let $\lambda\in\mathbb{C}$, then $A + \lambda I$ is normal:

Proof: $(A + \lambda I)^*(A + \lambda I) = A^*A + \overline{\lambda} A+\lambda A^* + |\lambda|^2 I = (A + \lambda I)(A + \lambda I)^*$

Let $x$ be an eigenvector of $A$, i.e. $Ax = \lambda x$, then $x^*A = \lambda A$.

Proof: $N(A - \lambda I) = N(A^* - \overline{\lambda} I)$ from 2. and 4.

Let $A$ be nonzero and singular, then $U^*AU = \begin{pmatrix}B & 0\\0 & 0_{n-r}\end{pmatrix} = B \oplus 0_{n-r}$ for some unitary matrix $U$, nonsingular $B\in M_r$, and $r = \DeclareMathOperator{\rank}{rank}\rank(A)=\rank(B)$.

Proof: Let $U = \begin{pmatrix}U_1 & U_2\end{pmatrix}$ be unitary, such that the columns of $U_1$ form a basis for $R(A)$ and the columns of $U_2$ form a basis for $N(A)$. Then using 2. and 3. and a calculation it is easy to verify that $U^*AU = B\oplus 0_{n-r}$. $B\in M_r$ implies that it is nonsingular and $$r = \rank(B) = \rank(B \oplus 0_{n-r}) = \rank(U^*AU) = \rank(A).$$

Let $\lambda$ be an eigenvalue of $A$ with algebraic multiplicity $\nu$, then $\dim N(A-\lambda I) = \nu$.

Proof: From 6. we know that $U^*(A - \lambda I)U = B\oplus 0_{n-r}$, thus $U^*AU = (B + \lambda I_r)\oplus \lambda I_{n-r} = C$, where $r = \rank(A - \lambda I)$. As such, the characteristic polynomial of $A$, $p_A(t) = p_C(t)$, the char. polynomial of $C$. But $p_C(t) = (\lambda - t)^{n-r}p_B(t - \lambda)$, and since $B$ is nonsingular, it follows that $\nu = n-r=\dim N(A - \lambda I)$.

If $\lambda\neq\mu$ are distinct eigenvalues of $A$, then $N(A - \lambda I)\bot N(A - \mu I)$.

Proof: Let $x,y$ be eigenvectors of $A$, corresponding to $\lambda,\mu$, respectively. Then using 5. $\lambda x^*y = x^*Ay = \mu x^*y \implies (\lambda - \mu)x^*y = 0\implies x^*y = 0$.

If $\{\lambda_i\}$ are the $k$ distinct eigenvalues of $A$, then $\mathbb{C}^n = N(A - \lambda_1 I)\oplus\cdots\oplus N(A - \lambda_k I)$. In other words, if $U\in M_n$ has eigenvectors of $A$ as its columns, then they form an orthonormal basis for $\mathbb{C}^n$ and $U$ unitarily diagonalizes $A$.

Proof: If $\nu_i$ is the algebraic multiplicity of $\lambda_i$, then we know that $n = \nu_1+\cdots+\nu_k$. This, combined with 7. and 8. gives the final result.

Now for the $\implies$ direction, if $A$ has $n$ orthonormal eigenvectors, then $A^*A = AA^*$.

Proof: Let $U\in M_n$ have the eigenvectors of $A$ as its columns, then it is unitary, $A = U\operatorname{diag}(\lambda_1,\ldots,\lambda_n)U^* = U\Lambda U^*$, and thus $$A^*A = U\overline{\Lambda} U^*U\Lambda U^* = U\Lambda \overline{\Lambda} U^* = AA^*.$$

crogg01 · Answer 4 · 2022-12-17T17:21:51.493

In one direction ($\Rightarrow$):

Given that the eigenvectors of $A$ are orthogonal $AQ = Q{\Lambda} \Rightarrow A = Q{\Lambda}{Q^{-1}} = Q{\Lambda}{Q^*}$. Even though this equality holds for every symmetric matrix, it does not mean that A is symmetric. Then $A{A^T} = Q{\Lambda}{Q^*}Q{\Lambda}{Q^*} = Q{\Lambda^2}{Q^*} = {A^T}A$

In the other direction (I got the answer from here from @V.S.e.H) ($\Leftarrow$):

Given that $A$ is a normal matrix ($AA^T=A^TA$) with $v_i$ an eigenvector of $A$ with eigenvalue $\lambda_i$ then $Av_i=\lambda_iv_i \Rightarrow (A-\lambda_iI)v_i = 0$ means that all eigenvectors of $A$ are in the null space of $A-\lambda_iI$.

$A-\lambda_iI$ is a normal matrix (easy to prove by working out the algebra and keeping in mind that $\lambda_i$ is a scalar and that $A$ is normal too). If $B$ is a normal matrix then the null space of $B$ equals the null space of $B^T$.

Quick proof: if $x$ is in the null space of $B$ then $B^TBx=0=BB^Tx$, which means that $x$ is also in the null space of $B^T$ and the same argument can be made starting from $x$ in the $B^T$ null space.

This means that all eigenvectors of A are also in the null space of $A^T-\lambda_iI$ with $(A^T-\lambda_iI)v_i=0$. Therefore $Av_i=\lambda_iv_i$ and also $A^Tv_i=\lambda_iv_i$ for all $\lambda$ and $v$.

If we then take the scalar $v_i^TA^Tv_j$ where $i \neq j$ evaluating the first or last two factors first gives us the equality $\lambda_i v_i^T v_j = \lambda_j v_i^T v_j$ with $\lambda_i \neq \lambda_j$ means the eigenvectors are orthogonal.

The last addition is not correct ;) If $A$ is real, $AA^T$ has nonnegative eigenvalues, since $u^TAA^Tu=||A^Tu||^2\ge0$ for all real $u$. $AA^T$ is symmetric semi-definite positive. (but not all symmetric matrices are semi-definite positive) — Jean-Claude Arbaut, Dec 16 '22 at 01:29
$A$ may be real square with complex eigenvalues, but $AA^T$ is anyway symmetric semi-definite positive. You might be confusing with the eigenvalues of $A^2$, which are the squares of the eigenvalues of $A$ (and thus may be negative or even complex). — Jean-Claude Arbaut, Dec 16 '22 at 01:38
It also holds if $A$ is non-square. This only depends on $A$ being real, as $v^Tv=||v||^2$ for all vector $v$, and $A^Tu=v$ is a vector. — Jean-Claude Arbaut, Dec 16 '22 at 06:15
you wrote "$A = Q{\Lambda}{Q^{-1}} = Q{\Lambda}{Q^T}$. Even though this equality holds for every symmetric matrix, it does not mean that A is symmetric" but $\big(Q{\Lambda}{Q^T}\big)^T=Q{\Lambda}{Q^T}$ so it quite literally does mean $A$ is symmetric. The issue as you say in the above comment seems to be that you think "Q is an orthogonal matrix" which need not be true -- in general $Q\in U_n(\mathbb C)$ and you should write $A = Q{\Lambda}{Q^{-1}} = Q{\Lambda}{Q^*}$ — user8675309, Dec 17 '22 at 17:11

Prove that the eigenvectors of a square real matrix A are orthogonal if and only if ${A^T}A=A{A^T}$

4 Answers4