5

A student asked me about this statement, and I was stumped when trying to give any intuition for it. I searched on this site and found many proofs but little intuition, for example: Column Vectors orthogonal implies Row Vectors also orthogonal? and Intuition behind row vectors of orthonormal matrix being an orthonormal basis. I am open to abstract linear algebra ideas, but I don't think they bring in much intuition. I am hoping for either a geometric intuition, a quick algebraic manipulation on vector components, or an intuitive explanation of the key step in the proof: $A^TA = AA^T = I$.

Edit: many comments went for the last of the three options. However, it is probably the hardest one to extract intuition from. I would personally prefer answers exploring the first two options, or something really special about this last one.

anon
  • 151,657
XYSquared
  • 405
  • I'm not sure what you're looking for. $A^TA = AA^T = I$ is a quick algebraic manipulation (explicitly, the first product says that the columns of $A$ are orthonormal and the second product says that the rows of $A$ are orthonormal). Abstractly, as the link points out, this is the statement that the dual map of an orthogonal transformation is orthogonal; this is the "intuition" behind it. – Osama Ghani Aug 11 '20 at 01:01
  • 1
    You're saying we want to prove something like this: $AA^T=I\Rightarrow A^TA=I$; note that $AA^T=I\Leftrightarrow A^T=A^{-1}$, thus $A^{-1}A=I$ too. Out of curiosity, let's suppose $A$ has different left and right inverses, $A^{-1}_L$ and $A^{-1}_R$, such that $A^{-1}_LA=I$ and $AA^{-1}_R=I$; then (by associativity) $(A^{-1}_LA)A^{-1}_R=A^{-1}_L(AA^{-1}_R)$, but each parenthesis is $I$, thus $A^{-1}_R=A^{-1}_L$ @XYSquared – Alexey Burdin Aug 11 '20 at 01:36
  • When I hear questions asking for "intuition" I always reach for my yellow card. – Angina Seng Aug 11 '20 at 03:08
  • 2
    @AnginaSeng getting intuitions can sometimes be very hard, yet some people have done it. I always think of learning another's intuition as visiting another mathematician's treasure chest. It takes ingenuity and generosity for people to share their intuitions here. That's part of what this site is for, IMO. – XYSquared Aug 11 '20 at 06:06
  • I would not expect an intuitive reason for this – partly because the obvious generalisations are false. A general rectangular matrix with orthonormal rows may not have orthonormal columns or vice versa. Even if we restrict to "square matrices", i.e. linear endomorphisms of inner product spaces, the statement is only true in finite dimensions. Your observation that $A^T A = A A^T = I$ is the key step of the proof hits the nail on the head. – Zhen Lin Aug 14 '20 at 08:56

3 Answers

4

Invert both sides of $I=AA^\top$, to $I=(A^\top)^{-1}A^{-1}$. Multiply both sides on the left by $A^\top$, and on the right by $A$, to obtain $A^\top A=I$.
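
A quick numerical sanity check of this (my own addition, not part of the answer), using numpy: build a square matrix with orthonormal columns via a QR factorization and confirm that both $A^\top A$ and $AA^\top$ come out as the identity.

```python
import numpy as np

# Sketch only: check numerically that a square matrix with orthonormal
# columns automatically has orthonormal rows.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))    # Q has orthonormal columns

print(np.allclose(Q.T @ Q, np.eye(5)))   # True: columns are orthonormal
print(np.allclose(Q @ Q.T, np.eye(5)))   # True: rows are orthonormal as well
```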

paul garrett
  • 52,465
1

Let $A$ have rows $\mathbf r_1^{\mathsf T}, \dots, \mathbf r_n^{\mathsf T}$ and columns $\mathbf c_1, \dots, \mathbf c_n$.

Suppose that $\mathbf r_1, \dots, \mathbf r_n$ are orthonormal. This means that $A \mathbf r_i = \mathbf e_i$, the $i^{\text{th}}$ standard basis vector, since the $j^{\text{th}}$ entry of $A\mathbf r_i$ is $\mathbf r_j^{\mathsf T} \mathbf r_i = \delta_{ij}$. In particular, $A$ takes the orthonormal basis $\{\mathbf r_1, \dots, \mathbf r_n\}$ to the orthonormal basis $\{\mathbf e_1, \dots, \mathbf e_n\}$.

We can check that this means that $A$ preserves inner products in the sense that $\langle \mathbf x, \mathbf y\rangle = \langle A \mathbf x, A \mathbf y\rangle$; geometrically, this means that $A$ preserves angles and distances.
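
To spell out that check (my own addition, under the same setup): write $\mathbf x = \sum_i x_i \mathbf r_i$ and $\mathbf y = \sum_j y_j \mathbf r_j$ in the orthonormal basis of rows. Then $$\langle A\mathbf x, A\mathbf y\rangle = \Big\langle \sum_i x_i \mathbf e_i, \sum_j y_j \mathbf e_j\Big\rangle = \sum_i x_i y_i = \Big\langle \sum_i x_i \mathbf r_i, \sum_j y_j \mathbf r_j\Big\rangle = \langle \mathbf x, \mathbf y\rangle,$$ where the two middle equalities use the orthonormality of $\{\mathbf e_i\}$ and $\{\mathbf r_i\}$ respectively.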

This, in turn, means that $A$ takes any orthonormal basis to another orthonormal basis, because the statement "$\{\mathbf q_1, \dots, \mathbf q_n\}$ is an orthonormal basis" is just a statement about all the inner products $\langle \mathbf q_i, \mathbf q_j\rangle$, and those are preserved by $A$.

In particular, $A$ will take the orthonormal basis $\{\mathbf e_1, \dots, \mathbf e_n\}$ to another orthonormal basis. But $A\mathbf e_i = \mathbf c_i$, so this tells us that the columns of $A$ are orthonormal.

(Technically, I just explained why orthonormal rows imply orthonormal columns, but you can go from columns to rows in the same way - you'd just have to either reason about row vectors the whole time, or talk about $A^{\mathsf T}$ instead of $A$.)

Misha Lavrov
  • 142,276
0

A left inverse of a square matrix $A$ is always a right inverse as well. This is a linear algebra fact that fails for operators on infinite-dimensional spaces.

For a finite-dimensional space $X$ and a linear map $A: X \rightarrow X$, there is a minimal polynomial $m$ such that $m(A)=0$. The minimal polynomial can be normalized to be monic: $$ m(\lambda)=\lambda^m+a_{m-1}\lambda^{m-1}+\cdots+a_1\lambda+a_0. $$ If $A$ is invertible, the coefficient $a_0$ cannot be $0$, because $a_0=0$ would give $$ A(A^{m-1}+a_{m-1}A^{m-2}+\cdots+a_1 I)=0, $$ and then $\mathcal{N}(A)=\{0\}$ would force $A^{m-1}+a_{m-1}A^{m-2}+\cdots+a_1 I=0$, contradicting the minimality of the polynomial $m$. So $a_0\ne 0$ for an invertible matrix $A$, and rearranging $m(A)=0$ gives an explicit left and right inverse for $A$ (and both are the same): \begin{align} I&=-\tfrac{1}{a_0}\left(A^{m-1}+a_{m-1}A^{m-2}+\cdots+a_1I\right)A \\ &=-\tfrac{1}{a_0}A\left(A^{m-1}+a_{m-1}A^{m-2}+\cdots+a_1I\right). \end{align}
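
A small numerical sketch of this formula (my own addition, not from the answer), using numpy and the characteristic polynomial in place of the minimal polynomial: by Cayley-Hamilton the characteristic polynomial also annihilates $A$, so the same rearrangement yields a matrix that is simultaneously a left and a right inverse.

```python
import numpy as np

# Sketch only: verify the "polynomial inverse" idea numerically, with the
# characteristic polynomial standing in for the minimal polynomial.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))      # a generic 4x4 matrix is invertible

c = np.poly(A)                       # char. poly coefficients [1, c_{n-1}, ..., c_1, c_0]
n = A.shape[0]

# Build B = -(A^{n-1} + c_{n-1} A^{n-2} + ... + c_1 I) / c_0 by Horner's scheme.
B = np.zeros_like(A)
for coeff in c[:-1]:
    B = B @ A + coeff * np.eye(n)
B = -B / c[-1]

print(np.allclose(B @ A, np.eye(n)))   # True: B is a left inverse
print(np.allclose(A @ B, np.eye(n)))   # True: the same B is a right inverse
```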

A square matrix over a field $\mathrm{F}$ has a left inverse iff it has a right inverse and, in that case, the two inverses are the same. This is a consequence of working in a finite-dimensional setting.

An immediate consequence of this: If $A$ is an $n\times n$ matrix whose column vectors form an orthonormal basis, then $A^{T}A=I$, which then forces $AA^{T}=I$ and, hence, implies that the row vectors of $A$ also form an orthonormal basis.

Disintegrating By Parts
  • 87,459