
Apparently it is always true that $A \cdot A^T=A^T \cdot A=I$ for an orthonormal matrix $A$. I know this can be proven using theorems regarding the rank and invertibility of matrices, but I would like to show it directly with matrix multiplication.

Suppose that $A = \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \\ \end{bmatrix} $ is an orthonormal matrix. Then we know that $\begin{bmatrix} u_1\\u_2 \end{bmatrix}^T \begin{bmatrix} u_1\\u_2 \end{bmatrix}=1$ and $\begin{bmatrix} v_1\\v_2 \end{bmatrix}^T \begin{bmatrix} v_1\\v_2 \end{bmatrix}=1$, and also that $\begin{bmatrix} u_1\\u_2 \end{bmatrix}^T \begin{bmatrix} v_1\\v_2 \end{bmatrix}=0$.

However, we must have that $A \cdot A^T = \begin{bmatrix} u_1^2+v_1^2 & u_1u_2+v_1v_2 \\ u_1u_2+v_1v_2 & u_2^2+v_2^2 \\ \end{bmatrix} $, and I do not see why $u_1^2+v_1^2 =1$ or why $u_1u_2+v_1v_2=0.$ It seems like $A \cdot A^T = I$ is not necessarily true.

Where is my mistake?
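
For a quick sanity check, here is a concrete orthonormal matrix tested numerically (a minimal sketch using numpy, with a rotation matrix as the example); the entries do come out as expected, so the gap seems to be in my algebra rather than in the claim:

```python
import numpy as np

# One concrete orthonormal matrix: a rotation. Its columns u and v are
# unit vectors and orthogonal to each other.
t = 0.7
u = np.array([np.cos(t), np.sin(t)])
v = np.array([-np.sin(t), np.cos(t)])
A = np.column_stack([u, v])

print(np.allclose(A @ A.T, np.eye(2)))  # True
print(u[0]**2 + v[0]**2)                # 1.0, the (1,1) entry I was unsure about
print(u[0]*u[1] + v[0]*v[1])            # ~0,  the (1,2) entry I was unsure about
```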

  • Are you sure you did that matrix multiplication correctly? – Riemann'sPointyNose Jan 23 '21 at 03:29
  • @Riemann'sPointyNose I am now yes. (There was a typo where my transposes were switched around. It is fixed now and my calculations seem to be correct) – Zack Helms Jan 23 '21 at 03:33
  • The definition of an orthonormal matrix states that its columns and its rows are orthonormal vectors. – VHarisop Jan 23 '21 at 03:38
  • @VHarisop Does that mean in this case $[u_1 \ v_1]$ and $[u_2 \ v_2]$ are unit vectors orthogonal to each other? Also, does orthonormal columns imply rows and vice versa, or can we have one without necessarily having the other? – Zack Helms Jan 23 '21 at 04:44
  • Yes, these are unit vectors orthogonal to each other. For your other question: recall that orthonormal columns imply orthonormal rows and vice versa when your matrix is square, but not otherwise. In your case you have a square matrix. – VHarisop Jan 24 '21 at 05:10
  • I assume by “showing” you don’t mean prove by example… proofs by example are not proofs. We knew that. – user10101 Oct 28 '21 at 03:48

3 Answers


Per VHarisop's comment, orthonormal columns imply orthonormal rows, implying the extra relations $$\begin{bmatrix}u_1\\v_1\end{bmatrix}^T\begin{bmatrix}u_1\\v_1\end{bmatrix}=1$$ and so on. This should allow you to get $I$.
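
Concretely, those row relations are exactly the entries of $AA^T$ that looked problematic: $$(AA^T)_{11} = u_1^2+v_1^2 = \begin{bmatrix}u_1\\v_1\end{bmatrix}^T\begin{bmatrix}u_1\\v_1\end{bmatrix} = 1, \qquad (AA^T)_{12} = u_1u_2+v_1v_2 = \begin{bmatrix}u_1\\v_1\end{bmatrix}^T\begin{bmatrix}u_2\\v_2\end{bmatrix} = 0.$$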

Parcly Taxel

I think that the answers/responses given do not exactly answer the question asked. The question seems to have been asked specifically to try to build a concrete understanding of why "orthonormal columns implies orthonormal rows". While this desire probably comes from noble intentions, it is also likely built upon the assumption that the result is obvious, even without using the powerful tools of linear algebra.

However, even in the 2x2 case (as originally asked), if we tie our hands behind our backs and restrict ourselves only to tools that came before linear algebra, it turns out to be a rather complicated thing to show. But I'll include an uglier and weaker answer than those already provided, to emphasize the value of learning linear algebra and how non-trivial the results from linear algebra are, even the "simple" ones.

To answer the question:

Fix orthonormal vectors $u,v \in \mathbb{R}^{2}$. In particular, $|u| = |v| = 1$. So, we know that there exist $\theta, \alpha$ so that $u = (\cos(\theta), \sin(\theta))$ and $v= (\cos(\alpha), \sin(\alpha))$. Moreover, since $u \cdot v = \cos(\theta)\cos(\alpha) + \sin(\theta)\sin(\alpha) = \cos(\theta - \alpha) = 0$, without loss of generality (by replacing $u$ with $-u$ if necessary) we also know that $\theta = \alpha \pm \pi/2$.

Using the identity $\cos(x \pm \pi/2) = \mp \sin(x)$ we deduce that $u_{1}^{2} + v_{1}^{2} = (\mp \sin(\alpha))^{2} + \cos^{2}(\alpha) = 1$ as desired. One can show identically that $u_{2}^{2} + v_{2}^{2} =1$.

Since $\theta = \alpha \pm \pi/2$ we have $\alpha = \theta \mp \pi/2$. So, $\cos(\alpha) = (\pm \sin(\theta))$. Hence, \begin{align*} u_{1} u_{2} + v_{1} v_{2} &= \cos(\theta) \sin(\theta) + \cos(\alpha) \sin(\alpha) \\ & = (\mp \sin(\alpha)) \sin(\theta) + \cos(\alpha) \sin(\alpha) \\ & = \sin(\alpha) \left( \cos(\alpha) \mp \sin(\theta) \right) \\ & = 0, \end{align*} painfully verifying that $A A^{T} =I$.
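
For anyone who wants to double-check the trigonometric bookkeeping above, a short symbolic computation (a sketch using sympy, with $v$ taken to be $u$ rotated by $+\pi/2$; the other rotation direction works identically) confirms that both products simplify to the identity:

```python
import sympy as sp

theta = sp.symbols('theta', real=True)
# u = (cos(theta), sin(theta)) and v = u rotated by +pi/2,
# i.e. v = (-sin(theta), cos(theta)), as columns of A.
A = sp.Matrix([[sp.cos(theta), -sp.sin(theta)],
               [sp.sin(theta),  sp.cos(theta)]])

print((A * A.T).applyfunc(sp.simplify))  # Matrix([[1, 0], [0, 1]])
print((A.T * A).applyfunc(sp.simplify))  # Matrix([[1, 0], [0, 1]])
```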

Good luck generalizing this to any higher-dimensional cases.

mlg4080

Assume that the columns of $A$ are orthonormal, meaning $$\sum_{i=1}^n A_{ij}A_{ik} = \delta_{jk}$$ for all $1 \le j,k \le n$.

Then for all $1 \le j,k \le n$ we have $$(A^TA)_{jk}= \sum_{i=1}^n (A^T)_{ji}A_{ik}= \sum_{i=1}^n A_{ij}A_{ik} = \delta_{jk} = I_{jk}$$ so $A^TA = I$. Since a left inverse of a square matrix is also a right inverse, this implies $AA^T = I$ as well, so for all $1 \le j,k \le n$ we have $$\delta_{jk}=I_{jk} = (AA^T)_{jk}= \sum_{i=1}^n A_{ji}(A^T)_{ik}= \sum_{i=1}^n A_{ji}A_{ki}$$ meaning that the rows are orthonormal.
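
As a numerical illustration of the same computation for a general $n$ (a minimal sketch using numpy; the matrix with orthonormal columns is produced by a QR factorization):

```python
import numpy as np

# Build a square matrix Q with orthonormal columns via a QR factorization
# of a random matrix, then check both products against the identity.
rng = np.random.default_rng(0)
n = 4
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))

print(np.allclose(Q.T @ Q, np.eye(n)))  # True: the columns are orthonormal
print(np.allclose(Q @ Q.T, np.eye(n)))  # True: the rows are orthonormal as well
```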

mechanodroid