I'm reading Anton and Rorres, and there is a part of a proof of a theorem that I cannot reconcile. First I will state the theorem:

If $A$ is an $n \times n$ matrix, then the following are equivalent.
a) $A$ is orthogonally diagonalizable.
b) $A$ has an orthonormal set of $n$ eigenvectors.
c) $A$ is a symmetric matrix.

It is followed by proofs. I will only state the one I'm having trouble with:
(b)=>(c) Assume that $A$ has an orthonormal set of eigenvectors $\{p_1, p_2,\dots,p_n\}$. As shown in the proof of Theorem 7.2.1, the matrix $P$ with these eigenvectors as columns diagonalizes $A$. Since these eigenvectors are orthonormal, $P$ is orthogonal and thus orthogonally diagonalizes $A$.

Now the part that I do not understand is the last sentence:

Since these eigenvectors are orthonormal, $P$ is orthogonal and thus ....

I may be realising something as I write this question, since I'm researching as I go.
This sentence is talking about the column vectors being orthonormal, but the definition of an orthogonal matrix requires its row vectors to be orthonormal as well, and that is the part I don't understand.

I just worked through a couple of examples and it does seem to hold, which brings me to my question:

If I have an $n \times n$ matrix whose columns form an orthonormal set, do the rows of that matrix also form an orthonormal set by default?

Put another way: are the rows of such a matrix necessarily an orthonormal set?
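For a concrete instance (my own example, not from the book): take the $2 \times 2$ rotation matrix
$$P=\begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix}.$$
Its columns $(\cos\theta,\sin\theta)^T$ and $(-\sin\theta,\cos\theta)^T$ are orthonormal, and its rows $(\cos\theta,-\sin\theta)$ and $(\sin\theta,\cos\theta)$ turn out to be orthonormal as well.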

Thanks.

Bucephalus
  • I think I'm looking for a proof that any $n \times n$ matrix with orthonormal columns is orthogonal. – Bucephalus Sep 25 '17 at 13:49
  • Let's say the matrix is $V=(v_1,v_2,\cdots, v_n)$ with $v_1,\cdots, v_n$ as orthonormal columns. Now $(V^TV)_{ij}=\sum_{k}V^T_{ik}V_{kj}$, which is nothing but $v_{i}^Tv_j$. Since the $v_{i}$'s are orthonormal, we get $(V^TV)_{ij}=\delta_{ij}$. Thus, $V$ is orthogonal. – sm10 Sep 25 '17 at 14:28

1 Answer

First check the definition of an orthogonal matrix against your notion of orthonormal vectors. Suppose that the columns of $P$ are orthonormal. Then compute the product $P^\top\,P$: if $A= P^\top\,P$, this means $$A_{ij}=\sum_k P^\top_{ik}\,P_{kj}=\sum_k P_{ki}\,P_{kj},$$ but this is simply the scalar product of the $i$-th and $j$-th column vectors of $P$. Since the columns of $P$ form an orthonormal set, we have $A_{ij}=\delta_{ij}$; in other words, $A=I$.
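To see this computation on a concrete matrix (a small illustration of my own, not part of the original argument), take
$$P=\frac{1}{\sqrt{2}}\begin{pmatrix}1 & 1\\ 1 & -1\end{pmatrix},$$
whose columns are orthonormal; then
$$P^\top P=\frac{1}{2}\begin{pmatrix}1 & 1\\ 1 & -1\end{pmatrix}\begin{pmatrix}1 & 1\\ 1 & -1\end{pmatrix}=\frac{1}{2}\begin{pmatrix}2 & 0\\ 0 & 2\end{pmatrix}=I.$$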

So if the columns of $P$ form an orthonormal set, then $P$ is orthogonal, meaning $P^\top\,P=I$. But this also means that $P\,P^\top=I$; see e.g. If $AB = I$ then $BA = I$.
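A sketch of why $P^\top P=I$ forces $P\,P^\top=I$ when $P$ is square (one standard argument): $\det(P^\top P)=\det(P)^2=\det(I)=1$, so $P$ is invertible; multiplying $P^\top P=I$ on the right by $P^{-1}$ gives $P^\top=P^{-1}$, and therefore $P\,P^\top=P\,P^{-1}=I$.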

In other words, $P^\top$ is also orthogonal, so the columns of $P^\top$ form an orthonormal set, and hence the rows of $P$ form an orthonormal set.
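If you want a quick numerical sanity check as well (a sketch of my own in NumPy, not part of the argument above): np.linalg.qr returns a factor Q with orthonormal columns, and both products come out as the identity.

    import numpy as np

    # Build an n x n matrix with orthonormal columns: the Q factor of a
    # QR factorization of a random square matrix has this property.
    n = 4
    rng = np.random.default_rng(0)
    P, _ = np.linalg.qr(rng.standard_normal((n, n)))

    # Orthonormal columns: P^T P = I ...
    print(np.allclose(P.T @ P, np.eye(n)))   # True

    # ... and, since P is square, P P^T = I as well,
    # i.e. the rows of P are orthonormal too.
    print(np.allclose(P @ P.T, np.eye(n)))   # True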

Miguel
  • I'm really not sure if you have answered my question or not. Maybe you have, but it is not obvious to me. If you are showing me that an orthogonal matrix has orthonormal columns and rows, this I already understand. How do you make the jump from any matrix with orthonormal columns to it being necessarily orthogonal? @Miguel – Bucephalus Sep 25 '17 at 13:42
  • So for example, we are starting with an $n \times n$ matrix with orthonormal columns, i.e. not starting with an orthogonal matrix. – Bucephalus Sep 25 '17 at 13:44
  • @Bucephalus It is the same. Edited to explain this point. – Miguel Sep 25 '17 at 13:50
  • I understand your reasoning a little better now, @Miguel. Thank you. – Bucephalus Sep 25 '17 at 14:06