7

I am trying to solve Exercise 16 from Section 6.B of the third edition of Linear Algebra Done Right by Axler.

Suppose $ \mathbf{F} = \mathbf{C}, V $ is finite-dimensional, $ T \in \mathcal{L}(V) $, all the eigenvalues of $ T $ have absolute value less than $1$, and $ \epsilon > 0 $. Prove that there exists a positive integer $ m $ such that $ \lVert T^m v \rVert \leq \epsilon \lVert v \rVert $ for every $ v \in V $.

($V$ is an inner product space.)

I think this is a quick corollary of the implication $\rho(T) < 1 \implies \lim_{m \to \infty} \lVert T^m \rVert = 0 $. However, at this point in the textbook we have discussed neither spectral radius nor operator norm; for this reason I think Mr. Axler had another approach in mind.
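(For what it's worth, that implication is easy to check numerically. Here is a throwaway numpy sketch, with an arbitrary random matrix rescaled to spectral radius $0.9$; none of this is meant as a proof:)

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Arbitrary matrix, rescaled so that all eigenvalues have modulus < 1.
A = rng.standard_normal((n, n))
A *= 0.9 / np.abs(np.linalg.eigvals(A)).max()   # spectral radius is now 0.9

for m in (1, 10, 50, 200):
    # Induced 2-norm of A^m; it can exceed 1 at first but eventually tends to 0.
    print(m, np.linalg.norm(np.linalg.matrix_power(A, m), 2))
```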

The title of Section 6.B is 'Orthonormal Bases', so I tried to look for a solution which only uses basic properties of orthonormal bases. Since $V$ is a finite-dimensional complex vector space, Schur decomposition (Theorem 6.38 in the text) ensures there is an orthonormal basis of $V$ with respect to which $T$ has an upper-triangular matrix. I managed to prove the result in the special case where this matrix is diagonal and also in the special case where $\dim V = 2$, but I haven't been able to prove the general case.

Can anyone see an 'orthonormal basis approach'?

I did find this post. Unfortunately, I think there is a mistake in that answer. At least, I can't see how the equality $$ \sum_{k=1}^n \langle T^j(v), e_k \rangle e_k = \sum_{k=1}^n a_k \lambda_k^j e_k $$ follows from $\langle T^j(e_k), e_k \rangle = \lambda_k^j$.

terran
  • 1,380
  • I have given a very short proof in another thread, but that involves the notions of spectral radius, operator norm as well as the equivalence of norms in a finite-dimensional vector space. – user1551 Aug 26 '23 at 17:33
  • @user1551 It seems all the proofs I can find of this result rely on such notions. I really wonder what solution Axler intended here. I will work on it further tomorrow. – terran Aug 26 '23 at 17:51
  • @terran One approach is to see that if you have an upper triangular decomposition (such as Schur), you can choose a basis such that the resulting matrix is 'almost' diagonal. The spectral radius of an upper triangular matrix is just the $\max$ absolute value of the diagonal elements. – copper.hat Aug 27 '23 at 18:30

3 Answers

5

Here is one possible approach that does not directly involve the notions of operator norm and equivalence of norms. First of all, in $\mathbb C^n$ equipped with the Euclidean norm, we have the following observations:

  1. The product of a diagonal matrix and a strictly upper triangular matrix is strictly upper triangular, regardless of the order of multiplication.
  2. The product of $n$ strictly upper triangular $n\times n$ matrices is always zero.
  3. For any $n\times n$ complex matrix $A$, there exists a constant $C$ (that depends on $A$ and $n$) such that $\|Ax\|\le C\|x\|$ for all $x\in\mathbb C^n$. We don’t need any topological argument to justify this. In fact, if we denote by $|A|,|x|$ the entrywise absolute values of $A$ and $x$ and denote by $e$ the all-one vector, we have $$ \|Ax\|\le\big\||A||x|\big\|\le\big\||A|\big(\|x\|e\big)\big\| \le\left(\max_i\sum_j|a_{ij}|\right)\sqrt{n}\|x\|. $$
  4. If $D\in\mathbb C^{n\times n}$ is a diagonal matrix and $x\in\mathbb C^n$, then $\|Dx\|\le\left(\max_i|d_{ii}|\right)\|x\|$.
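As a quick numerical sanity check of these four observations (a numpy sketch with arbitrary random matrices and seed; not needed for the proof):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

D = np.diag(rng.uniform(-0.9, 0.9, size=n))      # diagonal matrix
F = np.triu(rng.standard_normal((n, n)), k=1)    # strictly upper triangular

# Observation 1: D @ F and F @ D are strictly upper triangular.
assert np.allclose(np.tril(D @ F), 0) and np.allclose(np.tril(F @ D), 0)

# Observation 2: the product of n strictly upper triangular matrices is 0.
assert np.allclose(np.linalg.matrix_power(F, n), 0)

# Observation 3: ||Ax|| <= C ||x|| with C = (max_i sum_j |a_ij|) * sqrt(n).
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
C = np.abs(A).sum(axis=1).max() * np.sqrt(n)
assert np.linalg.norm(A @ x) <= C * np.linalg.norm(x)

# Observation 4: ||Dx|| <= (max_i |d_ii|) ||x||.
assert np.linalg.norm(D @ x) <= np.abs(np.diag(D)).max() * np.linalg.norm(x)
```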

Now return to your question. By picking a suitable orthonormal basis, we may identify $T$ with an upper triangular matrix whose diagonal elements have moduli $<1$, $x$ with a vector in $\mathbb C^n$, and the norm $\|\cdot\|$ on $V$ with the Euclidean norm on $\mathbb C^n$.

Let $D$ and $F$ be respectively the diagonal and off-diagonal parts of the matrix. Since $T$ is upper triangular, $F$ is strictly upper triangular. Also, by assumption, we have $\rho:=\max_i|d_{ii}|<1$. Let $C>0$ be a constant such that $\|Fx\|\le C\|x\|$ for all $x\in\mathbb C^n$ (observation 3 provides one). Expand $(D+F)^m$ into a sum of words in $D$ and $F$: by observations 1 and 2, every word containing at least $n$ factors of $F$ is zero, while observations 3 and 4 bound each of the $\binom mk$ words with exactly $k$ factors of $F$ by $C^k\rho^{m-k}\|x\|$. Hence, when $m\ge n$, $$ \|T^mx\|=\|(D+F)^mx\|\le\sum_{k=0}^{n-1}\binom{m}{k}C^k\rho^{m-k}\|x\|.\tag{1} $$

To illustrate, suppose $n=2$ and $m=3$. Then $DFF=FDF=FFD=F^3=0$ by observations 1 and 2. Therefore $$ \begin{aligned} \|T^3x\| &=\|(D+F)^3x\|\\ &=\|(D^3+DDF+DFD+FDD+DFF+FDF+FFD+F^3)x\|\\ &=\|(D^3+DDF+DFD+FDD)x\|\\ &\le\|D(D(Dx))\|+\|D(D(Fx))\|+\|D(F(Dx))\|+\|F(D(Dx))\|\\ &\le\rho(\rho(\rho\|x\|))+\rho(\rho(C\|x\|))+\rho(C(\rho\|x\|))+C(\rho(\rho\|x\|))\\ &=\sum_{k=0}^{1}\binom{3}{k}C^k\rho^{3-k}\|x\|. \end{aligned} $$

Now, for each fixed integer $k\ge0$ we have $\lim_{m\to\infty}\binom mk\rho^{m-k}=0$, so the coefficient of $\|x\|$ on the right-hand side of $(1)$ tends to $0$ as $m\to\infty$. Since that coefficient does not depend on $x$, we may choose $m$ so large that $\|T^mx\|\le\epsilon\|x\|$ for every $x$. Hence the result follows.
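As a numerical sanity check of $(1)$ (not part of the proof), here is a short numpy sketch; the matrix, seed, and test vector are arbitrary, and $C$ is the explicit constant from observation 3:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(1)
n = 4

# Toy upper triangular T whose diagonal entries have modulus < 1.
T = np.triu(rng.standard_normal((n, n)), k=1) + np.diag(rng.uniform(-0.8, 0.8, n))
D, F = np.diag(np.diag(T)), np.triu(T, k=1)    # diagonal and strictly upper parts
rho = np.abs(np.diag(D)).max()                 # rho < 1 by construction
C = np.abs(F).sum(axis=1).max() * np.sqrt(n)   # constant from observation 3

x = rng.standard_normal(n)
for m in (n, 10, 50, 200):
    lhs = np.linalg.norm(np.linalg.matrix_power(T, m) @ x)
    rhs = sum(comb(m, k) * C ** k * rho ** (m - k) for k in range(n)) * np.linalg.norm(x)
    print(m, lhs <= rhs, rhs)   # the bound (1) holds, and rhs -> 0 as m grows
```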

user1551
  • 139,064
  • Thank you for this answer, it was really helpful (and fun to work through). Out of the three excellent answers, I think this method is the most accessible given the material covered in Sections 1-6, so I'm going to accept this one. – terran Aug 28 '23 at 16:55
3

Denote by:

  • $n$ the dimension of $V$,
  • $T=D+N$ ($D$ diagonalizable, $N^n=0$, $ND=DN$) the Jordan-Chevalley decomposition of $T$,
  • $\lambda<1$ the largest absolute value of the eigenvalues of $T$ (equivalently, of $D$),
  • $\mu:=\max(1,\|N\|,\|N^2\|,\dots,\|N^{n-1}\|)$,
  • $\kappa\ge1$ a constant such that $\|D^j\|\le\kappa\lambda^j$ for all $j\ge0$; such a $\kappa$ exists because $D$ is diagonalizable, e.g. $\kappa=\|P\|\,\|P^{-1}\|$ for any invertible $P$ with $P^{-1}DP$ diagonal. (Note that $\|D\|$ itself need not equal $\lambda$ unless $D$ is normal.)

For every $m\ge n,$ $$T^m=(D+N)^m=\sum_{k=0}^{n-1}\binom mkD^{m-k}N^k$$ (the binomial theorem applies because $D$ and $N$ commute, and the terms with $k\ge n$ vanish since $N^n=0$), hence $$\|T^m\|\le\kappa\mu\sum_{k=0}^{n-1}\binom mk\lambda^{m-k}\le\kappa\mu n\,m^{n-1}\lambda^{m-(n-1)}.$$ When $m\to\infty,$ this upper bound tends to $0,$ so $m$ can be chosen with $\|T^m\|\le\epsilon.$
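A minimal numerical illustration of this bound, assuming numpy: take a single Jordan block, so that $D=\lambda I$ and $N$ is the shift matrix. Here $D$ is a scalar matrix, hence normal, so $\kappa=1$ and the bound reduces to $\mu n\,m^{n-1}\lambda^{m-(n-1)}$. The dimension and the value $\lambda=0.9$ are arbitrary choices for the demo.

```python
import numpy as np

n, lam = 4, 0.9
D = lam * np.eye(n)                # diagonalizable (scalar) part; normal, so kappa = 1
N = np.diag(np.ones(n - 1), k=1)   # nilpotent part: N**n = 0 and D @ N == N @ D
T = D + N

# mu = max(1, ||N||, ..., ||N^{n-1}||) in the induced 2-norm (all equal 1 here).
mu = max(1.0, *(np.linalg.norm(np.linalg.matrix_power(N, k), 2) for k in range(1, n)))

for m in (n, 20, 100, 400):
    Tm_norm = np.linalg.norm(np.linalg.matrix_power(T, m), 2)
    bound = mu * n * m ** (n - 1) * lam ** (m - (n - 1))
    print(m, Tm_norm <= bound, bound)   # bound holds and tends to 0 as m grows
```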

Anne Bauval
  • 34,650
  • 1
    Thank you for this. The fact that $D$ and $N$ commute here makes the Jordan-Chevalley decomposition a lot easier to work with than just splitting the Schur triangulation into its diagonal and strictly upper-triangular parts. Annoyingly, Jordan form is only discussed around 80 pages after this exercise... – terran Aug 28 '23 at 16:50
2

This is not really an orthonormal basis approach, but you can certainly use the Schur decomposition to prove the result.

Suppose $T=QUQ^*$ is the Schur decomposition, with $U$ upper triangular. Since $T^n = QU^nQ^*$, it suffices to show that $U^n \to 0$. I am using the induced Euclidean norm.

Let $D(r) = \operatorname{diag}(r_1,...,r_d)$ (with $r_k >0$) and note that the $ij$ entry of $D(r)UD(r)^{-1}$ is ${r_i \over r_j} U_{ij}$. The diagonal elements are unchanged; we can choose $r_d = 1$ and the remaining $r_k$ so that the off-diagonal elements are arbitrarily small. For instance, $r_k = \varepsilon^{d-k}$ with $\varepsilon > 0$ small scales $U_{ij}$ by $\varepsilon^{\,j-i}$.

Note that $\| \operatorname{diag}(U_{11},...,U_{dd}) \| = \max_k |U_{kk}| < 1$.

In particular, since $|U_{ii}| < 1$ and the matrix norm is continuous, there is some $r$ such that $\| D(r)UD(r)^{-1} \| < 1$. Since $D(r)U^nD(r)^{-1} = \big(D(r)UD(r)^{-1}\big)^n$ and the induced norm is submultiplicative, we see that $\| D(r)U^nD(r)^{-1} \| \le \| D(r)UD(r)^{-1} \|^n \to 0$, and hence $U^n \to 0$ (conjugation by the fixed matrix $D(r)$ preserves convergence to $0$).
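A small numpy sketch of this rescaling trick, using the scaling $r_k=\varepsilon^{d-k}$ mentioned above; the matrix, dimension, and seed are arbitrary illustrations, not part of the argument. As $\varepsilon$ shrinks, the similar matrix approaches the diagonal part, whose induced 2-norm is $\max_k|U_{kk}|<1$.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4

# A toy upper triangular U whose diagonal entries have modulus < 1.
U = np.triu(rng.standard_normal((d, d)), k=1) + np.diag(rng.uniform(-0.8, 0.8, d))

for eps in (1.0, 0.5, 0.1, 0.01):
    r = eps ** (d - 1 - np.arange(d))            # r_k = eps**(d-k), so r_d = 1
    Dr = np.diag(r)
    V = Dr @ U @ np.linalg.inv(Dr)               # entry (i,j) becomes eps**(j-i) * U_ij
    assert np.allclose(np.diag(V), np.diag(U))   # diagonal is unchanged
    print(eps, np.linalg.norm(V, 2))             # induced 2-norm; < 1 for small eps
```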

copper.hat
  • 172,524
  • Thanks for this, I hadn't seen this idea before -- turning an upper triangular matrix into an 'almost diagonal' matrix by rescaling the basis. There is one point I'm confused by. You say you're using the induced Euclidean norm, but later write $\lVert \text{diag}(U_{11}, \ldots, U_{dd}) \rVert = \max_k \lvert U_{kk} \rvert$, which seems to be the max norm. I believe all norms are equivalent on a finite-dimensional vector space and so it suffices to show $U^n \to 0$ w.r.t the max norm, but just wanted to check. – terran Aug 28 '23 at 17:01
  • 1
    It is the induced Euclidean norm, for diagonal matrices it is just the absolute values of the eigenvalues. The $<1$ part and being submultiplicative are the crucial elements. – copper.hat Aug 28 '23 at 17:13
  • I just had a read on wiki and I understand you now! I was thinking of the "entry-wise" Euclidean norm. – terran Aug 28 '23 at 17:22