Proof of Cayley-Hamilton using Krylov subspaces

Question

I came up with another proof of the Cayley-Hamilton Theorem. Is this new? The proof is by induction over the dimension of the underlying vector space.

Let $v \in \mathbb F^n \setminus \{0\}$. Consider the Krylov subspaces $$ K_j = \text{span} \{v, Av, \dots, A^{j-1} v\} .$$ Let $$j_0 = \min\{j \ge 1 : K_j = K_{j+1}\} .$$
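To make the definition of $j_0$ concrete, here is a small sketch that computes it over the rationals. The function names (`mat_vec`, `rank`, `krylov_index`) are my own and purely illustrative. It relies only on the fact that $K_j \subseteq K_{j+1}$, so the two subspaces coincide exactly when their dimensions agree.

```python
from fractions import Fraction


def mat_vec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        # Look for a pivot in column c among the rows not yet used.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def krylov_index(A, v):
    """Smallest j >= 1 with K_j = K_{j+1}, K_j = span{v, Av, ..., A^(j-1) v}."""
    vectors = [v]  # spans K_1
    while True:
        extended = vectors + [mat_vec(A, vectors[-1])]
        if rank(extended) == rank(vectors):
            return len(vectors)
        vectors = extended


# A single 3x3 Jordan block with eigenvalue 1 (an arbitrary test matrix).
A = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
print(krylov_index(A, [0, 0, 1]))  # 3: v is cyclic, so Case 2 applies
print(krylov_index(A, [0, 1, 0]))  # 2: Case 1
print(krylov_index(A, [1, 0, 0]))  # 1: v is an eigenvector, Case 1
```

Exact `Fraction` arithmetic is used so that the rank comparison is not affected by floating-point round-off.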

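Case 1: $j_0 < n$. Then $K_{j_0}$ is an invariant subspace for $A$, so with respect to a basis whose first $j_0$ elements form a basis of $K_{j_0}$, the matrix of $A$ is block upper triangular, $\begin{bmatrix} B & * \\ 0 & C \end{bmatrix}$, with diagonal blocks of sizes $j_0$ and $n - j_0$, both smaller than $n$. The characteristic polynomial of $A$ is the product $p_B\, p_C$ of the characteristic polynomials of the diagonal blocks, and by the inductive hypothesis $p_B(B) = 0$ and $p_C(C) = 0$. Hence $$ p_B(A)\, p_C(A) = \begin{bmatrix} 0 & * \\ 0 & p_B(C) \end{bmatrix} \begin{bmatrix} p_C(B) & * \\ 0 & 0 \end{bmatrix} = 0 ,$$ which is the result.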

Case 2: $j_0 = n$. Then $K_n = \mathbb F^n$, and $\{v, Av,\dots,A^{n-1}v\}$ is a basis of $\mathbb F^n$. It follows that there exist $a_0, a_1, \dots, a_{n-1} \in \mathbb F$ such that $$ A^n v = -a_0 v - a_1 Av - a_2 A^2 v - \cdots - a_{n-1} A^{n-1} v .$$ That is, setting $$p(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_0,$$ we have $$ p(A) v = 0 .$$ For any vector $w \in \mathbb F^n$, we have $w = q(A) v$ for some polynomial $q$. Thus $$ p(A) w = p(A) q(A) v = q(A) p(A) v = 0 .$$ Hence $$ p(A) = 0 .$$ Finally, with respect to the basis $\{v, Av,\dots,A^{n-1}v\}$, the matrix $A$ has the form of the companion matrix: $$ \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & 0 & -a_0 \\ 1 & 0 & 0 & \cdots & 0 & 0 & -a_1 \\ 0 & 1 & 0 & \cdots & 0 & 0 & -a_2 \\ \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 0 & -a_{n-3} \\ 0 & 0 & 0 & \cdots & 1 & 0 & -a_{n-2} \\ 0 & 0 & 0 & \cdots & 0 & 1 & -a_{n-1} \end{bmatrix} ,$$ and it is well known that the characteristic polynomial of this companion matrix is $$ p(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_0, $$ so $p$ is the characteristic polynomial of $A$.
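As a sanity check on Case 2, the following sketch builds the companion matrix in the layout shown above and evaluates $p$ at it by Horner's rule. The helper names (`companion`, `mat_mul`, `eval_monic_poly`) and the sample coefficients are my own choices for illustration, not part of the proof.

```python
def companion(a):
    """Companion matrix of p(x) = x^n + a[n-1] x^(n-1) + ... + a[0]."""
    n = len(a)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1      # ones on the subdiagonal
    for i in range(n):
        C[i][n - 1] = -a[i]  # last column holds -a_0, ..., -a_{n-1}
    return C


def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def eval_monic_poly(a, M):
    """Evaluate p(M) = M^n + a[n-1] M^(n-1) + ... + a[0] I by Horner's rule."""
    n = len(M)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    result = identity  # the leading coefficient is 1
    for coeff in reversed(a):
        result = mat_mul(result, M)
        result = [[result[i][j] + coeff * identity[i][j] for j in range(n)]
                  for i in range(n)]
    return result


# Sample polynomial p(x) = x^3 + 2x^2 - 5x + 6, i.e. (a_0, a_1, a_2) = (6, -5, 2).
a = [6, -5, 2]
C = companion(a)
print(C)                      # [[0, 0, -6], [1, 0, 5], [0, 1, -2]]
print(eval_monic_poly(a, C))  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

Here the first standard basis vector plays the role of $v$: the columns of `C` say that $Ce_1 = e_2$, $Ce_2 = e_3$ and $Ce_3 = -a_0 e_1 - a_1 e_2 - a_2 e_3$, which is exactly the relation used to define $p$.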

That's neat! It's obviously similar to the proof via Frobenius normal form, which I believe is presented in Dummit and Foote and in Hoffman and Kunze. — Ben Grossmann, May 03 '21 at 23:55

levap · Accepted Answer · 2021-05-04T09:07:47.537

This is a nice proof but is definitely not new (it became the "standard" proof in the linear algebra classes at our university in the last five years). I don't know a textbook reference but if you google "Cayley-Hamilton proof cyclic subspaces" all top hits will present this proof up to some modifications.

Just for reference, let me present one such modification which avoids the "induction" step in this proof. Let $T \colon V \rightarrow V$ be a linear operator on a finite dimensional vector space and denote the characteristic polynomial of $T$ by $p_T(X)$. To show that $p_T(T) = 0$, it is enough to show that $p_T(T)u = 0_V$ for all $u \in V$. Fix some $u \in V$ and consider the cyclic subspace $U = \left< u \right> = \operatorname{Span} \{ T^i(u) \, | \, i \in \mathbb{N}_0 \}$ (you call this the maximal Krylov subspace). This is an invariant subspace, so the characteristic polynomial of $T|_{U}$ divides the characteristic polynomial of $T$. Writing $p_T(X) = q(X) \cdot p_{T|_{U}}(X)$ for some $q \in \mathbb{F}[X]$, we have

$$ p_T(T)u = \left( q(T) \cdot p_{T|_{U}}(T) \right)u = q(T) \left( p_{T|_{U}}(T)u \right)$$

so it is enough to show that $p_{T|_{U}}(T)u = 0$. Set $\dim U = k$. Then $u,T(u),\dots,T^{k-1}(u)$ is a basis for $U$, and since $T^k(u) \in U$ there are scalars $a_0,\dots,a_{k-1}$ with $$T^{k}(u) + a_{k-1}T^{k-1}(u) + \dots + a_1 T(u) + a_0 u = 0 .$$ The matrix representing $T|_{U}$ with respect to the ordered basis $\left( u,T(u),\dots,T^{k-1}(u) \right)$ is then the companion matrix of $$X^k + a_{k-1}X^{k-1} + \dots + a_1 X + a_0$$ and so $p_{T|_{U}}(X) = X^k + a_{k-1}X^{k-1} + \dots + a_1 X + a_0$, which by the displayed relation immediately gives $p_{T|_{U}}(T)u = 0$.
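A small worked example of the argument above, with a matrix and vector chosen only for illustration: take $$T = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \qquad u = e_2 .$$ Then $T(u) = e_1 + e_2$ and $T^2(u) = 2e_1 + e_2 = 2T(u) - u$, so $U = \operatorname{Span}\{e_1, e_2\}$, $k = 2$ and $$T^2(u) - 2T(u) + u = 0 .$$ Hence $p_{T|_{U}}(X) = X^2 - 2X + 1 = (X-1)^2$, while $p_T(X) = (X-1)^2(X-2)$, so $q(X) = X - 2$ and $$p_T(T)u = q(T)\left( p_{T|_{U}}(T)u \right) = (T - 2I)\, 0 = 0 .$$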

Here is another stackexchange reference with (more or less) this proof: https://math.stackexchange.com/questions/3696442/on-the-cayley-hamilton-theorem/3696467#3696467 — levap, May 04 '21 at 08:39

score 2 · Answer 2 · answered May 04 '21 at 09:28

For the record, I recall giving essentially the same proof in the last two paragraphs of this answer to the question about computing the minimal and characteristic polynomials of a companion matrix, explaining why it is useful to separately compute both directly, rather than to rely on Cayley-Hamilton to deduce the characteristic polynomial from the more easily computed minimal polynomial (which turns out to already have degree $n$, the size of the companion matrix).
