Simple Nilpotent Linear Transformation Proof Misunderstanding

Question

Let $T \in \mathcal{L}(V,V)$ be nilpotent, where $\mathcal{L}(V,V)$ is the set of all linear transformations from $V$ to $V$. Let $n$ be the dimension of $V$. Show that $T^{n}=0$.

This has been asked before here: For an n x n matrix T, prove that if $[T]^k=[0]_n$, then $[T]^n=[O]_n$

but I can't wrap my brain around it.

I want to understand all the small details clearly and precisely in my head.

$\textbf{Here are the facts I know / my proof attempt}$:

Since $T$ is nilpotent, $T^{k}=0$ for some integer $k > 0$.

I only know two facts:

There is a (non-trivial) polynomial $p_1(x)$ of degree $\leq n^2$ such that $p_1(T)=0$

Minimal polynomial: monic polynomial of lowest degree $p_2(x)$ such that $p_2(T)=0$

Now the linked answer (For an n x n matrix T, prove that if $[T]^k=[0]_n$, then $[T]^n=[O]_n$) says: "Since $T^k=0$, the minimal polynomial for $T$ in $\mathbb{R}[x]$ must divide $x^k$, hence must be of the form $p_2(x) = x^j$, for some $j \le k$."

$\textbf{I don't see how $T^k=0$ implies the minimal polynomial divides $x^k$. Why is this true?}$

This is what I understand: a monic polynomial is written in the form $p_2(x)=\beta_{0}x+\ldots+\beta_{k}x^k+\ldots x^m$

Applying $T$: $p_2(T)=\beta_{0}T+\ldots+\beta_{k}T^k+\ldots T^m=0$.

Then, $p_2(T)=\beta_{0}T+\ldots+0+\ldots T^m=0$ since $T^k=0$.

$\textbf{Still, I don't see why this implies the minimal polynomial divides $x^k$}$

Furthermore, after this:

He says the degree of $p_2$ is at most $n$.

$\textbf{I don't understand why this is true, I thought the degree of $p_2$ is at most $n^2$}$.

Then he says: $$p_2(T) = 0 \implies T^j = 0 \implies T^n=0$$

Why does $$p_2(T) = 0 \implies T^j = 0 \implies T^n=0?$$ I don't fully grasp either of the implications.

Is he saying the minimal polynomial is of the form $p_2(x)=\beta_{0}x+\ldots x^j$, or is it $p_2(x)=x^j$?

As you can tell I'm very confused and just want to understand the ideas clearly. Thanks.

There is a polynomial of degree $n$ (not “of degree at most $n^2$”) with $p(A)=0$: the characteristic polynomial! — Arturo Magidin, Sep 02 '20 at 00:20
The minimal polynomial divides any polynomial $p(x)$ such that $p(A)=0$. To see this, do division with remainder of $p(x)$ by the minimal $m(x)$, and verify that the assumption of minimality of the degree of $m$ ensures the remainder is the zero polynomial. — Arturo Magidin, Sep 02 '20 at 00:21
Is it that specific proof that you want to understand? I think the simplest argument is the one that notes for any subspace $U\subseteq \mathbb{F}^n$ we have $U=TU\implies U=T^kU=0$. Further if $TU\subseteq U$ then $T^2U\subseteq TU$, so starting with $\mathbb{F}^n$ and repeatedly applying $T$ we have a strictly decreasing sequence of subspaces, $T^i(\mathbb{F}^n)$, $i=0,\cdots, r$, where $T^r=0$. Then the dimensions of these spaces are $n=d_0>d_1>\cdots>d_r=0$. Thus $r\leq n$. — tkf, Sep 02 '20 at 00:24
I would like to understand the above proof. I know that all the eigenvalues of $T$ are 0. I don't know much about the characteristic polynomial. Why does $p(T)=0$, where $p$ is the characteristic polynomial? — splendor splendid, Sep 02 '20 at 00:33
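The dimension-counting argument from tkf's comment above can be checked on a concrete example. The following is only a minimal sketch: the $3\times 3$ matrix `T` and the helper names `mat_mul`, `rank`, and `image_dimensions` are illustrative assumptions, not anything from the thread. It computes $\dim T^i(\mathbb{F}^n)$ as the rank of $T^i$ using exact rational arithmetic.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(rows):
    """Rank via Gaussian elimination over the rationals (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def image_dimensions(t):
    """Return [rank(T^0), rank(T^1), ...], stopping once a power is zero."""
    n = len(t)
    dims = []
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # T^0 = I
    for _ in range(n + 1):
        dims.append(rank(power))
        if dims[-1] == 0:
            break
        power = mat_mul(power, t)
    return dims


# Hypothetical example: a single 3x3 nilpotent Jordan block.
T = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
dims = image_dimensions(T)
print(dims)  # [3, 2, 1, 0]
```

In this example the dimensions drop by one at every step, so the chain has length exactly $n = 3$; in general the drops can be larger, which only makes the chain shorter.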

score 1 · Accepted Answer · answered Sep 02 '20 at 00:54

Here are some facts/definitions. Let $V$ be a finite-dimensional vector space over a field $F$, and let $T:V\to V$ be linear.

Definition. We say a polynomial $f(t)\in F[t]$ is an annihilating polynomial for $T$ if $f(T) = 0$.
Theorem. A polynomial $f(t)\in F[t]$ is an annihilating polynomial for $T$ if and only if the minimal polynomial $\mu(t)$ of $T$ divides $f(t)$. Note that $\impliedby$ is obvious. To prove $\implies$, we use the division algorithm to write $f(t) = \mu(t)g(t) + r(t)$ for some polynomials $g(t), r(t)$, with either $r(t) = 0$ or $\deg r(t) < \deg \mu(t)$. Now, using $0= f(T) = \mu(T) \circ g(T) + r(T) = r(T)$, argue why $r(t)$ is the zero polynomial (hint: otherwise $r(t)$ would be a nonzero annihilating polynomial of smaller degree than $\mu(t)$).

In your case, by the definition of nilpotency, $T^k = 0$ for some integer $k\geq 1$; i.e., the polynomial $t^k$ is an annihilating polynomial. By the theorem above, the minimal polynomial has to divide $t^k$. The only monic divisors of $t^k$ are powers of $t$, so the minimal polynomial is $\mu(t) = t^j$ for some $1\leq j \leq k$. Hence, $T^j = \mu(T) = 0$.

Next, we have the following theorem:

Cayley-Hamilton Theorem. If $p(t)$ is the characteristic polynomial of $T$ (recall that this has degree $n:= \dim V$, with leading coefficient $1$ or $(-1)^n$ depending on the definition chosen), then $p(T) = 0$. In other words, the characteristic polynomial of $T$ is also an annihilating polynomial of $T$, of degree $n = \dim V$.

Therefore, combining these results: since the characteristic polynomial $p(t)$ is an annihilating polynomial of $T$, the minimal polynomial $\mu(t)$ of $T$ divides it. For a nilpotent operator, we showed that $\mu(t) = t^j$ for some $j$. Since $\mu(t)$ divides $p(t)$, we have $j = \deg \mu(t) \leq \deg p(t) = n$. Therefore $n - j \geq 0$, and it follows that $T^n = T^{n-j} \circ T^j = T^{n-j} \circ 0 = 0$.
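As a quick sanity check of this conclusion, here is a minimal Python sketch on one concrete matrix. The $4\times 4$ matrix `T` and the helper names `mat_mul`, `mat_pow`, `is_zero`, and `nilpotency_index` are illustrative assumptions, not part of the argument. The sketch finds the least $j$ with $T^j = 0$ (the degree of the minimal polynomial $t^j$), then confirms that $j \leq n$ and that $T^n = T^{n-j} \circ T^j = 0$.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(a, e):
    """Return a**e for an integer e >= 0 (a**0 is the identity)."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        result = mat_mul(result, a)
    return result


def is_zero(a):
    """True if every entry of the matrix is zero."""
    return all(x == 0 for row in a for x in row)


def nilpotency_index(a):
    """Least j in 1..n with a**j == 0, or None if there is no such j.

    Searching only up to n is justified by the result being illustrated:
    a nilpotent operator on an n-dimensional space satisfies a**n == 0.
    """
    n = len(a)
    power = a
    for j in range(1, n + 1):
        if is_zero(power):
            return j
        power = mat_mul(power, a)
    return None


# Hypothetical example: T sends e2 -> e1 and e4 -> e3, and kills e1 and e3.
T = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
n = len(T)
j = nilpotency_index(T)
print(j)                       # 2
print(is_zero(mat_pow(T, n)))  # True
```

Here the minimal polynomial is $t^2$ while the characteristic polynomial is $t^4$, so $j$ can be strictly smaller than $n$; the argument only needs $j \leq n$.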

By the way, you're right that we can always find a non-trivial polynomial $p_1(x)$ such that $\deg p_1(x) \leq n^2$ and $p_1(T) =0$. This follows by observing that the set of $n^2+1$ operators $\{I, T, \dots, T^{n^2}\}$ in the $n^2$-dimensional vector space $\mathcal{L}(V,V)$ is necessarily linearly dependent.

However, note that the Cayley-Hamilton theorem guarantees the much stronger assertion that there is always a non-trivial polynomial $p(x)$ of degree $n$ such that $p(T) = 0$ (namely, the characteristic polynomial).

Why is the characteristic polynomial of $T$ an annihilating polynomial of $T$ (Cayley-Hamilton Theorem)? I was able to prove that all the eigenvalues of $T$ are 0. Is it related to this fact? — splendor splendid, Sep 02 '20 at 01:00
@splendorsplendid that's an (important) theorem in its own right, and it has a completely separate proof (not too hard though) which you can easily find online. — user580918, Sep 02 '20 at 01:05
Also, sorry, I am new to this. I didn't understand the last few sentences. So the minimal polynomial is of the form $u(T)=T^j$ for some $j$ where $1\leq j \leq k$ and $T$ is a nilpotent operator. The characteristic polynomial is an annihilating monic polynomial of degree $n$. Since the minimal polynomial is minimal, it has degree $\leq n$. Then how does it follow that $T^{n}=T^{n-j} T^j=0$? I don't understand this last line with the composition and how it implies 0. — splendor splendid, Sep 02 '20 at 01:06
First off, $t$ and $T$ are different things. $t$ is the indeterminate variable used for polynomials (like $p(x), p(t)$ etc), but $T$ is an operator. The last line is pretty simple: do you agree $2^n = 2^{n-j}\cdot 2^j$? Because you just write out $2^n = \underbrace{2 \times \dots \times 2}_{n \text{ times}}$, and then regroup accordingly (a strict proof would require induction). It's the same thing here for operators. $T^n$ means you compose $T$ with itself $n$ times. So, it's the same as composing $j$ times and then $n-j$ more times: $T^n= T^{n-j} \circ T^j$. — user580918, Sep 02 '20 at 01:10
(again a strict proof requires induction) Finally, if $T^j = 0$, then $T^n = T^{n-j}\circ T^j = T^{n-j}\circ 0 = 0$ (of course it is in this step where we're implicitly using the fact $j \leq n$ so that $n-j\geq 0$, because otherwise we can't write this equality, because we haven't defined $T^{\text{negative integer}}$). — user580918, Sep 02 '20 at 01:11
