
Can someone provide intuition (not a proof) for why the geometric multiplicity (i.e. the dimension of the $\lambda$-eigenspace) is $\leq$ the algebraic multiplicity (i.e. the multiplicity of $\lambda$ as a root of the characteristic equation $\det(A - tI) = 0$)?

Here is my bad intuition: Since there are $k$ linearly independent vector solutions to $(A - \lambda I)\vec{v} = 0$, where $k$ is the geometric multiplicity of $\lambda$, $\lambda$ must have a "degree" of at least $k$, which translates to algebraic multiplicity.

Hopefully, someone can provide some better intuition, perhaps in terms of a visualisation (I struggle to visualise the geometric significance of the algebraic multiplicity of $\lambda$).

4 Answers


Here is an intuitive idea which can be translated into a proof, although with some difficulty. Suppose that it were possible to perturb the matrix $M$ a very small amount so that all of its eigenvalues become distinct (this is possible because the set of matrices with distinct eigenvalues is Zariski open, hence dense). That means if it starts out with a $\lambda$-eigenspace with geometric multiplicity $m$, this single eigenspace splits into $m$ linearly independent eigenvectors with distinct eigenvalues $\lambda + \varepsilon_i, 1 \le i \le m$, where each $\varepsilon_i$ is very small. And each of these distinct eigenvalues contributes a distinct linear factor $(t - \lambda - \varepsilon_i)$ to the perturbed characteristic polynomial, so the perturbed characteristic polynomial is divisible by $\prod_{i=1}^m (t - \lambda - \varepsilon_i)$.

But the characteristic polynomial varies continuously with the matrix, so when we "unperturb" back to our original matrix, its characteristic polynomial must have $m$ roots very close to the roots $\lambda + \varepsilon_i$. If we pick the perturbation to be small enough that each $\varepsilon_i$ is much smaller than the spacing between the distinct eigenvalues of $M$, the only possibility is that each of these $m$ roots of the original characteristic polynomial is $\lambda$.

One way to describe the idea here is that the characteristic polynomial is the "continuous minimal polynomial": it is a polynomial, whose coefficients are continuous functions of the entries of a matrix, which is equal to the minimal polynomial most of the time (when the eigenvalues are distinct). So, if you want to think about things this way, the algebraic multiplicity of an eigenvalue $\lambda$ is telling you: if you perturb this matrix slightly, you'll get this many eigenvalues near $\lambda$.
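Here is a minimal numerical sketch of this perturbation picture, in Python with NumPy; the $2\times 2$ matrix below is just an illustrative choice of mine with a double eigenvalue $1$ of geometric multiplicity $1$, and a tiny generic perturbation splits the double root into two distinct eigenvalues near $1$.

```python
import numpy as np

# M has eigenvalue 1 with algebraic multiplicity 2 but geometric multiplicity 1.
M = np.array([[1.0, 1.0],
              [0.0, 1.0]])

rng = np.random.default_rng(0)
E = 1e-6 * rng.standard_normal((2, 2))   # a tiny generic perturbation

print(np.linalg.eigvals(M))       # [1. 1.]: a double root at 1
print(np.linalg.eigvals(M + E))   # two distinct eigenvalues, both very close to 1
```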


Separately, here is a proof, which I find pretty intuitive. You can show by repeatedly using the existence of eigenvectors that every $n \times n$ square matrix $M$ over an algebraically closed field can be upper triangularized; I explain the proof in the link.

So we can assume WLOG that $M$ is upper triangular. Then if $\lambda_1, \dots, \lambda_n$ are the diagonal entries of this matrix we get that $\det(tI - M) = \prod (t - \lambda_i)$, so the $\lambda_i$ are the roots of the characteristic polynomial. If a root $\lambda$ occurs with algebraic multiplicity $m$, the geometric multiplicity of $\lambda$ is $\dim \text{ker}(M - \lambda I)$. The matrix $M - \lambda I$ is also upper triangular and has $n - m$ nonzero diagonal entries, namely the entries $\lambda_i - \lambda$ for $\lambda_i \neq \lambda$; this means it has rank at least $n - m$ (since the corresponding rows are linearly independent), so by rank-nullity this gives $\dim \text{ker}(M - \lambda I) \le m$ as desired. Note that we do not need the Jordan normal form theorem, which is probably the proof you've seen; upper triangularizing is much easier.

The upper triangularization also makes it easy to write down the necessary perturbation of the matrix I describe in the first part of the post, by adding a suitable very small diagonal matrix to $M$.
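For concreteness, here is a quick numerical check of the rank-nullity bound in Python/NumPy, using an upper triangular matrix of my own choosing in which $\lambda = 2$ appears $m = 2$ times on the diagonal.

```python
import numpy as np

# Upper triangular M in which lambda = 2 appears m = 2 times on the diagonal,
# so M - 2I has n - m = 2 nonzero diagonal entries and rank at least 2.
M = np.array([[2.0, 1.0, 0.0, 4.0],
              [0.0, 2.0, 5.0, 1.0],
              [0.0, 0.0, 3.0, 2.0],
              [0.0, 0.0, 0.0, 7.0]])
lam, n = 2.0, 4

r = np.linalg.matrix_rank(M - lam * np.eye(n))
print("rank(M - 2I) =", r)                  # 3, so at least n - m = 2
print("geometric multiplicity =", n - r)    # 1, which is <= m = 2
```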

Edit: I see on the right that there's an old version of this question with a simpler version of this argument.

Qiaochu Yuan

Here is another view, which comes straight out of a proof and provides an alternative characterization of algebraic multiplicity:

The algebraic multiplicity of $\lambda$ is $\dim\ker(A-\lambda I)^n$ for all large enough $n$.

So the geometric multiplicity (stopping at level $n=1$) is a lower bound for the algebraic multiplicity. Or equivalently, the eigenspace is a (not necessarily proper) subspace of the generalised eigenspace.


The proof is, of course, by induction on dimension. For $A=A_0\colon V\to V$ and $\lambda$ an eigenvalue of $A$, we start by picking an eigenvector $b_1\in V$, then consider $V/\langle b_1\rangle$ and the induced linear map $A_1\colon V/\langle b_1\rangle\to V/\langle b_1\rangle$. Note that any eigenvector $v\notin\langle b_1\rangle$ gives an eigenvector $v+\langle b_1\rangle$ of $A_1$, but there are also possibly more. Repeating, we pick $b_2\in V$ so that $b_2+\langle b_1\rangle$ is a $\lambda$-eigenvector of $A_1$, and so on; eventually $\lambda$ is no longer an eigenvalue of $A_m$, and we extend $b_1,b_2,\dots,b_m$ to a basis of $V$. With respect to this basis, $$ A=\begin{pmatrix} \lambda I_m+U_m & B\\ 0 & C \end{pmatrix} $$ where $U_m$ is strictly upper triangular, and $C-\lambda I$ is invertible. So $$ (A-\lambda I)^n=\begin{pmatrix} U_m^n & *\\ 0 & (C-\lambda I)^n \end{pmatrix} =\begin{pmatrix} 0 & *\\ 0 & (C-\lambda I)^n \end{pmatrix}\quad\forall n\geq m $$ and we get $\dim\ker(A-\lambda I)^n=m$ for all $n\geq m$.
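A quick numerical illustration of this stabilization (in Python with NumPy; the matrix is just an illustrative example of mine): the dimensions $\dim\ker(A-\lambda I)^k$ start at the geometric multiplicity and level off at the algebraic multiplicity.

```python
import numpy as np

# A has eigenvalue 2 with geometric multiplicity 1 and algebraic multiplicity 2.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
lam, n = 2.0, 3

N = A - lam * np.eye(n)
for k in range(1, n + 1):
    dim_ker = n - np.linalg.matrix_rank(np.linalg.matrix_power(N, k))
    print(k, dim_ker)   # prints 1, 2, 2: the dimension stabilizes at 2
```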

user10354138

Here's a short, intuitive picture over the reals of why the geometric multiplicity of a real eigenvalue does not exceed that eigenvalue's algebraic multiplicity: if $V$ is an $n$-dimensional Euclidean space and $T:V \to V$ is a linear operator, then $p(t) := \det(T - tI)$, a polynomial in $t$, is the signed $n$-dimensional volume of the image of the unit $n$-cube under $T - tI$.

If the $\lambda$-eigenspace $\ker(T - \lambda I)$ is $k$-dimensional, then "$k$ dimensions collapse" in the image of the unit cube as $t \to \lambda$, which contributes a factor $(t - \lambda)^{k}$ to the volume $p(t)$.
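As a rough numerical sanity check of this picture (Python/NumPy, with a matrix of my own choosing whose $\lambda$-eigenspace is $2$-dimensional), $p(\lambda+h)$ shrinks like $h^2$ as $h \to 0$:

```python
import numpy as np

# T has a 2-dimensional eigenspace for lambda = 2 (the first two coordinate
# directions), so p(t) = det(T - tI) should vanish to order at least 2 at t = 2.
T = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 3.0],
              [0.0, 0.0, 5.0]])
lam, k = 2.0, 2

p = lambda t: np.linalg.det(T - t * np.eye(3))

for h in [1e-1, 1e-2, 1e-3]:
    print(h, p(lam + h), p(lam + h) / h**k)   # the ratio stays bounded (about 3)
```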

  • This is just a restatement of what the OP wants to understand, not a proof or a heuristic explanation. – Qiaochu Yuan Oct 07 '23 at 21:11
  • Maybe I mis-read/misconstrued; I understood OP to be asking for intuition of this type, not a proof...? – Andrew D. Hwang Oct 07 '23 at 21:12
  • Where is the intuition? You are just restating what it means for the geometric multiplicity to be $\le$ the algebraic multiplicity. The question is: why? – Qiaochu Yuan Oct 07 '23 at 21:16
  • @QiaochuYuan If there is a $k$-dimensional $\lambda$ eigenspace, then $\lambda$ is a root of the characteristic polynomial of order at least $k$ because of what the determinant means (at least over the reals) geometrically. But admittedly, that appears not to have been OP's question, or this answer restated an insight they already had, or I was not clear about the (to me, explanatory) geometric meaning of the determinant. – Andrew D. Hwang Oct 07 '23 at 22:42
  • Why? This is not a proof or even a heuristic argument. – Qiaochu Yuan Oct 07 '23 at 22:45

I'm not sure exactly what counts as the right intuition here -- over $\mathbb R$, for example, a rotation of the plane through an angle that is not a multiple of $\pi$ has no real eigenvalues or eigenvectors, so linear transformations of that kind appear to be left out in the cold if you seek to understand linear maps in terms of eigenvalues.

On the other hand, if you move to an algebraically closed field such as $\mathbb C$, then every linear map has an eigenvalue (which is equivalent to the statement that every nonconstant polynomial has a root). Now an eigenvector of a linear map gives a $1$-dimensional invariant subspace: if $\alpha \colon V \to V$ and $\alpha(v_0)=\lambda_0 v_0$ with $v_0\neq 0$, then $L=\mathbb C.v_0$ is preserved by $\alpha$ and $\alpha_{|L}$ is just multiplication by $\lambda_0$. Thus on the direct sum of the eigenspaces of $\alpha$, the linear map just acts as scalar multiplication (by a different scalar on each different eigenspace).

Algebraic multiplicity arises, however, because linear maps in two dimensions get to be more interesting than linear maps in dimension $1$: over $\mathbb R$ this was obvious because of things like rotations, but it remains true over $\mathbb C$, though the fact that $\mathbb C$ is algebraically closed constrains things:

If $\alpha\colon V\to V$ is a linear map where $\dim(V)=2$, then $\alpha$ has an eigenvector, $v_0$ say, with eigenvalue $\lambda_0$, and if $L=\mathbb C.v_0$, then $\alpha$ acts by scalar multiplication on $L$. Now for any $v_1 \notin L$ the set $\{v_0,v_1\}$ is a basis of $V$, and if $\alpha(v_1) = \mu_1.v_1+\mu_0.v_0$ for some $\mu_0,\mu_1 \in \mathbb C$, then $$ \begin{split} \alpha(v_1+c.v_0) &= \mu_1.v_1 +\mu_0 v_0+ c\lambda_0.v_0 \\ &= \mu_1.v_1+(\mu_0+c\lambda_0)v_0\\ \end{split} $$ so that if $\mu_1.c = \mu_0+c\lambda_0$, that is, if $c = \mu_0/(\mu_1-\lambda_0)$ (which is possible whenever $\mu_1\neq\lambda_0$), then $v_1+c.v_0$ is an eigenvector with eigenvalue $\mu_1$. Thus $V$ has a basis of $\alpha$-eigenvectors unless $\mu_1=\lambda_0$ and $\mu_0 \neq 0$, in which case $V$ has the basis $\{u_0,u_1\}$, where $u_0=v_0$ and $u_1=\mu_0^{-1}v_1$, with respect to which $\alpha$ has matrix $$ \left(\begin{array}{cc} \lambda_0 & 1 \\ 0 & \lambda_0 \end{array} \right). $$

From this matrix we see that $\lambda_0$ has geometric multiplicity $1$ and algebraic multiplicity $2$. The terminology comes from considering eigenspaces and the characteristic polynomial, but the fact that the algebraic multiplicity is at least the geometric multiplicity just reflects the fact that in dimension $2$ and higher, one has the shear map $s\colon V\to V$, given by $s(u_0)=u_0$ and $s(u_1)=u_1+u_0$. Indeed $\alpha = (\lambda_0-1)\text{id}_V + s$.
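A tiny numerical sketch of this last identity, with the illustrative choice $\lambda_0 = 3$ (Python/NumPy):

```python
import numpy as np

# With lambda_0 = 3: alpha = (lambda_0 - 1) * id + s, where s is the shear above.
lam0 = 3.0
s = np.array([[1.0, 1.0],
              [0.0, 1.0]])                   # s(u0) = u0, s(u1) = u1 + u0
alpha = (lam0 - 1.0) * np.eye(2) + s         # equals [[3, 1], [0, 3]]

print(alpha)
print(np.linalg.matrix_rank(alpha - lam0 * np.eye(2)))   # rank 1, so a 1-dim eigenspace
```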

krm2233