126

How can I prove that if I have $n$ eigenvectors from different eigenvalues, they are all linearly independent?

Corey L.
  • 1,279
  • This is equivalent to showing that a set of eigenspaces for distinct eigenvalues always form a direct sum of subspaces (inside the containing space). That is a question that has been asked many times on this site. I will therefore close this question as duplicate of one of them (which is marginally more recent than this one, but that seems hardly an issue after more than a decade). – Marc van Leeuwen May 18 '22 at 10:09

8 Answers

172

I'll do it with two vectors. I'll leave it to you to do it in general.

Suppose $\mathbf{v}_1$ and $\mathbf{v}_2$ correspond to distinct eigenvalues $\lambda_1$ and $\lambda_2$, respectively.

Take a linear combination that is equal to $0$, $\alpha_1\mathbf{v}_1+\alpha_2\mathbf{v}_2 = \mathbf{0}$. We need to show that $\alpha_1=\alpha_2=0$.

Applying $T$ to both sides, we get $$\mathbf{0} = T(\mathbf{0}) = T(\alpha_1\mathbf{v}_1+\alpha_2\mathbf{v}_2) = \alpha_1\lambda_1\mathbf{v}_1 + \alpha_2\lambda_2\mathbf{v}_2.$$ Now, instead, multiply the original equation by $\lambda_1$: $$\mathbf{0} = \lambda_1\alpha_1\mathbf{v}_1 + \lambda_1\alpha_2\mathbf{v}_2.$$ Now take the two equations, $$\begin{align*} \mathbf{0} &= \alpha_1\lambda_1\mathbf{v}_1 + \alpha_2\lambda_2\mathbf{v}_2\\ \mathbf{0} &= \alpha_1\lambda_1\mathbf{v}_1 + \alpha_2\lambda_1\mathbf{v}_2 \end{align*}$$ and taking the difference, we get: $$\mathbf{0} = 0\mathbf{v}_1 + \alpha_2(\lambda_2-\lambda_1)\mathbf{v}_2 = \alpha_2(\lambda_2-\lambda_1)\mathbf{v}_2.$$

Since $\lambda_2-\lambda_1\neq 0$, and since $\mathbf{v}_2\neq\mathbf{0}$ (because $\mathbf{v}_2$ is an eigenvector), then $\alpha_2=0$. Using this on the original linear combination $\mathbf{0} = \alpha_1\mathbf{v}_1 + \alpha_2\mathbf{v}_2$, we conclude that $\alpha_1=0$ as well (since $\mathbf{v}_1\neq\mathbf{0}$).

So $\mathbf{v}_1$ and $\mathbf{v}_2$ are linearly independent.

Now try using induction on $n$ for the general case.
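
To see the elimination step concretely, here is a minimal numerical sketch (not a proof; the matrix $A$, its eigenpairs, and the coefficients $\alpha_1,\alpha_2$ below are made up for illustration):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
lam1, lam2 = 2.0, 3.0
v1 = np.array([1.0, 0.0])    # eigenvector for lam1: A @ v1 == 2 * v1
v2 = np.array([1.0, 1.0])    # eigenvector for lam2: A @ v2 == 3 * v2

alpha1, alpha2 = 0.7, -1.3            # arbitrary coefficients
w = alpha1 * v1 + alpha2 * v2         # the linear combination in the proof

applied  = A @ w       # equals alpha1*lam1*v1 + alpha2*lam2*v2  ("apply T")
rescaled = lam1 * w    # equals alpha1*lam1*v1 + alpha2*lam1*v2  ("multiply by lam1")

# The difference eliminates the v1 component, leaving alpha2*(lam2 - lam1)*v2.
print(np.allclose(applied - rescaled, alpha2 * (lam2 - lam1) * v2))   # True
```

In the proof, the combination is assumed to equal $\mathbf{0}$, so the quantity in the last line must also be $\mathbf{0}$, which forces $\alpha_2=0$.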

Arturo Magidin
  • 398,050
  • 3
    I believe that you wrote $\lambda_2$ instead of $\lambda_1$ in the row before "Now take" – jacob Mar 14 '14 at 08:08
  • 11
    Is there any intuition behind this? Any pictorial way of thinking? – IgNite May 07 '16 at 08:42
  • Maybe this will be of use: starting from $0 = \alpha_1 v_1 + ... + \alpha_n v_n$, if you compute $$\left|\left|\lim_{j\to\infty} (1/ \lambda_1^j) A^j (\alpha_1 v_1 + ... + \alpha_n v_n)\right|\right|,$$ then from the definition of an eigenvalue (that $Av = \lambda v$) we can see the $v_1$ component grows much faster than the others, so the limit isolates the $\alpha_1 v_1$ term, which must therefore be $0$. –  Apr 29 '17 at 15:54
  • @Arturo Very nice. –  May 27 '17 at 20:24
  • 3
    I don't understand how this generalises to N eigenvectors. – twisted manifold Jan 29 '20 at 23:57
  • 5
    @fielder: You don’t “generalize”; you use induction. As the answer states. – Arturo Magidin Jan 30 '20 at 03:47
  • What is the transformation T in this case? – Nathaniel Ruiz Jan 25 '21 at 18:42
  • 1
    @NathanielRuiz: Huh? the question is about eigenvectors corresponding to distinct eigenvalues of any particular linear transformation. There is no specific linear transformation at issue. – Arturo Magidin Jan 25 '21 at 18:44
  • @ArturoMagidin Sorry, I'm just confused how "T" (which I am assuming is a linear transformation) makes $T(\alpha_1v_1 + \alpha_2v_2) = \alpha_1\lambda_1v_1 + \alpha_2\lambda_2v_2$. I just can't wrap my head around how the same linear transformation can create a $\lambda_1$ for the term with $v_1$ and a $\lambda_2$ for the term with $v_2$. – Nathaniel Ruiz Jan 25 '21 at 19:03
  • 2
    @NathanielRuiz: $v_1$ is assumed to be an eigenvector of $T$ corresponding to the eigenvalue $\lambda_1$; $v_2$ is assumed to be an eigenvector of $T$ corresponding to the eigenvalue $\lambda_2$. Do you not know the definition of "eigenvalue" and "eigenvector"? And since the $\alpha_i$ are scalars, $T(\alpha_1v_1+\alpha_2v_2) = \alpha_1T(v_1) + \alpha_2T(v_2)$. – Arturo Magidin Jan 25 '21 at 19:05
  • @ArturoMagidin OH It makes sense now, thank you! I confused myself because I did not realize that $T$ was a matrix itself. By using the word "applying" I thought T was a function, but I guess I was expecting something like the verb "multiplying" T from the left which is in fact what you did. Thank you again. – Nathaniel Ruiz Jan 25 '21 at 19:09
  • 2
    @NathanielRuiz: $T$ is not "a matrix itself". $T$ is a linear transformation. As such, it is indeed a function, and you are indeed utterly confused. You need to review what vector spaces and linear transformations are. – Arturo Magidin Jan 25 '21 at 19:10
  • @ArturoMagidin. Thank you. Could you please clarify what $T$ is and why you multiplied the original equation by $\lambda$ and not anything else? Why is it legal to multiply the original equation by $\lambda$ and then do subtraction? – Avv Feb 23 '21 at 16:46
  • 1
    @Avra: $T$ is a linear transformation. It's a function. If you have an equation, you are "allowed" to perform any operations on that equation. If you know $3x=2y+2$, you can multiply it by $5$ and get $15x = 10y+10$. Why did I multiply the equation by $\lambda$? Because that's what I need to make the argument work. If you have two equations, you can subtract one from the other. What exactly is the problem? This is basic algebra you are asking about! – Arturo Magidin Feb 23 '21 at 16:50
  • @ArturoMagidin. Thank you. It's very clear to me now. But I was wondering what made you think to multiply the original equation by $\lambda_1$ and then subtract the first equation from the second? Why is it okay to multiply the original equation by a number $x$ and the same original equation by another number $y$ and then subtract them? – Avv Feb 23 '21 at 16:55
  • 1
    @Avra: What makes it okay? Basic algebra. Basic properties of equality. If $A=B$, then $f(A)=f(B)$ for any function $f$; for example, "multiply by $\lambda_1$", or "apply the linear transformation $T$ to both sides". And if $A=B$ and $C=D$, then $A+C=B+D$, $A-C=B-D$. What makes it okay? Again, basic algebra. Basic properties of what "$=$" means. – Arturo Magidin Feb 23 '21 at 18:32
  • When multiplying by $\lambda_1$, don't we need that to be nonzero? – Upstart Feb 20 '22 at 18:40
  • @Upstart No. Work it through and see why not. – Arturo Magidin Feb 20 '22 at 19:06
  • I am not sure how we can prove this for the case $n=k+1$ by assuming it holds true for $n=k$. Can someone please help? – Kitwradr Nov 12 '22 at 02:24
  • @Kitwradr Did you go look at the duplicate? There is a proof there. – Arturo Magidin Nov 12 '22 at 02:48
  • Couldn't you have repeated eigenvalues? $\lambda_1=\lambda_2$? – JDoe2 Dec 07 '23 at 13:32
  • @JDoe2 The question says "eigenvectors from different eigenvalues". So, no. – Arturo Magidin Dec 07 '23 at 14:49
65

Alternative:

Let $j$ be maximal such that $v_1,\dots,v_j$ are independent. If the whole set were dependent, then $j<n$ and $v_{j+1}$ lies in the span of $v_1,\dots,v_j$, so there exist $c_i$, $1\leq i\leq j$, such that $\sum_{i=1}^j c_iv_i=v_{j+1}$. But by applying $T$ we also have that

$$\sum_{i=1}^j c_i\lambda_iv_i=\lambda_{j+1}v_{j+1}=\lambda_{j+1}\sum_{i=1}^j c_i v_i$$ Hence $$\sum_{i=1}^j \left(\lambda_i-\lambda_{j+1}\right) c_iv_i=0.$$ Since $v_1,\dots,v_j$ are independent, each coefficient $\left(\lambda_i-\lambda_{j+1}\right)c_i$ must be zero, and since $\lambda_i\neq \lambda_{j+1}$ for $1\leq i\leq j$, this forces $c_i=0$ for all $i$. But then $v_{j+1}=\sum_{i=1}^j c_iv_i=0$, contradicting the fact that eigenvectors are nonzero.
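
As a quick numerical sanity check of this maximal-$j$ idea, here is a small sketch; the matrix below is an arbitrary example with distinct eigenvalues:

```python
import numpy as np

# Upper-triangular matrix with distinct diagonal entries, hence distinct eigenvalues.
A = np.diag([1.0, 2.0, 3.0, 4.0]) + np.triu(np.ones((4, 4)), k=1)
eigvals, eigvecs = np.linalg.eig(A)          # columns of eigvecs are eigenvectors

# The rank of [v_1 ... v_j] increases by one at every step, so no prefix
# v_1, ..., v_j with j < n ever has v_{j+1} in its span.
for j in range(1, 5):
    print(j, np.linalg.matrix_rank(eigvecs[:, :j]))   # prints j, j
```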

Hope that helps,

Eric Naslund
  • 72,099
  • 11
    P.S. The argument uses the well-ordering principle on the naturals (by looking at the least $j$ such that $v_1,\ldots,v_{j+1}$ is dependent). Well-ordering for the naturals is equivalent to induction. – Arturo Magidin Mar 28 '11 at 01:20
  • @Arturo: Thanks! Should be fixed now. (Didn't realize I was using well ordering in that way) – Eric Naslund Mar 28 '11 at 01:30
  • 1
    No problem; I'll delete the other two comments since it's been fixed. In any case, many people prefer arguments along these lines to an explicit induction, even if they are logically equivalent (you can think of the argument you give as a proof by contradiction of the inductive step, with the base being taken for granted [or as trivial, since $v_1$ is nonzero]; cast that way, it may be clearer why the two arguments are closely connected). – Arturo Magidin Mar 28 '11 at 01:34
  • 1
    @Eric Naslund This proof seems very similar to the one given in Axler... –  Aug 31 '11 at 07:29
  • @D B Lim: What is Axler? – Eric Naslund Aug 31 '11 at 14:42
  • 2
    @Eric Naslund Sheldon Axler's Linear Algebra Done Right. –  Sep 03 '11 at 21:52
  • A little late to the party perhaps, but I just want to point out that this answer is probably more useful with the knowledge of https://math.stackexchange.com/questions/1601944/induction-and-contradiction combined with the more "common" way to prove this via induction. – WorseThanEuler Jul 09 '21 at 15:52
  • Are all the $c_i \neq 0 $ ? I am having trouble seeing the contradiction. I think the contradiction goes as follows. Since $\sum_{i=1}^j c_iv_i=v_{j+1}$ and $v_{j+1} \neq 0$, then at least one of the $c_i$ for $1\leq i\leq j$ must be nonzero else $v_{j+1} = 0$. Then we derive a contradiction from the following facts, viz. $\sum_{i=1}^j \left(\lambda_i-\lambda_{j+1}\right) c_iv_i=0$ , $c_i \neq 0 $ for some $i$, $ v_i \neq 0 $ for all $i$, and $\lambda_i\neq \lambda_{j+1} $ for $1\leq i\leq j$ – john Jul 17 '21 at 11:49
  • @EricNaslund if you could critique the previous comment. thx. – john Jul 17 '21 at 15:52
  • @john Since $v_1,\dotsc,v_j$ are linearly independent, we must have $\left(\lambda_i-\lambda_{j+1}\right)c_i = 0$ for all $i$, and since $\lambda_i \neq \lambda_{j+1}$, this reduces to $c_i = 0$ for all $i$. (And, to spell it out, that means $v_{j+1} = \sum_{i=1}^{j}c_i v_i = 0$, contradicting the fact that eigenvectors have to be non-zero.) – Prasiortle Mar 25 '24 at 13:28
18

Hey I think there's a slick way to do this without induction. Suppose that $T$ is a linear transformation of a vector space $V$ and that $v_1,\ldots,v_n \in V$ are eigenvectors of $T$ with corresponding eigenvalues $\lambda_1,\ldots,\lambda_n \in F$ ($F$ the field of scalars). We want to show that, if $\sum_{i=1}^n c_i v_i = 0$, where the coefficients $c_i$ are in $F$, then necessarily each $c_i$ is zero.

For simplicity, I will just explain why $c_1 = 0$. Consider the polynomial $p_1(x) \in F[x]$ given as $p_1(x) = (x-\lambda_2) \cdots (x-\lambda_n)$. Note that the $x-\lambda_1$ term is "missing" here. Now, since each $v_i$ is an eigenvector of $T$, we have \begin{align*} p_1(T) v_i = p_1(\lambda_i) v_i && \text{ where} && p_1(\lambda_i) = \begin{cases} 0 & \text{ if } i \neq 1 \\ p_1(\lambda_1) \neq 0 & \text{ if } i = 1 \end{cases}. \end{align*}

Thus, applying $p_1(T)$ to the sum $\sum_{i=1}^n c_i v_i = 0$, we get $$ p_1(\lambda_1) c_1 v_1 = 0 $$ which implies $c_1 = 0$, since $p_1(\lambda_1) \neq 0$ and $v_1 \neq 0$.
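
Here is a small numerical sketch of the annihilating-polynomial trick (the matrix, eigenvalues, and coefficients are invented for the example; the eigenvectors are just the standard basis vectors):

```python
import numpy as np

lams = np.array([1.0, 2.0, 3.0])
A = np.diag(lams)                # eigenvalues 1, 2, 3 with eigenvectors e1, e2, e3
I = np.eye(3)

# p1(A) = (A - 2 I)(A - 3 I): the factor (A - 1 I) is deliberately "missing".
p1_of_A = (A - lams[1] * I) @ (A - lams[2] * I)

w = 0.5 * I[:, 0] - 2.0 * I[:, 1] + 4.0 * I[:, 2]    # c1 v1 + c2 v2 + c3 v3

# Applying p1(A) wipes out the v2 and v3 components and leaves
# p1(lambda_1) * c1 * v1 = (1-2)(1-3) * 0.5 * e1 = [1, 0, 0].
print(p1_of_A @ w)
```

If the combination were $0$ to begin with, the surviving term $p_1(\lambda_1)c_1v_1$ would also have to be $0$, giving $c_1=0$.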

Mike F
  • 22,196
  • You have a typo in your answer and I can't edit it: `$p_1(x)` is missing a space after the dollar sign. – Bag of Chips Apr 30 '19 at 08:01
  • Could not understand what $x$ is in $p_1(x)$, as $x-\lambda$ suggests that it should be a scalar, but $p_1(T)$ suggests it is a transformation matrix. Please clarify. – user3779172 Mar 28 '20 at 12:33
  • @user3779172 Given a polynomial $p(x)$ and a matrix $T$, one writes $p(T)$ for the matrix obtained by replacing all the $x$s by $T$s. Note $x^0$ becomes $T^0$ which is to be interpreted as the identity matrix. For example, if $p(x) = 5x^2+3$ and $T = \begin{bmatrix}1 & 1 \\ 0 & 1 \end{bmatrix}$, then $p(T) = 5\begin{bmatrix}1 & 1 \\ 0 & 1 \end{bmatrix}^2 +3 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$. If you know some abstract algebra, then you can fit this into a more general story about polynomial rings and evaluation homomorphisms. https://planetmath.org/evaluationhomomorphism – Mike F Mar 28 '20 at 18:30
6

Let $\vec{v^1},\vec{v^2},\dots,\vec{v^n}$ be eigenvectors of an $n\times n$ matrix $A$ with pairwise distinct eigenvalues $\lambda_1,\lambda_2,\dots,\lambda_n$.

Given the $ n\times n$ matrix $P$ of the eigenvectors (with eigenvectors as the columns). $$P=\Big[\vec{v^1},\vec{v^2},\dots,\vec{v^n}\Big]$$

Given the $ n\times n$ matrix $\Lambda$ of the eigenvalues on the diagonal (zeros elsewhere): $$\Lambda = \begin{bmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{bmatrix} $$ Let $\vec{c}=(c_1,c_2,\dots,c_n)^T$

We need to show that only $c_1=c_2=\dots=c_n=0$ can satisfy the following: $$c_1\vec{v^1}+c_2\vec{v^2}+\dots+c_n\vec{v^n}= \vec{0}$$ Applying the matrix $A$ to this equation gives: $$c_1\lambda_1\vec{v^1}+c_2\lambda_2\vec{v^2}+\dots+c_n\lambda_n\vec{v^n}= \vec{0}$$ We can write this equation in the form of vectors and matrices:

$$P\Lambda \vec{c}=\vec{0}$$

But since $A$ can be diagonalised to $\Lambda$, we know $P\Lambda=AP$: $$\implies AP\vec{c}=\vec{0}$$ Since $AP\neq 0$, we have $\vec{c}=0$.
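
As a small sketch, one can check the identity $AP=P\Lambda$ numerically with an arbitrary matrix that has distinct eigenvalues; note that `np.linalg.eig` already returns a valid $P$, so this only illustrates the identity, not the independence claim itself:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])             # eigenvalues 5 and 2
lams, P = np.linalg.eig(A)             # columns of P are eigenvectors
Lam = np.diag(lams)

# A acts on each column of P by scaling it with the matching eigenvalue.
print(np.allclose(A @ P, P @ Lam))     # True
```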

  • Let me add a comment to the last step. In order for $c=0$ to be the only solution, $AP$ must be invertible; but since $A$ is already invertible because $\lambda_i \neq \lambda_j$ for all $i\neq j$, the only demand is for $P$ to be invertible $\Rightarrow$ independent eigenvectors. – Thoth Sep 13 '16 at 16:36
  • why is $P \ne A$? – twisted manifold Jan 30 '20 at 00:06
  • 1
    I think we cannot be sure that $AP$ is invertible, as we don't know whether $P$ has linearly independent columns. – user3779172 Mar 28 '20 at 12:23
3

Suppose $v_k$ are eigenvectors of $A$, that is, $v_k\neq0$ and $Av_k=\lambda_k v_k$, and suppose $\sum_k c_k v_k=0$. We only include in this sum eigenvectors corresponding to distinct eigenvalues, so that $\lambda_i\neq\lambda_j$ for $i\neq j$. We want to prove that this implies $c_k=0$ for all $k$.

Define the operators $$A^{(i)}\equiv \prod_{k\neq i}(A-\lambda_k I).$$ In particular, note that $A^{(i)}v_k=\delta_{ik} d_i v_k$, where $d_i\equiv \prod_{k\neq i}(\lambda_i-\lambda_k)\neq0$, and $\delta_{ik}$ is the Kronecker delta. Then $$0 = A^{(i)}(0) = A^{(i)}\left(\sum_k c_k v_k\right) =d_i c_i v_i.$$ It follows that $c_i=0$ for all $i$ (because, again, $d_i\neq0$ and $v_i\neq0$), and thus the vectors $\{v_i\}_i$ are linearly independent.
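
On a concrete (diagonal) example, one can also see that $A^{(i)}/d_i$ acts as the projection onto the $i$-th eigenspace; the matrix below is a made-up sketch, with the standard basis vectors as eigenvectors:

```python
import numpy as np

lams = np.array([1.0, 2.0, 5.0])        # distinct eigenvalues
A = np.diag(lams)                       # eigenvectors: standard basis vectors
n = len(lams)
I = np.eye(n)

for i in range(n):
    # A^{(i)} = product over k != i of (A - lambda_k I)
    Ai = I.copy()
    for k in range(n):
        if k != i:
            Ai = Ai @ (A - lams[k] * I)
    d_i = np.prod([lams[i] - lams[k] for k in range(n) if k != i])
    # A^{(i)} / d_i equals the rank-one projection e_i e_i^T onto span{v_i}.
    print(i, np.allclose(Ai / d_i, np.outer(I[:, i], I[:, i])))   # True
```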

glS
  • 6,818
  • 1
    This is a good argument that does not explicitly use induction. The punchline could have been delivered a bit better though; I would have mentioned explicitly the assumption $v_i\neq0$ (because $v_i$ is an eigenvector), and started after the display by "Then $0=A^{(i)}(0)=A^{(i)}(\sum\ldots)$" so that it is clear that from $0=d_ic_iv_i$ one can indeed deduce $c_i=0$. It may be worth noting that, restricted to the sum of eigenspaces in question, the operator $A^{(i)}/d_i$ is the projection on the eigenspace for $\lambda_i$, which projections can only exist if the sum is direct. – Marc van Leeuwen Jul 03 '21 at 07:03
  • @MarcvanLeeuwen thank you for the suggestions. I hopefully made the answer clearer – glS Jul 15 '21 at 11:22
2

Let $A$ be an $n \times n$ matrix with pairwise different eigenvalues $\lambda_1, \ldots, \lambda_n$ and corresponding eigenvectors $v_1,\ldots,v_n$. Put $$P=[v_1\;\cdots\;v_n],\qquad \Lambda=\left[\begin{array}{ccc} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n\end{array}\right],\qquad V=\left[\begin{array}{cccc} 1 & \lambda_1 & \cdots & \lambda_1^{n-1} \\ \vdots & \vdots & \ddots &\vdots \\ 1 & \lambda_n & \cdots & \lambda_n^{n-1}\end{array}\right]$$ $$C=\left[\begin{array}{c} c_1 \\ \vdots \\ c_n\end{array}\right],\qquad C'=\left[\begin{array}{ccc} c_1 & & \\ & \ddots & \\ & & c_n\end{array}\right]$$

Suppose $PC=c_1v_1+\ldots+c_nv_n=O_{n,1}$. Our aim is to show that $PC'=O_{n,n}$ (equivalently $c_iv_i=O_{n,1}$ for $i=1,\ldots,n$, which implies $c_i=0$, because $v_i\neq O_{n,1}$ for all $i$).

It is clear that $AP=[Av_1\;\cdots\;Av_n] = [\lambda_1v_1\;\cdots\;\lambda_nv_n] = P\Lambda$. Then $$O_{n,1} = AO_{n,1} = A(PC) = (AP)C = (P\Lambda)C = P(\Lambda C),$$ so $P(\Lambda C)=O_{n,1}$ and inductively $P(\Lambda^kC)=O_{n,1}$ for $k=0,1,2,\ldots$.

The matrix $V$ is a Vandermonde matrix, and it is invertible because $\det V = \prod_{1\le i < j \le n}(\lambda_j-\lambda_i)\neq0$.

We have $$PC' = PC'(VV^{-1}) = P(C'V)V^{-1} = P\left[\begin{array}{cccc} c_1 & \lambda_1c_1 & \cdots & \lambda_1^{n-1}c_1 \\ \vdots & \vdots & \ddots &\vdots \\ c_n & \lambda_nc_n & \cdots & \lambda_n^{n-1}c_n \end{array}\right]V^{-1} \\ = P[C\;\;\Lambda C\;\; \cdots\;\; \Lambda^{n-1}C]V^{-1} = [PC\;\;P\Lambda C\;\; \cdots\;\; P\Lambda^{n-1}C]V^{-1} = O_{n,n}V^{-1} = O_{n,n},$$ completing the proof.
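
A quick numerical check of the Vandermonde determinant formula used above (a sketch with made-up $\lambda_i$):

```python
import numpy as np
from itertools import combinations

lams = np.array([1.0, 2.0, 4.0, 7.0])        # pairwise different eigenvalues
V = np.vander(lams, increasing=True)         # row i is (1, lam_i, lam_i^2, lam_i^3)

# det V = product over i < j of (lam_j - lam_i), which is nonzero here,
# so V is invertible.
det_formula = np.prod([lams[j] - lams[i] for i, j in combinations(range(4), 2)])
print(np.isclose(np.linalg.det(V), det_formula))   # True
```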

1

I would like to add a bit of intuition.

Suppose $v_1$, $v_2$, $w$ are eigenvectors of $M$ with different eigenvalues, and

$$w = \alpha_1 v_1 + \alpha_2 v_2$$

(Figure: $w$ as a composition of $v_1$ and $v_2$.)

Think of $M$ as a transformation. For $w$ to come out in the same direction, each of its components needs to be scaled proportionally (by the same amount). That is, $v_1$ and $v_2$ would need to have the same eigenvalue, which contradicts the assumption.

0

Consider the matrix $A$ with two distinct eigenvalues $\lambda_1$ and $\lambda_2$. First note that the eigenvectors cannot be the same: suppose $Ax = \lambda_1x$ and $Ax = \lambda_2x$ for some non-zero vector $x$. Then $(\lambda_1- \lambda_2)x=\bf{0}$. Since $\lambda_1$ and $\lambda_2$ are scalars, this can only happen if $x = \bf{0}$ (excluded, since $x\neq\bf{0}$) or $\lambda_1 =\lambda_2$.

Thus we can safely assume that, given two eigenvalue-eigenvector pairs, say $(\lambda_1, {\bf x_1})$ and $(\lambda_2, {\bf x_2})$, there cannot exist another pair $(\lambda_3, {\bf x_3})$ such that ${\bf x_3} = k{\bf x_1}$ or ${\bf x_3} = k{\bf x_2}$ for any scalar $k$. Now suppose ${\bf x_3} = k_1{\bf x_1}+k_2{\bf x_2}$ for some scalars $k_1,k_2$.

Now, $$ A{\bf x_3}=\lambda_3{\bf x_3} \\ $$ $$ {\bf x_3} = k_1{\bf x_1} + k_2{\bf x_2} \:\:\: \dots(1)\\ $$ $$ \Rightarrow A{\bf x_3}=\lambda_3k_1{\bf x_1} + \lambda_3k_2{\bf x_2}\\$$ but ${\bf x_1}=\frac{1}{\lambda_1}A{\bf x_1}$ and ${\bf x_2}=\frac{1}{\lambda_2}A{\bf x_2}$. Substituting into the above equation we get $$A{\bf x_3}=\frac{\lambda_3k_1}{\lambda_1}A{\bf x_1} + \frac{\lambda_3k_2}{\lambda_2}A{\bf x_2} \\$$

$$\Rightarrow {\bf x_3}=\frac{\lambda_3k_1}{\lambda_1}{\bf x_1} + \frac{\lambda_3k_2}{\lambda_2}{\bf x_2} \:\:\: \dots (2)$$

From equations $(1)$ and $(2)$, comparing coefficients gives $\lambda_3 = \lambda_1$ and $\lambda_3 = \lambda_2$, which implies $\lambda_1 = \lambda_2 = \lambda_3$; but according to our assumption they were all distinct. (Contradiction!)

NOTE: This argument generalizes in the exact same fashion to any number of eigenvectors. Also, it is clear that if ${\bf x_3}$ cannot be a linear combination of two vectors, then it cannot be a linear combination of any $n >2$ vectors (try to prove it!).