17

The characteristic polynomial of a matrix $A$ is defined as:

$$\chi(A) = \det(xI-A) = \sum_{i=0}^n (-1)^i\cdot \operatorname{tr}^{(i)}(A) \cdot x^{n-i}$$

The trace is the sum of the eigenvalues of a matrix, $\operatorname{tr}(A) = \operatorname{tr}^{(1)}(A)$. It is also the sum of the diagonal entries:

$$\operatorname{tr}(A) = \sum_{i=1}^n A_{ii}$$

The sum of the products of pairs of eigenvalues should likewise be the next trace, $\operatorname{tr}^{(2)}(A)$.

Is this formula valid?

$$\operatorname{tr}^{(2)}(A) = \sum_{1 \leq i < j \leq n } \left(A_{ii} A_{jj} - A_{ij} A_{ji}\right)$$

What about this one?

$$\operatorname{tr}^{(2)}(A) = \tfrac12(\operatorname{tr}(A)^2 - \operatorname{tr}(A^2))$$
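Numerically, both candidates seem to hold: here is a quick sanity check (a numpy sketch; `np.poly` gives the coefficients of $\det(xI-A)$, whose $x^{n-2}$ coefficient is $(-1)^2\operatorname{tr}^{(2)}(A)$).

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
n = A.shape[0]

# First formula: sum of 2x2 principal minors.
tr2_minors = sum(A[i, i] * A[j, j] - A[i, j] * A[j, i]
                 for i, j in combinations(range(n), 2))

# Second formula: (tr(A)^2 - tr(A^2)) / 2.
tr2_power = (np.trace(A) ** 2 - np.trace(A @ A)) / 2

# Reference: coefficient of x^(n-2) in det(xI - A).
print(tr2_minors, tr2_power, np.poly(A)[2])  # all three agree
```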

Are there corresponding formulas for the next one, $\operatorname{tr}^{(3)}$?

user26857
  • 52,094
Jack Schmidt
  • 55,589

4 Answers

26

The first guess is correct and follows directly from the permutation expansion of the determinant. The second guess is also correct. The generalization is

$$\text{tr}^{(n)}(A) = \text{tr } \Lambda^n(A) = \frac{1}{n!} \sum_{\pi \in S_n} \text{sgn}(\pi) \text{tr}_{\pi}(A)$$

where $\Lambda^n(A)$ is the action of $A$ on the exterior power $\Lambda^n$, $\text{tr}_n(A) = \text{tr}(A^n)$ and, for a permutation $\pi$ with cycle type $(\lambda_1, ... \lambda_k)$, we define $\text{tr}_{\pi} = \text{tr}_{\lambda_1} ... \text{tr}_{\lambda_k}$.

For example, $S_3$ has one element with cycle type $(1, 1, 1)$, three elements with cycle type $(1, 2)$, and two elements with cycle type $(3)$. It follows that

$$\text{tr}^{(3)}(A) = \frac{1}{6} \left( \text{tr}(A)^3 - 3 \text{tr}(A^2) \text{tr}(A) + 2 \text{tr}(A^3) \right).$$

This identity is really an identity of symmetric functions expressing the elementary symmetric functions in terms of the power symmetric functions. It can be deduced from Newton's identities or the exponential formula. You can find a more thorough exposition in Stanley's Enumerative Combinatorics, Vol. II, Chapter 7.
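As a concrete sanity check of the $n = 3$ case, here is a small numpy sketch (`np.poly` returns the coefficients of $\det(xI-A)$; the $x^{n-3}$ coefficient is $-\text{tr}^{(3)}(A)$):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))

t1 = np.trace(A)
t2 = np.trace(A @ A)
t3 = np.trace(A @ A @ A)

# One term per cycle type of S_3: (1,1,1), three of (1,2), two of (3).
tr3 = (t1**3 - 3 * t2 * t1 + 2 * t3) / 6

print(tr3, -np.poly(A)[3])  # the two values agree
```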


Some exposition is in order regarding what this has to do with exterior powers. Suppose that $A$ is diagonalizable with eigenvalues $\lambda_1, ... \lambda_n$ and eigenvectors $e_1, ... e_n$. Then of course $\text{tr}(A) = \sum \lambda_i$. More generally $\text{tr}^{(k)}(A) = \sum_{i_1 < ... < i_k} \lambda_{i_1} ... \lambda_{i_k}$ is the $k^{th}$ elementary symmetric polynomial in the eigenvalues.

Now, if $A$ acts on a vector space $V$, then the exterior power $\Lambda^k(V)$ inherits an action of $A$ by functoriality. In this basis, it is particularly easy to describe: it is given by

$$\Lambda^k(A)(e_{i_1} \wedge ... \wedge e_{i_k}) = \lambda_{i_1} ... \lambda_{i_k} e_{i_1} \wedge ... \wedge e_{i_k}.$$

Hence the wedges of all possible ordered $k$-tuples of distinct eigenvectors of $A$ form an eigenbasis for $\Lambda^k(A)$ acting on $\Lambda^k(V)$. It follows that $\text{tr } \Lambda^k(A) = \text{tr}^{(k)}(A)$ for all diagonalizable $A$, hence for all $A$ by continuity. Of course you can ignore all of this if you want to, but I thought I would mention it because it is a precise sense in which the other coefficients of the characteristic polynomial encode "higher traces."
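If it helps, this is also easy to see numerically: a minimal sketch (assuming numpy) comparing the elementary symmetric polynomials in the eigenvalues with the coefficients of the characteristic polynomial:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
lam = np.linalg.eigvals(A)     # eigenvalues, possibly complex
coeffs = np.poly(A)            # det(xI - A), leading coefficient first

for k in range(1, 6):
    # e_k(lambda): sum over k-element subsets of the eigenvalues.
    ek = sum(np.prod(c) for c in combinations(lam, k))
    print(k, ek.real, (-1) ** k * coeffs[k])  # equal up to rounding
```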

Qiaochu Yuan
  • 419,620
  • What is Lambda^n(A) and how is the formula on the right related to the one I wrote? Are you missing a sign(pi) or something? – Jack Schmidt Feb 26 '11 at 21:32
  • Yes, I was missing a sign. Lambda^n(A) is the action of A on the nth exterior power and the trace of the corresponding operator is precisely your tr^n. – Qiaochu Yuan Feb 26 '11 at 21:34
  • Your answer is not very helpful to me. It just names a bunch of things and doesn't describe how they fit together at all. – Jack Schmidt Feb 26 '11 at 21:38
  • 1
    @Jack: I think that's a little harsh. What would you like me to elaborate on? I gave you the completely general formula like you asked for. – Qiaochu Yuan Feb 26 '11 at 21:38
  • I did not ask for a completely general formula. What does anything you have said have to do with my question? – Jack Schmidt Feb 26 '11 at 21:41
  • @Jack: I really do not understand this reaction. The formula I gave is the correct generalization of your formula for tr^{(2)} to tr^{(n)} for all n. I specialized it to n = 3 as an example. – Qiaochu Yuan Feb 26 '11 at 21:47
  • What is all of this about this coming up before? Why do you mention it? Is it supposed to help me in some way? The links appear completely irrelevant to me. Stanley's EC is hundreds of pages long. Can you delete all of the irrelevant parts so I can concentrate on the parts that are actually supposed to be helpful? – Jack Schmidt Feb 26 '11 at 21:50
  • Why is Lambda related to the characteristic polynomial? – Jack Schmidt Feb 26 '11 at 21:51
  • 8
    @Jack: I have edited the answer and added an explanation. I hope it is now to your liking, but I really do not understand why you are being so rude. – Qiaochu Yuan Feb 26 '11 at 21:59
  • This version is making quite a bit more sense. It gives a concrete answer to two of my questions and does not insult either the asker or the asker's colleagues. – Jack Schmidt Feb 26 '11 at 22:15
  • In your last paragraph is "ordered k-tuples" being used in the non-standard sense of [e1,e2,e3] being allowed means e1 < e2 < e3, rather than both [e1,e2,e3] and [e1,e3,e2] being allowed? – Jack Schmidt Feb 26 '11 at 22:23
  • 1
    @Jack: in the exterior algebra one of those is the negative of the other, so it doesn't matter too much, but I meant that the indices should be increasing (so we get a basis as opposed to a spanning set). I am sorry if you interpreted my original answer as being insulting; it is just mildly frustrating giving the same answer to questions which are not identical enough to be duplicates... – Qiaochu Yuan Feb 26 '11 at 22:26
  • Dear @Jack Schmidt, there is no space for such behaviour on this site. As you are a rather expert user, you should be aware that asking for clarifications is enough. Best, Andrea – Andrea Marino Jan 06 '21 at 14:37
8

I thought I might add a bit on the background of what we are talking about.

Consider a vector space $V$ with a linear operator $A$, and let's choose a basis $e_i$. Let $A\otimes A$ be the linear operator on $V\otimes V$ given by $A\otimes A (e_i\otimes e_j)=Ae_i\otimes Ae_j$. It is easy to see that $\operatorname{tr}(A\otimes A)=\operatorname{tr}(A)^2$. Let $\theta$ be the automorphism of $V\otimes V$ given by $\theta(e_i\otimes e_j)=e_j\otimes e_i$. We can define $Sym^2(V)$, the symmetric square, to be the set of all elements in $V\otimes V$ which are annihilated by $\theta -Id$, and $Alt^2(V)$, the alternating square, to be the set of all elements that are annihilated by $\theta + Id$. Then $V\otimes V$ decomposes into $Sym^2(V)\oplus Alt^2 (V)$, and what you are then looking for is the trace of $A\otimes A$ when restricted to the alternating square. (Note: restriction to an invariant subspace gives a linear operator, so we can take its trace.)

Why is this trace the sum of $\lambda_i \lambda_j$? First, the set of all $e_i\otimes e_j+e_j\otimes e_i$ is a basis for $Sym^2(V)$ and the set of all $e_i\otimes e_j - e_j\otimes e_i$ (with $i<j$) is a basis for $Alt^2(V)$. To see this, write a vector in either subspace in the basis for $V\otimes V$ and use the fact that it is annihilated by either $\theta +Id$ or $\theta -Id$. Applying these operators gives an equation that allows us to write the original vector in one of the bases mentioned above.
Now let's do the case where $A$ is diagonalizable, and the $e_i$ above form an eigenbasis. Then $$A\otimes A (e_i\otimes e_j - e_j\otimes e_i)= \lambda_i\lambda_j(e_i\otimes e_j - e_j\otimes e_i)$$ Hence the $\lambda_i\lambda_j$ are exactly the eigenvalues of $A\otimes A$ when restricted to $Alt^2(V)$.
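Here is a small numpy sketch of this $k=2$ construction: build $A\otimes A$ and the swap $\theta$, form the projector onto the subspace annihilated by $\theta+Id$, and take the trace of the restriction. Since $A\otimes A$ commutes with the projector $P$, that trace is just $\operatorname{tr}((A\otimes A)P)$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n))

AA = np.kron(A, A)  # A (x) A acting on V (x) V; e_i (x) e_j has index i*n + j

# theta swaps e_i (x) e_j  ->  e_j (x) e_i.
theta = np.zeros((n * n, n * n))
for i in range(n):
    for j in range(n):
        theta[j * n + i, i * n + j] = 1

P = (np.eye(n * n) - theta) / 2  # projector onto Alt^2(V)

# Since A (x) A commutes with theta, the trace of its restriction
# to Alt^2(V) equals tr((A (x) A) P).
print(np.trace(AA @ P), (np.trace(A) ** 2 - np.trace(A @ A)) / 2)  # agree
```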

In more generality, the sum of the products of $n$ distinct eigenvalues will be the trace of $A\otimes \cdots \otimes A$ restricted to the intersection of all subspaces annihilated by $\theta_\pi +Id$, where $\theta_\pi$ is the automorphism given by an odd permutation $\pi$. This subspace is sometimes called the exterior power.

Now, how do we actually calculate a formula for these traces based on the original matrix $A$? That is supplied in Qiaochu's answer.

Eric Naslund
  • 72,099
  • Why am I looking for the trace of A⊗A restricted to Alt^2(V)? All previous is good, and the following seems like it will be good once I understand this question. – Jack Schmidt Feb 26 '11 at 21:55
  • @Jack: because that is tr^{(2)}. – Qiaochu Yuan Feb 26 '11 at 22:02
  • Thanks! And for the higher tensor powers, we only get one subspace, right? Though V⊗V⊗V might break up into three or whatever pieces, we only want the piece called the 3rd exterior power, right? – Jack Schmidt Feb 26 '11 at 22:13
  • Ah, I see the word "intersection" now. Yes, so we get one subspace. Making sure θ+I annihilates it makes sure we never get any repeated eigenvalues in the product, I guess. – Jack Schmidt Feb 26 '11 at 22:19
  • @Jack: Ya, I put intersection a few edits ago, but probably after the first time you read it. There is indeed only one space, and it is annihilated by $\theta_T+Id$ for each transposition $T$. This gives the familiar skew symmetry $ x_{\sigma(1)}\wedge x_{\sigma(2)}\wedge\dots\wedge x_{\sigma(k)} = \operatorname{sgn}(\sigma)x_1\wedge x_2\wedge\dots \wedge x_k$. – Eric Naslund Feb 26 '11 at 22:28
6

We want a "simple" formula for the coefficients of the characteristic polynomial in terms of the entries of the matrix, at least for the top few coefficients.

The characteristic polynomial can be written in terms of the eigenvalues:

$$ \chi(A) = (x-\lambda_1)(x-\lambda_2)\cdots(x-\lambda_n) $$ $$= x^n - (\lambda_1 + \dots + \lambda_n) x^{n-1} + (\lambda_1 \lambda_2 + \lambda_1 \lambda_3 + \dots + \lambda_{n-1} \lambda_n) x^{n-2} - \dots + (-1)^n \lambda_1 \lambda_2 \cdots \lambda_n $$

Each coefficient $\operatorname{tr}^{(k)}$ is just ($\pm$) the sum of the products of all $k$-sets of the eigenvalues $\lambda_i$. We'd like to use the regular trace to write this sum down, and so we try to find some crazy matrix $\Lambda^{(k)}(A)$ whose eigenvalues are exactly those products!

For $k=1$ this is easy: just use $A$ itself. For $k=0$, this is degenerate: just use $[1]$. For $k=n$, there is a sneaky trick: use $[\det(A)]$. That is basically cheating, but it turns out to be part of a larger pattern.

To define $B = \Lambda^{(k)}(A)$ we consider all $k$-subsets $K$ of $\{1,2,\dots,n\}$ and use those to index the rows and columns of $B$. For two such $k$-subsets $K,L$ define the $(K,L)$th entry of $B$ to be the determinant of the submatrix of $A$ formed by taking only the rows from $K$ and the columns from $L$. In other words, $B$ is the matrix formed of the $k$-minors of $A$.

For instance if $K=\{i,j\}$ and $L=\{i,j\}$, then the $(K,L)$th entry is $A_{ii}A_{jj}-A_{ij}A_{ji}$. The trace of this matrix is just the sum of the $(K,K)$th entries as $K$ varies over all $k$-subsets of $\{1,2,\dots,n\}$. For $k=2$, this is exactly the first formula given:

$$\operatorname{tr}^{(2)}(A) = \operatorname{tr}(B) = \sum_{1 \leq i < j \leq n} \left(A_{ii}A_{jj}-A_{ij}A_{ji}\right)$$

Now for $k=3$, the formula is a little bit awful, but primarily because the formula for a $3\times 3$ determinant is a little awful.

$$\operatorname{tr}^{(3)}(A) = \operatorname{tr}(B) = \sum_{1\leq i<j<k \leq n} \left|\begin{array}{rrr}A_{ii} & A_{ij} & A_{ik} \\ A_{ji} & A_{jj} & A_{jk} \\ A_{ki} & A_{kj} & A_{kk} \end{array}\right|$$ $$= \sum_{1 \leq i < j < k \leq n} \left(A_{ii}A_{jj}A_{kk}+A_{ij}A_{jk}A_{ki}+A_{ik}A_{ji}A_{kj}-A_{ik}A_{jj}A_{ki}-A_{ij}A_{ji}A_{kk}-A_{ii}A_{jk}A_{kj}\right)$$
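Here is a direct (and very inefficient) numpy sketch of $B = \Lambda^{(k)}(A)$ as the matrix of $k$-minors, with $\operatorname{tr}(B)$ checked against the characteristic polynomial for every $k$:

```python
import numpy as np
from itertools import combinations

def exterior_power(A, k):
    """The matrix of k-minors of A, rows and columns indexed by k-subsets."""
    n = A.shape[0]
    subsets = list(combinations(range(n), k))
    B = np.empty((len(subsets), len(subsets)))
    for r, K in enumerate(subsets):
        for c, L in enumerate(subsets):
            B[r, c] = np.linalg.det(A[np.ix_(K, L)])  # rows K, columns L
    return B

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
coeffs = np.poly(A)  # coefficients of det(xI - A)

for k in range(1, 6):
    # Coefficient of x^(n-k) is (-1)^k tr^(k)(A).
    print(k, np.trace(exterior_power(A, k)), (-1) ** k * coeffs[k])
```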


Now, to relate this to Eric Naslund's answer:

The Kronecker product of two matrices, $M\otimes N$, is another related construction. Its rows are indexed by pairs $(i_1,i_2)$ of row indices ($i_1$ from $M$, and $i_2$ from $N$), its columns are indexed by pairs $(j_1,j_2)$ of column indices ($j_1$ from $M$, and $j_2$ from $N$), and the $((i_1,i_2),(j_1,j_2))$th entry is just the product of the $(i_1,j_1)$st entry of the first matrix with the $(i_2,j_2)$nd entry of the second. In other words, one gets a block matrix where the first matrix decides a common multiplier for the $(i_1,j_1)$st block, and the second matrix provides that block.

For instance

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \otimes \begin{pmatrix} 5 & 6 & 7 \\ 8 & 9 & 10 \\ 11 & 12 & 13 \end{pmatrix} = \left(\begin{array}{rrr|rrr}0&0&0&5&6&7\\0&0&0&8&9&10\\0&0&0&11&12&13\\\hline -5 & -6 & -7 & 0&0&0 \\ -8 & -9 & -10 & 0&0&0 \\ -11&-12&-13&0&0&0 \end{array}\right) $$

The Kronecker product has the nice feature that its eigenvalues are the products of eigenvalues, one from the first matrix and one from the second. In particular, for $A\otimes A$, one gets all products of eigenvalues, even $\lambda_1\lambda_1$ and both $\lambda_1\lambda_2$ and $\lambda_2\lambda_1$.
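A quick numpy check of this eigenvalue property (comparing the two multisets after sorting; equality is up to rounding):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 3))

lam = np.linalg.eigvals(A)
products = np.sort_complex((lam[:, None] * lam[None, :]).ravel())
kron_eigs = np.sort_complex(np.linalg.eigvals(np.kron(A, A)))

# All 9 products lambda_i * lambda_j appear, repeats included.
print(np.allclose(products, kron_eigs))  # True
```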

The Kronecker product of the two matrices acts on $V\otimes W$, the vector space whose basis consists of pairs of basis vectors, one from $V$ and one from $W$, where $M$ acts on $V$ and $N$ acts on $W$.

This vector space $V\otimes V$ is too large: it gives not only the products of 2-sets of eigenvalues, but also any ordered 2-tuple, including $(1,1)$ and both $(1,2)$ and $(2,1)$. However, there is a nice subspace called the exterior power which is the intersection of the kernels of all those coordinate switchers mentioned by Eric Naslund. The matrix of $A$ acting on the $k$th exterior power $\Lambda^{(k)}(V)$ is just the matrix of $k$-minors, $\Lambda^{(k)}(A)$.

The eigenvalues divide up nicely in the second tensor power: the subspace spanned by the $x\otimes x$ is called the symmetric power $Sym^2(V)$ and it gets one copy of each of the products $\lambda_i\lambda_j$ for $i \leq j$, including all of the repeats $\lambda_i\lambda_i$, so it has dimension $n(n+1)/2$. The space spanned by all $x\otimes y - y\otimes x$ is called the exterior power $Alt^2(V)$ and it gets the leftover eigenvalues, one copy each of the $\lambda_i\lambda_j$ for $i < j$, so it has dimension $n(n-1)/2$.

For higher tensor powers the eigenvalues still have to be divided up, and one still has spaces $Sym^k(V)$ and $Alt^k(V)=\Lambda^{(k)}(V)$, and these get their corresponding eigenvalues: all products of non-decreasing $k$-tuples of eigenvalues $\lambda_i\cdots\lambda_j$ with $1\leq i\leq\dots\leq j\leq n$ for $Sym^k(V)$, and all products of increasing $k$-tuples (or just plain $k$-sets) of eigenvalues for $Alt^k(V)$. Weirdly though, that doesn't add up to the total dimension, and there are still other subspaces left! These are called Schur functors and apparently don't have too direct a relation to this question, but you can see they have to exist by looking at the third tensor power of a three dimensional space (a vector space of dimension $3\times 3\times 3=27$): $Sym^3$ has $(1,1,1)$, $(1,1,2)$, $(1,1,3)$, $(1,2,2)$, $(1,2,3)$, $(1,3,3)$, $(2,2,2)$, $(2,2,3)$, $(2,3,3)$, $(3,3,3)$ for a total of 10. $Alt^3$ has only $(1,2,3)$ for a total of 1. Where did the other 16 dimensions go? For instance, who got $\lambda_3\lambda_2\lambda_1$?
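The dimension count itself is easy to verify with `math.comb`: non-decreasing $k$-tuples from $\{1,\dots,n\}$ number $\binom{n+k-1}{k}$ and strictly increasing ones $\binom{n}{k}$.

```python
from math import comb

n, k = 3, 3
total = n ** k               # dim of the k-th tensor power: 27
sym = comb(n + k - 1, k)     # non-decreasing k-tuples: 10
alt = comb(n, k)             # strictly increasing k-tuples: 1
print(total - sym - alt)     # 16 dimensions left for the Schur functors
```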

Jack Schmidt
  • 55,589
4

Regarding your last question on $\text{tr}^{(3)}$, there is a kind of recursive definition for all $\text{tr}^{(k)}$ given by: $$ \text{tr}^{(k)}=-(1/k)\left(\sum_{j=1}^k (-1)^{j} \text{Tr}(A^j)\text{tr}^{(k-j)}\right). $$
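A minimal numpy sketch of this recursion, starting from $\text{tr}^{(0)}=1$ and checked against the coefficients of the characteristic polynomial:

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((5, 5))
n = A.shape[0]

# Power traces Tr(A^j) for j = 0..n (index 0 is unused).
p = [np.trace(np.linalg.matrix_power(A, j)) for j in range(n + 1)]

tr = [1.0]  # tr^(0) = 1
for k in range(1, n + 1):
    s = sum((-1) ** j * p[j] * tr[k - j] for j in range(1, k + 1))
    tr.append(-s / k)

coeffs = np.poly(A)  # coefficient of x^(n-k) is (-1)^k tr^(k)(A)
print([round(t, 6) for t in tr])
print([round((-1) ** k * c, 6) for k, c in enumerate(coeffs)])  # same lists
```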

draks ...
  • 18,449
  • Here by $tr^{(k)}$ do you mean $Tr[Sym^k(A)]$? At least in the case of $\Lambda^k(g)$ I think the identity which holds is exactly the same as above except that the overall "-" is not there and $(-1)^j$ is replaced by $(-1)^{j-1}$... right? – Student Jul 18 '12 at 20:52
  • @Anirbit Is anything wrong with my answer? Since I'm not a mathematician, I'm not 100% sure what $tr[Sym^k(A)]$ means. Maybe you can give me a better definition than the ones given in the other answers. By "above", you mean the question or one of the answers? In fact, this answer was an excerpt of this answer. – draks ... Jul 19 '12 at 17:31
  • The answer is ok. In the most popular answer to http://math.stackexchange.com/questions/137951/symmetric-and-exterior-power-of-representation there is an explanation on $\Lambda^k(g)$ and $Sym^k(g)$. – emiliocba Jan 07 '17 at 00:40