4

I have a matrix $A\in \mathbb{R}^{n\times n}$. If $\lambda$ is an eigenvalue of $A$ with algebraic multiplicity 1, how can I find the derivative of $\lambda$ with respect to the entries of $A$?


Let's try this:

$F(A,\lambda)=\det(A-\lambda I)=0$

$\frac{\partial F}{\partial a_{ij}}=\operatorname{adj}(A-\lambda I)_{ji}+\frac{\partial F}{\partial \lambda}\cdot \frac{\partial \lambda}{\partial a_{ij}}=0$

The last factor, $\frac{\partial \lambda}{\partial a_{ij}}$, is what I need to find, but I don't see how to get it.

Averroes2
  • 1,147

2 Answers

5

Jacobi's formula for the derivative of the determinant is

$$\frac{\mathrm{d}}{\mathrm{d} t}\mathrm{det}\left(A(t)\right)=\mathrm{tr}\left(\mathrm{adj}(A(t))\frac{\mathrm{d}}{\mathrm{d} t}A(t)\right)$$

where $\mathrm{adj}(A)$ is the adjugate of $A$. The quantity $\mathrm{det}(A-\lambda I)$ depends on the entry $A_{ij}$ both via $A$ and via $\lambda$. So by the chain rule we have

$$\frac{\mathrm{d}}{\mathrm{d} A_{ij}}\mathrm{det}(A-\lambda I)=\frac{\partial}{\partial A_{ij}}\mathrm{det}(A-\lambda I)+\frac{\mathrm{d}\lambda}{\mathrm{d}A_{ij}}\frac{\partial}{\partial \lambda}\mathrm{det}(A-\lambda I)$$ $$=\mathrm{tr}\left(\mathrm{adj}(A-\lambda I)E_{ij}\right)+\frac{\mathrm{d}\lambda}{\mathrm{d}A_{ij}}\mathrm{tr}\left(\mathrm{adj}(A-\lambda I)(-I)\right)$$ $$=\mathrm{adj}(A-\lambda I)^T_{ij}-\frac{\mathrm{d}\lambda}{\mathrm{d}A_{ij}}\mathrm{tr}\left(\mathrm{adj}(A-\lambda I)\right)$$

(where $E_{ij}$ is the matrix with a $1$ in the $(i,j)$ position, and $0$s elsewhere). But since $\lambda$ remains an eigenvalue as $A_{ij}$ varies, $\mathrm{det}(A-\lambda I)$ is identically zero, and so its total derivative $\frac{\mathrm{d}}{\mathrm{d} A_{ij}}\mathrm{det}(A-\lambda I)$ is zero. So

$$\mathrm{adj}(A-\lambda I)^T_{ij}-\frac{\mathrm{d}\lambda}{\mathrm{d}A_{ij}}\mathrm{tr}\left(\mathrm{adj}(A-\lambda I)\right)=0$$

and hence

$$\frac{\mathrm{d}\lambda}{\mathrm{d}A_{ij}}=\frac{\mathrm{adj}(A-\lambda I)^T_{ij}}{\mathrm{tr}\left(\mathrm{adj}(A-\lambda I)\right)}$$
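The formula can be checked numerically. The following is only a minimal sketch in plain Python: the helper names (`minor`, `det`, `adj`, `shifted`, `refine_eigenvalue`, `eigenvalue_gradient`) and the $3\times 3$ test matrix are illustrative assumptions, not part of the derivation. The eigenvalue is located by Newton's method on $\det(A-\lambda I)$, which is a choice made here for self-containedness, and the result is compared with central finite differences.

```python
def minor(M, i, j):
    """Copy of M with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]


def det(M):
    """Determinant by Laplace expansion along row 0 (only sensible for small n)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))


def adj(M):
    """Adjugate of M (n >= 2): entry (i, j) is the (j, i) cofactor."""
    n = len(M)
    return [[(-1) ** (i + j) * det(minor(M, j, i)) for j in range(n)]
            for i in range(n)]


def shifted(A, lam):
    """Return A - lam * I."""
    return [[a - (lam if i == j else 0.0) for j, a in enumerate(row)]
            for i, row in enumerate(A)]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def refine_eigenvalue(A, lam, steps=50):
    """Newton iteration on f(lam) = det(A - lam I), using f' = -tr(adj(A - lam I))."""
    for _ in range(steps):
        B = shifted(A, lam)
        lam += det(B) / trace(adj(B))
    return lam


def eigenvalue_gradient(A, lam):
    """G[i][j] = adj(A - lam I)^T_{ij} / tr(adj(A - lam I))."""
    n = len(A)
    C = adj(shifted(A, lam))
    t = trace(C)
    return [[C[j][i] / t for j in range(n)] for i in range(n)]


def finite_difference(A, lam, i, j, h=1e-6):
    """Central difference of the tracked eigenvalue with respect to A[i][j]."""
    plus = [row[:] for row in A]
    minus = [row[:] for row in A]
    plus[i][j] += h
    minus[i][j] -= h
    return (refine_eigenvalue(plus, lam) - refine_eigenvalue(minus, lam)) / (2 * h)


# Illustrative non-symmetric matrix with a single real (hence simple) eigenvalue.
A = [[2.0, 1.0, 0.0],
     [0.0, 3.0, 1.0],
     [1.0, 0.0, 4.0]]
lam = refine_eigenvalue(A, 5.0)
G = eigenvalue_gradient(A, lam)
max_error = max(abs(G[i][j] - finite_difference(A, lam, i, j))
                for i in range(3) for j in range(3))
print(lam, max_error)
```

Note that the diagonal entries of `G` sum to $1$, as they should: replacing $A$ by $A+tI$ shifts every eigenvalue by $t$.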


[The following might be useful: Note that if a matrix has eigenvalues $\lambda_i$ for $i=1,\dots,n$ then its adjugate has eigenvalues $\prod_{j\neq i}\lambda_j$ for $i=1,\dots,n$. Since one of the eigenvalues of $A-\lambda I$ is zero, all but one of these eigenvalues of $\mathrm{adj}(A-\lambda I)$ are therefore also zero, and we have

$$\mathrm{tr}\left(\mathrm{adj}(A-\lambda I)\right)=\prod(\lambda_j-\lambda)$$

where the product is over $A$'s other eigenvalues. In particular, the denominator above is nonzero exactly when $\lambda$ has algebraic multiplicity $1$.]
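As a small worked example (the matrix is chosen here purely for illustration), take

$$A=\begin{pmatrix}2&1\\0&3\end{pmatrix},\qquad \lambda=2,\qquad A-\lambda I=\begin{pmatrix}0&1\\0&1\end{pmatrix},\qquad \mathrm{adj}(A-\lambda I)=\begin{pmatrix}1&-1\\0&0\end{pmatrix}.$$

Then $\mathrm{tr}\left(\mathrm{adj}(A-\lambda I)\right)=1=3-2$, which is the product over the other eigenvalue, and

$$\left(\frac{\mathrm{d}\lambda}{\mathrm{d}A_{ij}}\right)_{ij}=\frac{\mathrm{adj}(A-\lambda I)^T}{\mathrm{tr}\left(\mathrm{adj}(A-\lambda I)\right)}=\begin{pmatrix}1&0\\-1&0\end{pmatrix}.$$

This can be confirmed directly: replacing $A_{21}=0$ by $\varepsilon$ gives the characteristic polynomial $\lambda^2-5\lambda+6-\varepsilon$, whose smaller root is $\frac{5-\sqrt{1+4\varepsilon}}{2}=2-\varepsilon+O(\varepsilon^2)$, so $\frac{\mathrm{d}\lambda}{\mathrm{d}A_{21}}=-1$.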

  • thanks, but where did you use that the algebraic multiplicity is 1? – Averroes2 Mar 20 '17 at 21:18
  • @Averroes2 I'm not really using that information except that if the algebraic multiplicity isn't $1$ then you can't think of $\lambda$ as an implicit function of $A_{ij}$. So the question doesn't even make sense in that case because you can't take the derivative. – Oscar Cunningham Mar 20 '17 at 21:21
  • OK, understood. One more thing: how do you use the chain rule to get your second equation? Do you make use of the product rule as well? – Averroes2 Mar 20 '17 at 21:25
  • You already did that bit in your question! When you write $\frac{\partial F}{\partial a_{ij}}=\operatorname{adj}(A-\lambda I)_{ji}+\frac{\partial F}{\partial \lambda}\cdot \frac{\partial \lambda}{\partial a_{ij}}=0$ you've already done the bit that I described as the chain rule. I just used Jacobi's formula to evaluate $\frac{\partial F}{\partial \lambda}$ as $\mathrm{tr}(\mathrm{adj}(A-\lambda I)(-I))$. – Oscar Cunningham Mar 20 '17 at 21:32
  • yes, but again: how do you use the chain rule? I thought I have the product rule here. Maybe I understood something wrong... – Averroes2 Mar 20 '17 at 21:41
  • @Averroes2 If $f$ is a function of $y$ and $z$, i.e. $f(y,z)$, and $y$ and $z$ depend on $x$, i.e. $y(x)$ and $z(x)$, then $\frac{\mathrm{d}}{\mathrm{d}x}f(y(x),z(x))=\frac{\mathrm{d}y}{\mathrm{d}x}\frac{\partial}{\partial y}f+\frac{\mathrm{d}z}{\mathrm{d}x}\frac{\partial}{\partial z}f$. See https://en.wikipedia.org/wiki/Chain_rule#Higher_dimensions. Except in this case $y(x)$ is the identity [i.e. $F(A_{ij},\lambda)=F(A_{ij},\lambda(A_{ij}))$] so we have $\frac{\mathrm{d}}{\mathrm{d}x}f(x,z(x))=\frac{\partial}{\partial x}f+\frac{\mathrm{d}z}{\mathrm{d}x}\frac{\partial}{\partial z}f$ – Oscar Cunningham Mar 20 '17 at 22:00
2

There are implicitly two questions here. The first is how the coefficients of the characteristic polynomial depend on the matrix elements of $A$, and the second is how the roots of a polynomial depend on its coefficients.

It's not a very pleasant calculation. Let $A$ be an $n\times n$ matrix and let $\Lambda^kA$ denote the $k$th exterior power of $A$. Explicitly, $\Lambda^kA$ is the $\binom{n}{k}\times\binom{n}{k}$ matrix whose $ij$th entry is the $ij$th order-$k$ minor of $A$. Then we may write the characteristic polynomial as $$p(\lambda) = \det(A-\lambda I) = \sum_{k=0}^n(-1)^k\lambda^k\mathrm{Tr}(\Lambda^{n-k}A).$$ That is, the coefficient of $\lambda^k$ in the characteristic polynomial is $$c_k = (-1)^k\mathrm{Tr}(\Lambda^{n-k}A),$$ which is, up to a sign, the sum of all principal minors of order $n-k$. Not pleasant to compute, but doable, and this directly gives the coefficients $c_k$ as functions of the matrix elements of $A$. The corresponding derivatives can then be computed using Jacobi's formula.
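To make the coefficient formula concrete, here is a minimal sketch in plain Python that builds $c_k$ from sums of principal minors and compares $\sum_k c_k x^k$ with $\det(A-xI)$ at a sample point. The function names (`det`, `principal_minor_sum`, `char_poly_coeffs`) and the test matrix are assumptions made for illustration only.

```python
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along row 0 (only sensible for small n)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j in range(len(M)):
        sub = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(sub)
    return total


def principal_minor_sum(A, m):
    """Tr(Lambda^m A): the sum of all m x m principal minors (1 when m = 0)."""
    if m == 0:
        return 1
    return sum(det([[A[i][j] for j in S] for i in S])
               for S in combinations(range(len(A)), m))


def char_poly_coeffs(A):
    """Coefficients c_0..c_n of det(A - x I), with c_k = (-1)^k Tr(Lambda^(n-k) A)."""
    n = len(A)
    return [(-1) ** k * principal_minor_sum(A, n - k) for k in range(n + 1)]


# Illustrative integer matrix, so that all arithmetic is exact.
A = [[2, 1, 0],
     [0, 3, 1],
     [1, 0, 4]]
coeffs = char_poly_coeffs(A)
print(coeffs)  # [25, -26, 9, -1]

# Compare the polynomial with det(A - x I) at a sample point x.
x = 7
A_minus_xI = [[A[i][j] - (x if i == j else 0) for j in range(3)] for i in range(3)]
poly_value = sum(c * x ** k for k, c in enumerate(coeffs))
print(poly_value == det(A_minus_xI))  # True
```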

Now, suppose we have some polynomial $$p(x) = c_nx^n + c_{n-1}x^{n-1} + \cdots + c_1x+ c_0.$$ If $x_0$ is a root, then $$c_nx_0^n + c_{n-1}x_0^{n-1} + \cdots + c_1x_0+ c_0 = 0.$$ Implicitly differentiating $x_0$ with respect to $c_k$, we find that $$0 = nc_nx_0^{n-1}\frac{\partial x_0}{\partial c_k} + (n-1)c_{n-1}x_0^{n-2}\frac{\partial x_0}{\partial c_k} + \cdots + c_1\frac{\partial x_0}{\partial c_k} + x_0^k,$$ that is, $0 = p'(x_0)\frac{\partial x_0}{\partial c_k} + x_0^k$, so that, provided $x_0$ is a simple root (so $p'(x_0)\neq 0$), $$\frac{\partial x_0}{\partial c_k} = - \frac{x_0^k}{p'(x_0)}.$$
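The root-sensitivity formula is easy to test numerically. The sketch below is only illustrative: the polynomial $x^2-3x+2$ and the helper names (`poly`, `dpoly`, `newton_root`, `root_sensitivity`) are assumptions, and Newton's method is used here simply as a convenient way to track the root after a coefficient is perturbed.

```python
def poly(c, x):
    """Evaluate p(x) = sum_k c[k] * x**k."""
    return sum(ck * x ** k for k, ck in enumerate(c))


def dpoly(c, x):
    """Evaluate p'(x)."""
    return sum(k * ck * x ** (k - 1) for k, ck in enumerate(c) if k > 0)


def newton_root(c, x, steps=50):
    """Refine a root of p by Newton's method, starting from x."""
    for _ in range(steps):
        x -= poly(c, x) / dpoly(c, x)
    return x


def root_sensitivity(c, x0):
    """d x0 / d c_k = -x0**k / p'(x0) for every k (requires a simple root)."""
    return [-(x0 ** k) / dpoly(c, x0) for k in range(len(c))]


def finite_difference(c, x0, k, h=1e-6):
    """Central difference of the tracked root with respect to c[k]."""
    plus, minus = list(c), list(c)
    plus[k] += h
    minus[k] -= h
    return (newton_root(plus, x0) - newton_root(minus, x0)) / (2 * h)


# p(x) = x^2 - 3x + 2 = (x - 1)(x - 2); track the simple root near 2.
c = [2.0, -3.0, 1.0]
x0 = newton_root(c, 2.5)
analytic = root_sensitivity(c, x0)
numeric = [finite_difference(c, x0, k) for k in range(len(c))]
max_error = max(abs(a - b) for a, b in zip(analytic, numeric))
print(x0, analytic, max_error)
```

For this polynomial $p'(2)=1$, so the formula predicts sensitivities $-1$, $-2$, $-4$ for $c_0$, $c_1$, $c_2$ respectively.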

If we put all of these elements together, then the derivative of an eigenvalue $\lambda$ with respect to the matrix element $a_{ij}$ is given by $$\frac{\partial \lambda}{\partial a_{ij}} = \sum_{k=0}^n \frac{\partial \lambda}{\partial c_k}\frac{\partial c_k}{\partial{a_{ij}}}=-\frac{1}{p'(\lambda)}\sum_{k=0}^n(-1)^k\lambda^k\frac{\partial}{\partial a_{ij}}\mathrm{Tr}(\Lambda^{n-k}A),$$ where $p'(\lambda)$ is the derivative of the characteristic polynomial evaluated at $\lambda$. This is probably not the most computationally efficient way of doing things, but it shows that the derivative can be computed essentially by calculating the minors of $A$.
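As a sanity check, and only as a sketch, this expression collapses to a compact adjugate form. With $\lambda$ held fixed, the sum is just the partial derivative of the determinant with respect to $a_{ij}$, and Jacobi's formula evaluates both it and $p'(\lambda)$:

$$\sum_{k=0}^n(-1)^k\lambda^k\frac{\partial}{\partial a_{ij}}\mathrm{Tr}(\Lambda^{n-k}A)=\frac{\partial}{\partial a_{ij}}\det(A-\lambda I)=\mathrm{adj}(A-\lambda I)_{ji},\qquad p'(\lambda)=-\mathrm{Tr}\left(\mathrm{adj}(A-\lambda I)\right),$$

so that

$$\frac{\partial \lambda}{\partial a_{ij}}=\frac{\mathrm{adj}(A-\lambda I)_{ji}}{\mathrm{Tr}\left(\mathrm{adj}(A-\lambda I)\right)}.$$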

EuYu
  • 41,421