25

Say I have a Hermitian matrix $A$ with elements $A_{ij}$, and suppose I am given its eigenvalues $e_p$ (and optionally its eigenvectors $v_p$).

Is there an easy(ish) way to calculate $\frac{\partial e_p}{\partial A_{ij}}$?

The elements represent potentials between particles, so any deviations would also leave the matrix Hermitian.

[Edit, after some answers were posted: I am also interested in the case of degenerate eigenvalues.]

byo
  • The eigenvalues are differentiable if they are unique (the implicit function theorem will work here). Otherwise there is a generalised gradient (Clarke) that takes a little more effort to compute. – copper.hat Jan 02 '18 at 07:25
  • 2
  • Does $\partial A_{ij}$ take into account Hermiticity (i.e. are we changing $A_{ji}$ as well), or are we only changing that single entry? – Arthur Jan 02 '18 at 07:26
  • While I'm not amazing at linear algebra or anything I find this doubtful: the eigenvalues are roots of a polynomial, and that means the derivative is going to be nasty at best. – Dan Uznanski Jan 02 '18 at 07:31
  • Yep. Definitely not amazing at linear algebra. Tell us more, @copper.hat! That looks like it could build into an answer. – Dan Uznanski Jan 02 '18 at 07:37
  • They represent (somewhat) a screened interaction between particles. The matrix will always be hermitian if that helps. (Edited question again) – byo Jan 02 '18 at 07:38
  • For Hermitian matrices the eigenvalues are fairly well behaved modulo lack of differentiability with repeated eigenvalues. – copper.hat Jan 02 '18 at 07:39
  • @Arthur: That is a good question. Using the implicit function theorem on $(\lambda,A) \mapsto \det (\lambda I -A)$ shows existence either way, but obviously the answer depends on the 'allowable' perturbations. – copper.hat Jan 02 '18 at 07:44
  • Just stumbled across this, might be useful: https://terrytao.wordpress.com/2010/01/12/254a-notes-3a-eigenvalues-and-sums-of-hermitian-matrices/ – copper.hat Jan 02 '18 at 07:47

2 Answers

36

You need a convention for how to pick the magnitude of $v_p$; let's say it is normalized to unit length. Then the "standard" approach is to simply implicitly differentiate $Av_p = e_p v_p$: $$(dA)v_p + A(dv_p) = de_p \, v_p + e_p dv_p.$$

Now since $v_p$ is unit length, $v_p \cdot dv_p = \frac{1}{2} d(\|v_p\|^2) = 0$, so that $$v_p^T (dA) v_p + v_p ^T A \, dv_p = de_p + 0.$$ Finally since $A$ is Hermitian, $v_p^TA = e_p v_p^T$, and $$v_p^T (dA) v_p = de_p.$$

Of course, the surprisingly simple $\frac{\partial e_p}{\partial A_{ij}} = (v_p \cdot b_i)(v_p \cdot b_j)$ follows, where $b_i$ is the Euclidean basis vector with a $1$ at entry $i$ and zeroes elsewhere.
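
A quick numerical sanity check of this formula, assuming NumPy and a random real symmetric test matrix for simplicity (so plain transposes stand in for conjugate transposes; the indices $p, i, j$ are arbitrary choices), might look like this sketch:

```python
import numpy as np

# Random real symmetric test matrix (the real case of a Hermitian matrix).
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
A = (A + A.T) / 2

e, V = np.linalg.eigh(A)        # e[p] = e_p (ascending), V[:, p] = unit eigenvector v_p
p, i, j = 2, 1, 3               # arbitrary choices of eigenvalue index and matrix entry

analytic = V[i, p] * V[j, p]    # (v_p . b_i)(v_p . b_j)

# Central finite difference in the single entry A_ij.  The perturbed matrix is no
# longer symmetric, but for small h its eigenvalues stay real and simple, so sorting
# them matches the ordering returned by eigh.
h = 1e-6
Ap, Am = A.copy(), A.copy()
Ap[i, j] += h
Am[i, j] -= h
fd = (np.sort(np.linalg.eigvals(Ap).real)[p]
      - np.sort(np.linalg.eigvals(Am).real)[p]) / (2 * h)

print(analytic, fd)             # the two numbers should agree to roughly 1e-9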

Now some caveats: if you dig into what's really going on in the above calculation, we're implicitly assuming that $e_p$ and $v_p$ vary smoothly given a variation of $A$. This is equivalent to the roots of the characteristic polynomial varying smoothly as a function of the coefficients, which is (only) true when the roots are distinct. As copper.hat alludes to above, the situation is rather more complicated when $A$ has repeated eigenvalues. Moreover we've assumed explicitly that $A$ is Hermitian; if the variation you're interested in is the Hermitian $dA = b_ib_j^T + b_jb_i^T$, the formula above differs by a factor of two.
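
To see that factor of two concretely, here is the same kind of finite-difference check (again a sketch assuming NumPy and a random real symmetric matrix), this time along the symmetric direction $dA = b_ib_j^T + b_jb_i^T$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, i, j, h = 5, 2, 1, 3, 1e-6
A = rng.standard_normal((n, n))
A = (A + A.T) / 2               # random real symmetric test matrix

e, V = np.linalg.eigh(A)

dA = np.zeros((n, n))
dA[i, j] = dA[j, i] = 1.0       # dA = b_i b_j^T + b_j b_i^T keeps A + h*dA symmetric

fd = (np.linalg.eigvalsh(A + h * dA)[p]
      - np.linalg.eigvalsh(A - h * dA)[p]) / (2 * h)

print(fd, 2 * V[i, p] * V[j, p])   # the derivative in this direction carries the factor of two
```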

user7530
  • Might I kindly ask you to elucidate what should be done when the eigenvalues are degenerate? Or point me to some reading material :) – byo Jan 02 '18 at 07:43
  • They aren't smooth functions on the space of Hermitian matrices at the points where they collide. However, if you restrict to the submanifold where the given collisions happen (a union of conjugacy classes), then they are smooth as functions on that submanifold. – AnonymousCoward Jan 02 '18 at 07:46
  • @byo See for instance http://www.win.tue.nl/analysis/reports/rana06-33.pdf and https://people.orie.cornell.edu/aslewis/publications/99-clarke.pdf (perhaps copper.hat has more information for you as well) – user7530 Jan 02 '18 at 07:47
  • @byo As a practical matter, if you are doing a numerical simulation, you are in one of two situations: (1) you know you expect repeated eigenvalues, because of symmetries in your problem: either break the symmetry (by e.g. perturbing the positions of your particles) or choose reduced coordinates that factor out the symmetry; (2) you don't expect repeated eigenvalues, so you cross fingers and trust in the fact that the "typical" Hermitian matrix has distinct eigenvalues. – user7530 Jan 02 '18 at 08:09
  • Can you explain the last sentence in a bit more detail? What does the derivative look like if $A$ is not Hermitian? What does $dA = b_i b_j^T + b_j b_i^T$ represent? (I don't quite understand where that comes from or why it depends on $i,j$.) Is there any chance you might be willing to expand on that case? Should I ask a separate question about how to handle non-Hermitian matrices? – D.W. Apr 30 '18 at 03:34
  • @user7530: Could you tell me why, when $M$ is a psd matrix, we have $\frac{\partial \lambda_i(M)}{\partial M}=v_iv_i^T$, where $(\lambda_i(M),v_i)$ are the eigenvalues and eigenvectors of $M$. –  Nov 01 '18 at 20:54
  • @D.W. If you change only one of the entries of $A$, then $A$ stops being Hermitian. What I mean at the end is that you may want to instead simultaneously change the $ij$ and $ji$ entry. – user7530 Nov 02 '18 at 01:55
  • @Saeed I'm not sure what you're asking---my answer derives exactly that identity. – user7530 Nov 02 '18 at 01:56
  • @user7530: In the following question, the person who answered the question, uses your answer, I am wondering what will happen when eigenvalues are not distinct https://math.stackexchange.com/questions/2980940/what-is-the-kkt-condition-for-constraint-m-preceq-i –  Nov 05 '18 at 02:58
  • @Saeed nothing simple. I posted some links to papers on this topic earlier in the comments. – user7530 Nov 05 '18 at 06:51
  • This answer and comments by @AnonymousCoward suggest that eigenvalues lose smoothness when they are repeated. I believe this is a false statement. One might experience issues with defining repeated eigenvalues properly, but these are the issues of a specific analytical or computational procedure. This does not preclude the possibility of defining eigenvalues as smooth functions even if they "collide" as the parameters vary. – paperskilltrees Feb 11 '21 at 04:46
  • It seems that there is always a way to define eigenvalues as smooth functions even if they are repeated. The cited paper by van der Aa provides "a method to compute first order derivatives of the eigenvalues and eigenvectors for a general complex-valued, non-defective matrix". Further, in his comment Terry Tao says "a theorem of Rellich ... that eigenvectors can ... be selected in an analytic fashion [even when repeated]" – paperskilltrees Feb 11 '21 at 04:53
  • A highly relevant question and discussion on MathOverflow: how to find/define eigenvectors as a continuous function of matrix? – paperskilltrees Feb 11 '21 at 05:05
4

Side note: these functions come up in Hamiltonian geometry where they are coordinates of a moment map for a torus action on (a subset of) the Poisson manifold of Hermitian matrices (which can be identified with the dual Lie algebra of $U(n)$).

First do it for a diagonal Hermitian matrix $D$ with distinct diagonal entries. There are two ways to move in the space of Hermitian matrices (i.e. the tangent space at a diagonal matrix decomposes as a sum of two subspaces):

  • Translate by another diagonal matrix: $D_t = D + tD'$. It's easy to see how this changes the eigenvalues.
  • Act by conjugation by a unitary matrix: $A_t = e^{tX}De^{-tX}$. The eigenvalues are constant under this action, so the derivatives of the eigenvalues are zero in these directions.

Now since every Hermitian matrix can be diagonalized, you can use this to answer the question for all Hermitian matrices.
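
A minimal numerical illustration of the two directions (a sketch assuming NumPy and SciPy's `expm`, with arbitrary choices of $D$, $D'$, and $X$):

```python
import numpy as np
from scipy.linalg import expm

D = np.diag([1.0, 2.0, 5.0])    # diagonal Hermitian matrix with distinct entries
t = 1e-5

# Direction 1: translate by another diagonal matrix -- eigenvalues shift by t * diag(D').
Dprime = np.diag([0.3, -0.7, 1.1])
print(np.linalg.eigvalsh(D + t * Dprime))   # approximately [1 + 0.3t, 2 - 0.7t, 5 + 1.1t]

# Direction 2: conjugate by e^{tX} with X anti-symmetric (the real case of
# anti-Hermitian), so expm(t * X) is orthogonal -- the eigenvalues do not move.
X = np.array([[ 0.0,  1.0, -2.0],
              [-1.0,  0.0,  0.5],
              [ 2.0, -0.5,  0.0]])
U = expm(t * X)
print(np.linalg.eigvalsh(U @ D @ U.T))      # still [1, 2, 5] up to round-off
```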

For degenerate eigenvalues, think of it this way. Either you have distinct eigenvalues $\lambda_1 > \ldots > \lambda_n$ or you have some equalities. Suppose we have a Hermitian matrix $A$ with some equalities, e.g. $\lambda_1 = \lambda_2 > \lambda_3$. The function $\lambda_1$ is not smooth on the set of Hermitian matrices at $A$. However, we can restrict to the subset of Hermitian matrices whose eigenvalues have the same equalities (e.g. consider all Hermitian matrices with eigenvalues $\lambda_1 = \lambda_2 > \lambda_3$, which is diffeomorphic to $\left(U(3)/U(2)\times U(1)\right)\times \mathbb{R}^2$). This subset is a smooth submanifold of the set of Hermitian matrices, so we can compute derivatives of functions on it.
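
A two-by-two example makes the loss of smoothness at a collision visible: the eigenvalues of $\begin{pmatrix}0 & t\\ t & 0\end{pmatrix}$ are $\pm t$, so the sorted top eigenvalue is $|t|$, which has a kink at $t = 0$. A short sketch (assuming NumPy):

```python
import numpy as np

# The largest eigenvalue of [[0, t], [t, 0]] is |t|: smooth away from t = 0,
# but not differentiable at t = 0, where the two eigenvalues collide.
for t in (-0.2, -0.1, 0.0, 0.1, 0.2):
    A = np.array([[0.0, t], [t, 0.0]])
    print(t, np.linalg.eigvalsh(A)[-1])     # prints |t|
```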

The same trick as above can be applied again on this submanifold, with the exception that in the first part, when translating by another diagonal matrix, you can only translate by diagonal matrices that have the same "pattern" of equalities in their diagonal entries.