
How do I calculate the derivative with respect to $X$ of $$ \log \det X, $$ where $X$ is a positive definite matrix and $\det$ denotes the determinant?

How to calculate this? Thanks!

I know it's a classical problem, but I can't find any clear material on the Internet, so a good reference would also be very helpful!


The difficulty for me is that the domain of $X$ is restricted to $S^n$, the set of symmetric $n \times n$ matrices. Each symmetric matrix $X$ can therefore be represented by a vector of dimension $n(n+1)/2$. But the result is $X^{-1}$ (if I remember correctly), a matrix with $n^2$ entries. How should this matrix-form result be interpreted?

zxzx179

1 Answer


If $X$ is invertible, then $D \det (X) (H) = (\det X) \operatorname{tr} (X^{-1} H)$.

If $\phi = \log \circ \det$, then $D \phi(X)(H) = {1 \over \det X}(\det X) \operatorname{tr} (X^{-1} H) = \operatorname{tr} (X^{-1} H)$.

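Note that with respect to the Frobenius inner product $\langle A, B \rangle = \operatorname{tr}(A^T B)$, this gives $\nabla \phi(X) = X^{-T}$, since $\operatorname{tr}(X^{-1} H) = \langle X^{-T}, H \rangle$.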
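
As a quick sanity check, the formula $D \phi(X)(H) = \operatorname{tr}(X^{-1} H)$ can be compared against a finite difference. The following is only a minimal NumPy sketch: the random test matrix, the seed, the step size `eps`, and the helper names `logdet` and `directional_derivative` are illustrative choices of mine, not part of the derivation.

```python
import numpy as np

def logdet(X):
    # slogdet returns (sign, log|det X|); for positive definite X the sign is +1.
    return np.linalg.slogdet(X)[1]

def directional_derivative(X, H, eps=1e-6):
    # Central-difference approximation of D(log det)(X)(H).
    return (logdet(X + eps * H) - logdet(X - eps * H)) / (2 * eps)

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
X = A @ A.T + n * np.eye(n)       # a symmetric positive definite test matrix
H = rng.standard_normal((n, n))   # an arbitrary (not necessarily symmetric) direction

numeric = directional_derivative(X, H)
analytic = np.trace(np.linalg.solve(X, H))   # tr(X^{-1} H)

# The same number written as a Frobenius inner product <X^{-T}, H>.
grad = np.linalg.inv(X).T
frobenius = np.sum(grad * H)

print(numeric, analytic, frobenius)
```

The three printed values should agree up to finite-difference error, which is what one would expect if $X^{-T}$ is the gradient when all $n^2$ entries of $X$ are treated as independent.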

copper.hat
  • What is $H$ here? The question doesn't have any $H$ in it. Can you clarify this notation? – Sohail Si Oct 16 '17 at 15:40
  • The derivative is a linear operator, $H$ is the parameter to which the operator is applied. – copper.hat Oct 16 '17 at 15:50
  • @SohailSi: The derivative is often written as ${\partial f(x) \over \partial x}$ and applied to a 'perturbation' $h$ by writing ${\partial f(x) \over \partial x} h$, however this suggests that it can be written as a matrix multiplication which is often valid, but in the case of matrix derivatives it can be misleading. So, I use the notation $Df(x)(h)$ in an attempt to avoid the confusion. – copper.hat Oct 16 '17 at 16:40
  • Thank you. Is there a textbook that explains (uses) this notation? So, my understanding is: $D f (X) (H) = Z$ is similar to $ d f(X) = Z dH$. Is this correct? – Sohail Si Oct 17 '17 at 15:06
  • The first time I saw this notation was in Marsden's "Elementary classical analysis". Many authors write $Df(X)H$, but I am leery of this as it confuses application with matrix multiplication. For example, take the operator $L(X) = A X B$ where $A,X,B$ are matrices of appropriate dimensions. Then $DL(X)(H) = AHB$ whereas if we write $DL(X)H$ it looks like a matrix multiplication when the $H$ should be in the 'inside'. – copper.hat Oct 17 '17 at 16:10
  • Possibly relevant: "Prove $\frac{\partial \ln|X|}{\partial X} = 2X^{-1} - \operatorname{diag}(X^{-1})$." There I say: 'We first note that for the case where the elements of $X$ are independent, a constructive proof involving cofactor expansion and adjoint matrices can be made to show that $\frac{\partial \ln|X|}{\partial X} = X^{-T}$ (Harville). This is not always equal to $2X^{-1} - \operatorname{diag}(X^{-1})$. The fact alone that $X$ is positive definite is sufficient to conclude that $X$ is symmetric and thus its elements are not independent.' – BCLC Apr 16 '21 at 10:00
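
The two formulas that appear in this thread, $X^{-T}$ and $2X^{-1} - \operatorname{diag}(X^{-1})$, can be told apart numerically. The sketch below assumes that the $n(n+1)/2$ entries on and above the diagonal are taken as the independent variables, so that perturbing $x_{ij}$ with $i \ne j$ also perturbs $x_{ji}$. The function name `symmetric_partials`, the test matrix, and the step size are hypothetical choices made for illustration only.

```python
import numpy as np

def logdet(X):
    # log|det X|; the sign is +1 for positive definite X.
    return np.linalg.slogdet(X)[1]

def symmetric_partials(X, eps=1e-6):
    # Partial derivatives of log det X with respect to the independent
    # entries x_ij (i <= j) of a symmetric matrix, arranged as a matrix.
    n = X.shape[0]
    G = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            E = np.zeros((n, n))
            E[i, j] = 1.0
            E[j, i] = 1.0   # keeps the perturbed matrix symmetric
            d = (logdet(X + eps * E) - logdet(X - eps * E)) / (2 * eps)
            G[i, j] = G[j, i] = d
    return G

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))
X = A @ A.T + n * np.eye(n)   # symmetric positive definite test matrix

Xinv = np.linalg.inv(X)
numeric = symmetric_partials(X)
constrained = 2 * Xinv - np.diag(np.diag(Xinv))   # 2 X^{-1} - diag(X^{-1})

print(np.max(np.abs(numeric - constrained)))
```

Under that parameterization each off-diagonal partial is $\operatorname{tr}(X^{-1}(E_{ij} + E_{ji})) = 2 (X^{-1})_{ij}$, while each diagonal partial is $(X^{-1})_{ii}$. So the two results do not contradict each other: they answer the question for different choices of independent variables.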