
How does one evaluate the second derivative of the determinant of a square matrix?

Jacobi's formula tells us how to evaluate the first derivative but I can't find anything for the second. This is my attempt:


We can start from the partial derivative formulation of Jacobi's formula, assuming $A$ is invertible:
\begin{equation}\frac{\partial}{\partial \alpha}\det A = (\det A) \text{tr}\left( A^{-1} \frac{\partial}{\partial \alpha} A \right)\end{equation}
Taking a second derivative:
\begin{equation}\frac{\partial^2}{\partial \alpha^2}\det A = \frac{\partial}{\partial \alpha}\left[(\det A) \text{tr}\left( A^{-1} \frac{\partial}{\partial \alpha} A \right)\right]\end{equation}
Applying the product rule:
\begin{equation}= \frac{\partial}{\partial \alpha}(\det A) \cdot \text{tr}\left( A^{-1} \frac{\partial}{\partial \alpha} A \right) + (\det A) \frac{\partial}{\partial \alpha} \text{tr}\left( A^{-1} \frac{\partial}{\partial \alpha} A \right)\end{equation}
Replacing $\frac{\partial}{\partial\alpha}\det A$ with Jacobi's formula:
\begin{equation}= (\det A) \text{tr}\left( A^{-1} \frac{\partial}{\partial \alpha} A \right) \cdot \text{tr}\left( A^{-1} \frac{\partial}{\partial \alpha} A \right) +(\det A) \text{tr}\left( \frac{\partial}{\partial \alpha}\left(A^{-1} \frac{\partial}{\partial \alpha} A \right)\right)\end{equation}
Factoring out $\det(A)$:
\begin{equation}= (\det A) \left[\text{tr}^2\left( A^{-1} \frac{\partial}{\partial \alpha} A \right) + \text{tr}\left( \frac{\partial}{\partial \alpha}\left(A^{-1} \frac{\partial}{\partial \alpha} A \right)\right)\right]\end{equation}
Applying the product rule again:
\begin{equation}= (\det A) \left[\text{tr}^2\left( A^{-1} \frac{\partial}{\partial \alpha} A \right) + \text{tr}\left(\frac{\partial}{\partial \alpha}A^{-1} \frac{\partial}{\partial \alpha} A + A^{-1} \frac{\partial^2}{\partial \alpha^2} A \right)\right]\end{equation}
Finally, using $A_{\alpha}$ to denote the partial derivative of $A$ with respect to $\alpha$, we have
\begin{equation}\frac{\partial^2}{\partial \alpha^2}\det A= \det(A) \left[\text{tr}^2\left( A^{-1} A_{\alpha} \right) + \text{tr}\left(A^{-1}_{\alpha} A_{\alpha}\right) + \text{tr} \left( A^{-1} A_{\alpha^2} \right)\right]\end{equation}
The second trace actually reduces to $N$ for an $N\times N$ matrix:
\begin{equation}\text{tr}\left(A^{-1}_{\alpha} A_{\alpha}\right)=\text{tr}(I)\end{equation}
\begin{equation}=\sum_{diagonals}1\end{equation}
\begin{equation}=N\end{equation}

\begin{equation}\therefore\frac{\partial^2}{\partial \alpha^2}\det A= \det(A) \left[\text{tr}^2\left( A^{-1} A_{\alpha} \right) + \text{tr} \left( A^{-1} A_{\alpha^2} \right)+N\right]\end{equation}


I'm not overly confident with the matrix calculus, but the result is nice enough to seem plausible. Is there another method, or is this proof valid? Thanks!

Palfore

3 Answers


As pointed out by Carl, your mistake is to interchange the inverse and the derivative operator:

$$\partial_\alpha(A^{-1}) \neq (\partial_\alpha A)^{-1} \, .$$
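For a concrete illustration, here is a minimal NumPy sketch (the parametrized matrix `A(alpha)` below is an arbitrary example, not taken from the question) showing numerically that the two expressions differ, while the derivative of the inverse does match $-A^{-1}\,\partial_\alpha A\,A^{-1}$:

```python
# Minimal sketch, assuming an arbitrary invertible A(alpha); uses central differences.
import numpy as np

def A(alpha):
    return np.array([[2.0 + alpha, alpha**2],
                     [np.sin(alpha), 3.0 - alpha]])

alpha, h = 0.7, 1e-5

# derivative of the inverse, estimated by central differences
d_of_inv = (np.linalg.inv(A(alpha + h)) - np.linalg.inv(A(alpha - h))) / (2 * h)

# inverse of the derivative
dA = (A(alpha + h) - A(alpha - h)) / (2 * h)
inv_of_d = np.linalg.inv(dA)

print(np.allclose(d_of_inv, inv_of_d, atol=1e-6))            # False: they differ
Ainv = np.linalg.inv(A(alpha))
print(np.allclose(d_of_inv, -Ainv @ dA @ Ainv, atol=1e-6))   # True
```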

You can easily generalize the result to two different differentiation variables $\alpha$ and $\beta$. Starting from Jacobi's formula:

$$ \partial_{\alpha}(\mathrm{det}(A)) = \mathrm{det}(A) \, \mathrm{tr}\left(A^{-1} \, \partial_\alpha A\right) \, $$

differentiating it with respect to $\beta$ using the product rule:

$$ \partial_{\alpha\beta}(\mathrm{det}(A)) = \partial_\beta(\mathrm{det}(A)) \, \mathrm{tr}\left(A^{-1} \, \partial_\alpha A\right) + \mathrm{det}(A) \, \mathrm{tr}\left(\partial_\beta \left(A^{-1} \, \partial_\alpha A\right)\right) \, ,$$

using Jacobi's formula on the first $\beta$ derivative and applying the product rule inside the second trace:

$$ \partial_{\alpha\beta}(\mathrm{det}(A)) = \mathrm{det}(A) \, \mathrm{tr}\left(A^{-1} \, \partial_\beta A\right)\,\mathrm{tr}\left(A^{-1} \, \partial_\alpha A\right) + \mathrm{det}(A) \, \mathrm{tr}\left(\partial_\beta \left(A^{-1}\right) \, \partial_\alpha A\right) + \mathrm{det}(A) \, \mathrm{tr}\left(A^{-1} \, \partial_{\alpha\beta} A\right)\, ,$$

and finally, factoring out the determinant and applying the relation for the inverse $\partial_\beta(A^{-1}) = -A^{-1} \, \partial_\beta A \, A^{-1}$:

$$ \partial_{\alpha\beta}(\mathrm{det}(A)) = \mathrm{det}(A) \, \left[\mathrm{tr}\left(A^{-1} \, \partial_\beta A\right)\,\mathrm{tr}\left(A^{-1} \, \partial_\alpha A\right) - \mathrm{tr}\left(A^{-1} \, \partial_\beta A \, A^{-1} \, \partial_\alpha A\right) + \mathrm{tr}\left(A^{-1} \, \partial_{\alpha\beta} A\right)\right] \, , $$

which is the same as what Carl gave for $\beta=\alpha$:

$$ \partial_{\alpha\alpha}(\mathrm{det}(A)) = \mathrm{det}(A) \, \left[\left(\mathrm{tr}\left(A^{-1} \, \partial_\alpha A\right)\right)^2 - \mathrm{tr}\left(A^{-1} \, \partial_\alpha A \, A^{-1} \, \partial_\alpha A\right) + \mathrm{tr}\left(A^{-1} \, \partial_{\alpha\alpha} A\right)\right] \, . $$
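As a sanity check, here is a rough finite-difference sketch in NumPy; the two-parameter matrix `A(a, b)` is an arbitrary test case of my own, compared against a direct finite-difference estimate of $\partial_{\alpha\beta}\det A$:

```python
# Sketch only: arbitrary two-parameter A(a, b), central differences for all derivatives.
import numpy as np

def A(a, b):
    return np.array([[1.0 + a * b, np.cos(b),  a],
                     [b**2,        2.0 + a,    np.sin(a * b)],
                     [a + b,       1.0,        3.0 - b]])

a0, b0, h = 0.3, 0.5, 1e-4
tr, inv, det = np.trace, np.linalg.inv, np.linalg.det
Ainv = inv(A(a0, b0))

# central differences for the first and mixed second derivatives of A
dAa  = (A(a0 + h, b0) - A(a0 - h, b0)) / (2 * h)
dAb  = (A(a0, b0 + h) - A(a0, b0 - h)) / (2 * h)
dAab = (A(a0 + h, b0 + h) - A(a0 + h, b0 - h)
        - A(a0 - h, b0 + h) + A(a0 - h, b0 - h)) / (4 * h**2)

closed_form = det(A(a0, b0)) * (tr(Ainv @ dAb) * tr(Ainv @ dAa)
                                - tr(Ainv @ dAb @ Ainv @ dAa)
                                + tr(Ainv @ dAab))

# mixed second derivative of det A, directly by finite differences
d2det = (det(A(a0 + h, b0 + h)) - det(A(a0 + h, b0 - h))
         - det(A(a0 - h, b0 + h)) + det(A(a0 - h, b0 - h))) / (4 * h**2)

print(closed_form, d2det)   # the two values agree up to finite-difference error
```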


I think your derivation is good up until you take the derivative of $A^{-1}$. From formula (40) of the Matrix Cookbook the derivative is:

$$\partial \left(X^{-1}\right) = - X^{-1} \left( \partial X \right) X^{-1}$$

so the second term should be $-\operatorname{tr} \left( A^{-1} A_\alpha A^{-1} A_\alpha \right)$ rather than $\operatorname{tr} \left( A_\alpha^{-1} A_\alpha \right)$. The final result is then:

$$\det \left(A\right) \left( \left(\operatorname{tr}\left(A^{-1} A_\alpha\right)\right)^2 + \operatorname{tr}\left(A^{-1} A_{\alpha\alpha}\right)-\operatorname{tr}\left(A^{-1}A_\alpha A^{-1} A_\alpha\right)\right)$$

It is a simple matter to confirm this formula symbolically for small examples in your CAS of choice (for a Mathematica implementation see my answer to a similar question on MSE).
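For readers working in Python rather than Mathematica, a comparable symbolic check can be done with SymPy. The sketch below (my own, not the linked Mathematica code) builds a generic $2\times 2$ matrix of unspecified functions of $\alpha$ and compares the formula above with a direct second derivative of the determinant:

```python
# Symbolic verification sketch with SymPy for a generic 2x2 matrix A(alpha).
import sympy as sp

alpha = sp.symbols('alpha')
n = 2
A = sp.Matrix([[sp.Function(f'a{i}{j}')(alpha) for j in range(n)]
               for i in range(n)])

Ainv = A.inv()
Aa   = A.diff(alpha)       # A_alpha
Aaa  = A.diff(alpha, 2)    # A_{alpha alpha}

formula = A.det() * ((Ainv * Aa).trace()**2
                     + (Ainv * Aaa).trace()
                     - (Ainv * Aa * Ainv * Aa).trace())

direct = A.det().diff(alpha, 2)

print(sp.simplify(formula - direct))   # 0, so the two expressions agree
```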

Carl Woll

To keep things simple, assume $A=A(t)$ is a function of a parameter $t$, and we are after the derivatives at zero. Also assume that $A(t)$ is smooth enough in $t$ and that $A(0)$ is invertible. While we are at it, let's replace $A(t)$ by $B(t)=A(0)^{-1}A(t)$ so that $B(0)=I$.

Then, by Taylor expansion, $$B(t)=I+tB'(0)+\frac{t^2}2B''(0)+\cdots.$$ We now expand $\det B(t)$ by the formula in terms of permutations and their signs. A permutation moving three or more points leads to a term with a factor of $t^3$, which we can neglect.

The identity permutation yields the product of the diagonal elements of $B(t)$, which is \begin{align} &1+t\sum_{i=1}^n B'(0)_{ii} +\frac{t^2}2\sum_{i=1}^n B''(0)_{ii} +t^2\sum_{1\le i<j\le n}B'(0)_{ii}B'(0)_{jj}+\cdots\\ &=1+t\,\text{tr}\,B'(0)+\frac{t^2}2\,\text{tr}\,B''(0) +t^2\sum_{1\le i<j\le n}B'(0)_{ii}B'(0)_{jj}+\cdots \end{align}

The only other permutations that matter are the transpositions (those moving exactly two points), which together contribute $$-t^2\sum_{1\le i<j\le n}B'(0)_{ij}B'(0)_{ji}+\cdots.$$

The second derivative of $\det B(t)$ at zero is therefore $$\text{tr}\,B''(0)+2\sum_{1\le i<j\le n} (B'(0)_{ii}B'(0)_{jj}-B'(0)_{ij}B'(0)_{ji}).$$ The sum here is the "second trace" of $B'(0)$, the coefficient of $X^{n-2}$ in its characteristic polynomial.

Now if we like we can write $A(t)=A(0)B(t)$ and get a formula for the second derivative of $\det A(t)$.
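A quick numerical sketch (NumPy, with random matrices standing in for $B'(0)$ and $B''(0)$) illustrates the conclusion: the second derivative of $\det B(t)$ at zero matches $\text{tr}\,B''(0)$ plus twice the second trace of $B'(0)$:

```python
# Sketch: random B'(0), B''(0); compare a finite-difference second derivative
# of det B(t) at t = 0 with tr B''(0) + 2 * (second trace of B'(0)).
import numpy as np

rng = np.random.default_rng(0)
n = 4
B1 = rng.standard_normal((n, n))   # stands in for B'(0)
B2 = rng.standard_normal((n, n))   # stands in for B''(0)

def detB(t):
    return np.linalg.det(np.eye(n) + t * B1 + 0.5 * t**2 * B2)

h = 1e-4
second_derivative = (detB(h) - 2 * detB(0.0) + detB(-h)) / h**2

second_trace = sum(B1[i, i] * B1[j, j] - B1[i, j] * B1[j, i]
                   for i in range(n) for j in range(i + 1, n))

print(second_derivative, np.trace(B2) + 2 * second_trace)   # agree up to finite-difference error
```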

Angina Seng