I'm working on an expression for the nth derivative of the determinant of a (symmetric) matrix, i.e. \begin{equation}\frac{\partial^{n} \det(A)}{\partial A^{n}}\end{equation} Starting with \begin{equation}\frac{\partial \det(A)}{\partial A}=\det(A) A^{-1}\end{equation} Then naturally the next derivative is \begin{equation}\frac{\partial^{2}\det(A)}{\partial A^{2}}=\frac{\partial}{\partial A}\left(\det(A)A^{-1}\right)=\det(A)A^{-2}-\det(A)A^{-2}=0\end{equation} I doubt this is right; can someone point out my mistake? I'm actually working on an expression for the nth derivative of $\det(A)^{-1/2}$, but a general formula for the simple case would be fine.
-
The derivative of $A^{-1}$ with respect to $A$ is not $A^{-2}$. It is a 4-index object (a fourth-order tensor). Thus even $(\det(A)B)^\prime_A \ne \det(A)A^{-1}B$ – Alexander Vigodner Aug 08 '14 at 01:10
2 Answers
Tee-Jay, what you call a derivative is in fact a gradient. Let $f\colon U\mapsto \det(U)$. We assume that $A$ is invertible. The derivative is a linear map:
$Df_A\colon H\mapsto \det(A)\operatorname*{trace}(HA^{-1})$.
The second derivative is a symmetric bilinear form:
$$\begin{aligned}D^2f_A\colon(H,K)\mapsto{}&\det(A)\operatorname*{trace}(KA^{-1})\operatorname*{trace}(HA^{-1})+\det(A)\operatorname*{trace}(H(-A^{-1}KA^{-1}))\\ ={}&\det(A)\left(\operatorname*{trace}(HA^{-1})\operatorname*{trace}(KA^{-1})-\operatorname*{trace}(HA^{-1}KA^{-1})\right).\end{aligned}$$
In particular, the associated quadratic form is:
$$D^2f_A(H,H)=\det(A)\left((\operatorname*{trace}(HA^{-1}))^2-\operatorname*{trace}((HA^{-1})^2)\right).$$
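The two formulas above can be sanity-checked numerically. The following is only a sketch, assuming NumPy is available; the helper names `d1_det` and `d2_det`, the random test matrices and the step size `t` are illustrative choices, not part of the answer. It compares the closed forms with central finite differences of $\det$.

```python
import numpy as np

def d1_det(A, H):
    """Df_A(H) = det(A) * trace(H A^{-1})."""
    return np.linalg.det(A) * np.trace(H @ np.linalg.inv(A))

def d2_det(A, H, K):
    """D^2 f_A(H, K) = det(A) * (tr(HA^{-1}) tr(KA^{-1}) - tr(HA^{-1} K A^{-1}))."""
    Ainv = np.linalg.inv(A)
    MH, MK = H @ Ainv, K @ Ainv
    return np.linalg.det(A) * (np.trace(MH) * np.trace(MK) - np.trace(MH @ MK))

rng = np.random.default_rng(0)
n = 4
# Shift by a multiple of the identity so that A is comfortably invertible.
A = rng.standard_normal((n, n)) + 2 * n * np.eye(n)
H = rng.standard_normal((n, n))
K = rng.standard_normal((n, n))

det = np.linalg.det
t = 1e-3  # finite-difference step (an arbitrary small value)

# Central difference approximating the directional derivative along H.
fd1 = (det(A + t * H) - det(A - t * H)) / (2 * t)

# Mixed central difference approximating the bilinear form at (H, K).
fd2 = (det(A + t * H + t * K) - det(A + t * H - t * K)
       - det(A - t * H + t * K) + det(A - t * H - t * K)) / (4 * t * t)

print("first derivative :", d1_det(A, H), "vs finite difference", fd1)
print("second derivative:", d2_det(A, H, K), "vs finite difference", fd2)
```

Swapping `H` and `K` in `d2_det` should give the same value up to rounding, which is the symmetry of the bilinear form.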
EDIT 1: Let $(\lambda_i)$ be the spectrum of $HA^{-1}$, $(\sigma_i)$ be the elementary symmetric polynomials in the $(\lambda_i)$ and $S_k=\sum_i {\lambda_i}^k=\operatorname*{trace}((HA^{-1})^k)$. Then $$Df_A(H)=\det(A)\sigma_1,\quad D^2f_A(H,H)=2\det(A)\sigma_2.$$ In fact we can generalize this result.
$\det(A+H)=\det(A)\det(I+HA^{-1})$, so we may assume that $A=I$ and that $(\lambda_i)$ is the spectrum of $H$. Then, according to the Taylor formula, $$\det(I+H)=\prod_i(1+\lambda_i)=1+\sum_{k=1}^n\sigma_k=1+\sum_{k=1}^n \frac{1}{k!}D^kf_I(H,\cdots,H).$$ By identifying the terms of the same degree, we obtain:
$$D^kf_I(H,\cdots,H)=k!\sigma_k=k!P_k(S_1,\cdots,S_k)=k!P_k(\operatorname*{trace}(H),\cdots, \operatorname*{trace}(H^k))$$ where $P_k$ is the polynomial given by Newton's identities, cf. http://en.wikipedia.org/wiki/Newton_identities
For instance, evaluating each derivative at $(H,\cdots,H)$, $$\begin{gathered}D^1f_I=S_1,\quad D^2f_I={S_1}^2-S_2,\quad D^3f_I={S_1}^3-3S_1S_2+2S_3,\\ D^4f_I={S_1}^4-6{S_1}^2S_2+3{S_2}^2+8S_1S_3-6S_4.\end{gathered}$$
EDIT 2: Example. Let $(A,H)\in GL_n(\mathbb{C})\times M_n(\mathbb{C})$. $$\det(A+H)=\det(A)\left(1+S_1+\tfrac{1}{2}({S_1}^2-S_2)+\tfrac{1}{6}({S_1}^3-3S_1S_2+2S_3)+O(\lVert{H}\rVert^4)\right)$$ where $S_k=\operatorname*{trace}((HA^{-1})^k)$.
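As a rough numerical illustration of the identity $D^kf_A(H,\cdots,H)=\det(A)\,k!\,\sigma_k$, the sketch below (assuming NumPy; the function names `power_sums` and `elementary_from_power_sums` and the random matrices are hypothetical choices made for this example) computes the $S_k$ as traces, recovers the $\sigma_k$ through Newton's identities, and checks the exact expansion $\det(A+H)=\det(A)(1+\sum_{k=1}^n\sigma_k)$ together with the closed forms for $k=3$ and $k=4$.

```python
import numpy as np
from math import factorial

def power_sums(M, kmax):
    """Return [S_1, ..., S_kmax] with S_k = trace(M^k)."""
    sums, P = [], np.eye(M.shape[0])
    for _ in range(kmax):
        P = P @ M
        sums.append(np.trace(P))
    return sums

def elementary_from_power_sums(S):
    """Return [sigma_0, ..., sigma_k] from Newton's identities:
    k * sigma_k = sum_{i=1..k} (-1)^(i-1) * sigma_{k-i} * S_i."""
    sigma = [1.0]
    for k in range(1, len(S) + 1):
        acc = sum((-1) ** (i - 1) * sigma[k - i] * S[i - 1]
                  for i in range(1, k + 1))
        sigma.append(acc / k)
    return sigma

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n)) + 2 * n * np.eye(n)  # comfortably invertible
H = rng.standard_normal((n, n))                      # need not be small here

M = H @ np.linalg.inv(A)
S = power_sums(M, n)
sigma = elementary_from_power_sums(S)

# det(A + H) = det(A) * (1 + sigma_1 + ... + sigma_n) holds exactly,
# because det(I + M) is a polynomial of degree n in the entries of M.
lhs = np.linalg.det(A + H)
rhs = np.linalg.det(A) * sum(sigma)

# Closed forms for 3! * sigma_3 and 4! * sigma_4 in terms of the S_k.
S1, S2, S3, S4 = S
d3_closed = S1**3 - 3 * S1 * S2 + 2 * S3
d4_closed = S1**4 - 6 * S1**2 * S2 + 3 * S2**2 + 8 * S1 * S3 - 6 * S4

print(lhs, rhs)
print(factorial(3) * sigma[3], d3_closed)
print(factorial(4) * sigma[4], d4_closed)
```

Multiplying `factorial(k) * sigma[k]` by `np.linalg.det(A)` gives the value of the $k$-th derivative at $(H,\cdots,H)$ for a general invertible $A$.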

-
I'm having trouble understanding your notation here: what is meant by the $H$ and $A$ in your first line? If, as I understand it, $f(A)=\det(A)$, then I don't get what $H$ is. – TeeJay Aug 29 '14 at 01:43
-
$A,H\in M_n(\mathbb{R})$. Yet $\det()$ is a polynomial, hence a holomorphic function; thus one can assume that $A,H\in M_n(\mathbb{C})$. I think that you did not understand one word of my post. Read a book about differential calculus and stop reading the Matrix Cookbook. What do you want to do with the derivatives of $\det()$? If you don't give me details, I can do nothing for you. I have added an example for you. – Aug 29 '14 at 10:06
-
It still isn't clear what $H$ is, or for that matter $K$. I'm confused not by the content but by the notation. You seem to be talking about the determinant of a matrix $A$ and its derivatives, and then you introduce these other matrices $H$ and $K$ without stating what they represent. – TeeJay Aug 30 '14 at 11:07
-
The $k^{th}$ derivative of $\det()$ is a symmetric $k$-linear function of $k$ variables (the matrices $H_1,\cdots,H_k\in M_n(\mathbb{C})$). Fortunately, in the Taylor formula, we only need the value of the $k^{th}$ derivative at $(H,\cdots,H)$. – Aug 30 '14 at 15:04
I have to admit this example is pretty interesting. There are a few mistakes here.
First, the derivative of the determinant of a symmetric matrix w.r.t. itself is $$ \frac{\partial}{\partial \mathbf{X}} \det(\mathbf{X}) = \det(\mathbf{X}) \, (2 \mathbf{X}^{-1} - (\mathbf{X}^{-1} \circ \mathbf{I})) $$ (where $\circ$ denotes the Hadamard product), which is no longer the formula you wrote; that one holds for an invertible matrix with no special structure. The reason can be found in this post.
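A quick way to see where the symmetric-matrix formula comes from is to perturb $x_{ij}$ and $x_{ji}$ together, since they are the same variable. The sketch below assumes NumPy; `sym_det_gradient`, `fd_sym_gradient` and the test matrix are names and values made up for this illustration.

```python
import numpy as np

def sym_det_gradient(X):
    """det(X) * (2 X^{-1} - X^{-1} ∘ I) for a symmetric invertible X."""
    Xinv = np.linalg.inv(X)
    return np.linalg.det(X) * (2 * Xinv - Xinv * np.eye(X.shape[0]))

def fd_sym_gradient(X, t=1e-4):
    """Finite-difference gradient where x_ij and x_ji move together."""
    n = X.shape[0]
    G = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            E = np.zeros((n, n))
            E[i, j] = 1.0
            E[j, i] = 1.0  # the same entry when i == j
            G[i, j] = (np.linalg.det(X + t * E)
                       - np.linalg.det(X - t * E)) / (2 * t)
    return G

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 3))
X = B @ B.T + 3 * np.eye(3)  # symmetric positive definite test matrix

G_formula = sym_det_gradient(X)
G_fd = fd_sym_gradient(X)
print(np.max(np.abs(G_formula - G_fd)))
```

The off-diagonal entries pick up the factor 2 because each of them appears twice in the matrix, while the diagonal entries agree with the unstructured formula $\det(\mathbf{X})\mathbf{X}^{-1}$.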
Second, the two derivatives, of $\det A$ and of $A^{-1}$ with respect to $A$, have totally different interpretations. $$\left(\frac{\partial}{\partial A}\det A\right)_{ij}=\frac{\partial \det A}{\partial a_{ij}}$$ is a matrix composed of the derivatives w.r.t. the scalar entries, while $$\frac{\partial}{\partial A} A^{-1}(B)=\lim_{t\to 0}\frac{(A+tB)^{-1}-A^{-1}}{t}=-A^{-1}BA^{-1}\neq-A^{-2}B$$ is a Fréchet derivative, i.e. the directional derivative along $B$. To use the chain rule, you have to unify the definitions of the derivatives.
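The directional derivative of the inverse is easy to check numerically. This is a minimal sketch assuming NumPy, with a hypothetical helper `d_inv` and arbitrary random matrices; it compares $-A^{-1}BA^{-1}$ with a central finite difference and with the naive guess $-A^{-2}B$.

```python
import numpy as np

def d_inv(A, B):
    """Directional derivative of A -> A^{-1} along B: -A^{-1} B A^{-1}."""
    Ainv = np.linalg.inv(A)
    return -Ainv @ B @ Ainv

rng = np.random.default_rng(3)
n = 3
A = rng.standard_normal((n, n)) + 2 * n * np.eye(n)  # comfortably invertible
B = rng.standard_normal((n, n))

t = 1e-6  # finite-difference step (an arbitrary small value)
fd = (np.linalg.inv(A + t * B) - np.linalg.inv(A - t * B)) / (2 * t)

Ainv = np.linalg.inv(A)
naive = -Ainv @ Ainv @ B  # what treating A like a scalar would suggest

print("max |fd - (-A^-1 B A^-1)|:", np.max(np.abs(fd - d_inv(A, B))))
print("max |fd - (-A^-2 B)|     :", np.max(np.abs(fd - naive)))
```

The two expressions coincide only when $A$ and $B$ commute, which a generic pair of matrices does not.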
But I think the most important thing is to be clear about why you want to take the second derivative. Notation serves mathematics, but mathematics doesn't explain notation.