20

I am given a function $F : \mathbb{R} \to \mathbb{R}$ defined by

$$F(t)=\det(\mathbb{1}+tA)$$

where $A \in \mathbb{R}^{n \times n}$. As far as I know, the following is true.

$$\frac{d}{dt}\bigg|_{t=0} F(t) = \text{tr}~ A$$

However, how do I find the second derivative?

$$\frac{d^2}{dt^2}\bigg|_{t=0} F(t)$$

Simon Mueller

7 Answers

31

Edit: Here's a totally different argument, probably simpler, with more linear algebra and less calculus. Say $\lambda_1,\dots,\lambda_n$ are the eigenvalues of $A$, listed according to (algebraic) multiplicity. Since $\det(I+tA)$ is the product of the eigenvalues of $I+tA$ it follows that $$F(t)=\prod_{j=1}^n(1+t\lambda_j).$$Multiplying that out in the imagination makes it clear that $$F'(0)=\sum\lambda_j=\text{tr}(A),$$and $$F''(0)=2\sum_{j< k}\lambda_j\lambda_k.$$Since the eigenvalues of $A^2$ are $\lambda_j^2$ it follows that $$F''(0)=\left(\sum\lambda_j\right)^2-\sum\lambda_j^2=(\text{tr}(A))^2-\text{tr}(A^2),$$as below.
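The identity above can be checked exactly on a small example. The following is only a sketch: the matrix `A` is an arbitrary illustrative choice, and the helper names (`poly_mul`, `det_I_plus_tA`, `trace`, `matmul`) are made up for this check. It expands $\det(I+tA)$ as a polynomial in $t$ with the Leibniz permutation formula and compares $2!$ times the $t^2$ coefficient with $(\text{tr}(A))^2-\text{tr}(A^2)$.

```python
from fractions import Fraction
from itertools import permutations


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def det_I_plus_tA(A):
    """Coefficients of det(I + tA) in t, via the Leibniz permutation expansion."""
    n = len(A)
    total = [Fraction(0)] * (n + 1)
    for perm in permutations(range(n)):
        # Sign of the permutation, computed from its inversion count.
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = [Fraction(-1) ** inversions]
        for i, j in enumerate(perm):
            # Entry (i, j) of I + tA is the polynomial delta_ij + A[i][j] * t.
            term = poly_mul(term, [Fraction(int(i == j)), Fraction(A[i][j])])
        for k, c in enumerate(term):
            total[k] += c
    return total


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


A = [[2, -1, 0], [3, 1, 4], [0, 5, -2]]  # arbitrary illustrative matrix (an assumption)
coeffs = det_I_plus_tA(A)

first_derivative_at_0 = coeffs[1]               # F'(0)  = 1! * [t^1] F(t)
second_derivative_at_0 = 2 * coeffs[2]          # F''(0) = 2! * [t^2] F(t)
formula = trace(A) ** 2 - trace(matmul(A, A))   # (tr A)^2 - tr(A^2)

print(first_derivative_at_0 == trace(A), second_derivative_at_0 == formula)
```

The Leibniz expansion costs $n!$ terms, so this is only meant for tiny matrices; exact `Fraction` arithmetic is used so that the comparison is not blurred by rounding.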

Update: There's a slightly iffy spot above that nobody seemed to notice. It's easy to see that $\lambda$ is an eigenvalue of $A^2$ if and only if $\lambda=\omega^2$ where $\omega$ is an eigenvalue of $A$, but above we need more than that: We need to know that the algebraic multiplicities are the same. Maybe that's obvious? (Ok, it's clear from the Jordan form, probably clear just from the fact that the algebraic multiplicity is the dimension of the "generalized eigenspace", but it should be easier than that...)

Cheap trick: It's clear if the eigenvalues are distinct. And matrices with distinct eigenvalues are dense, hence by continuity the trace of $A^2$ is $\sum\lambda_j^2$ for every $A$.

Or, in better taste: Since $\det(\lambda I-A)=\prod(\lambda-\lambda_j)$, $$\begin{align}\det(\lambda^2I-A^2)&=\det(\lambda I-A)\det(\lambda I+A) \\&=(-1)^n\prod(\lambda-\lambda_j)\prod(-\lambda-\lambda_j) \\&=\prod(\lambda^2-\lambda_j^2).\end{align}$$ Hence $\det(\lambda I-A^2)=\prod(\lambda-\lambda_j^2)$.

Thought of this because I was bothered by the fact that the expression for $F(t)$ derived below doesn't look like a polynomial:

Exercise: Use the fact that $\text{tr}(A^k)=\sum_j\lambda_j^k$ to show that $\exp\left(t\,\text{tr}A-\frac{t^2}2\text{tr}(A^2)+\frac{t^3}3\text{tr}(A^3)+\dots\right)=\prod(1+t\lambda_j)$ (for small $t$).

Original: Yes, $F'(0)=\text{tr} A$.

Say $B(t)=I+tA$. Note that $B(t)$ is invertible if $t$ is small enough. So if $t$ and $h$ are both small then $$F(t+h)=\det(B(t)+hA) =\det(B(t))\det(I+hB(t)^{-1}A).$$Taking the derivative with respect to $h$ shows that $$\begin{align}F'(t)&=\det(B(t))\text{tr}(B(t)^{-1}A) \\&=F(t)\text{tr}((I-tA+t^2A^2-t^3A^3\dots)A) \\&=(1+t\,\text{tr}A+\dots)(\text{tr}A-t\,\text{tr}(A^2)+\dots).\end{align}$$Hence $$F''(0)=(\text{tr}A)^2-\text{tr}(A^2).$$

Bonus: There's a differential equation above, saying that $$F'/F=\text{tr}A-t\,\text{tr}(A^2)+\dots.$$With the initial condition $F(0)=1$ this shows that $$F(t)=\exp\left(t\,\text{tr}A-\frac{t^2}2\text{tr}(A^2)+\frac{t^3}3\text{tr}(A^3)+\dots\right),$$which should allow you to find as many derivatives as you want.
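To illustrate how further derivatives can be read off from this, here is a sketch of the expansion to third order, writing $p_k=\text{tr}(A^k)$ (a shorthand introduced only for this check): $$\exp\left(t\,p_1-\frac{t^2}2p_2+\frac{t^3}3p_3-\dots\right)=1+t\,p_1+\frac{t^2}{2}\left(p_1^2-p_2\right)+\frac{t^3}{6}\left(p_1^3-3p_1p_2+2p_3\right)+O(t^4).$$ Comparing with the Taylor series of $F$ gives $F''(0)=(\text{tr}A)^2-\text{tr}(A^2)$ again, and $F'''(0)=(\text{tr}A)^3-3\,\text{tr}A\,\text{tr}(A^2)+2\,\text{tr}(A^3)$.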

  • Using the logarithmic derivative could maybe simplify the initial differentiation here. – pshmath0 Jun 29 '18 at 05:49
  • David, "$λ$ is an eigenvalue of $A$ if and only if $λ^2$ is an eigenvalue of $A^2$"; are you sure ? –  Jun 30 '18 at 10:38
  • @loupblanc Aargh. Of course there's a true fact there, but the way I phrased it is nonsense - thanks. – David C. Ullrich Jun 30 '18 at 13:27
  • @DavidC.Ullrich: Nice solution. I have a different solution in terms of well-known results on linear ODEs. It is more closely related to your bonus section. – Mittens Apr 08 '21 at 19:50
8

The following is inspired by David C. Ullrich's answer, namely its bonus section. Recall that $$\color{blue}{\det \big( \exp(\mathrm M) \big) = \exp \big( \mbox{tr} (\mathrm M) \big)}$$


$$\begin{aligned} f (t) := \det ( \mathrm I_n + t \mathrm A ) &= \det \big( \exp \big( \ln \big( \mathrm I_n + t \mathrm A \big) \big) \big)\\ &= \exp \big( \underbrace{\mbox{tr} \big( \ln \big( \mathrm I_n + t \mathrm A \big) \big)}_{=: g (t)} \big) = \exp \big( g (t) \big) \end{aligned}$$

From the Maclaurin expansion of $\ln (1+x)$, we obtain

$$\begin{aligned} g (t) &= \mbox{tr} \big( t \mathrm A - \frac{t^2}{2} \mathrm A^2 + \frac{t^3}{3} \mathrm A^3 - \cdots \big)\\ &= t \,\mbox{tr} (\mathrm A) - \frac{t^2}{2} \mbox{tr} (\mathrm A^2) + \frac{t^3}{3} \mbox{tr} (\mathrm A^3) - \cdots\end{aligned}$$

and

$$g ' (t) = \mbox{tr} (\mathrm A) - t \, \mbox{tr} (\mathrm A^2) + t^2 \, \mbox{tr} (\mathrm A^3) - \cdots$$

and

$$g '' (t) = - \mbox{tr} (\mathrm A^2) + 2t \, \mbox{tr} (\mathrm A^3) - \cdots$$

Differentiating $f$ twice,

$$f '' (t) = \big( g''(t) + \left( g'(t) \right)^2 \big) \, f(t)$$

and, thus,

$$f''(0) = \big ( - \mbox{tr} (\mathrm A^2) + \left( \mbox{tr} (\mathrm A) \right)^2 \big) \, \underbrace{f(0)}_{=1} = \color{blue}{\big( \mbox{tr} (\mathrm A) \big)^2 - \mbox{tr} \left( \mathrm A^2 \right)}$$
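As a small sanity check, with a matrix chosen purely for illustration, take $\mathrm A = \begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix}$. Then

$$f(t) = (1+t)(1+4t) - 6t^2 = 1 + 5t - 2t^2, \qquad f''(0) = -4$$

while, since $\mathrm A^2 = \begin{pmatrix} 7 & 10\\ 15 & 22 \end{pmatrix}$,

$$\big( \mbox{tr} (\mathrm A) \big)^2 - \mbox{tr} \left( \mathrm A^2 \right) = 5^2 - 29 = -4$$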

4

If in $A(t)=(a_{ij}(t))_{1\leq i,j \leq n}$ each of the components is a $k$ times differentiable real-valued function, then we have (see e.g. this link): \begin{align*} \frac{d^{k}}{dt^k} \begin{vmatrix} a_{11}(t)&a_{12}(t)&\cdots&a_{1n}(t)\\ a_{21}(t)&a_{22}(t)&\cdots&a_{2n}(t)\\ \vdots&\vdots&&\vdots\\ a_{n1}(t)&a_{n2}(t)&\cdots&a_{nn}(t)\\ \end{vmatrix} =\sum_{{k_1+k_2+\cdots+k_n=k}\atop{k_j\geq 0, 1\leq j\leq n}}\frac{k!}{k_1!k_2!\ldots k_n!} \begin{vmatrix} a_{11}^{(k_1)}(t)&a_{12}^{(k_1)}(t)&\cdots&a_{1n}^{(k_1)}(t)\\ a_{21}^{(k_2)}(t)&a_{22}^{(k_2)}(t)&\cdots&a_{2n}^{(k_2)}(t)\\ \vdots&\vdots&&\vdots\\ a_{n1}^{(k_n)}(t)&a_{n2}^{(k_n)}(t)&\cdots&a_{nn}^{(k_n)}(t)\\ \end{vmatrix} \end{align*}

Here we have the special case $k=2$ and $A(t)=I+tA$ with $A$ an $n\times n$ constant matrix. We obtain \begin{align*} &\color{blue}{\frac{d^{2}}{dt^2} \begin{vmatrix} 1+a_{11}t&a_{12}t&\cdots&a_{1n}t\\ a_{21}t&1+a_{22}t&\cdots&a_{2n}t\\ \vdots&\vdots&&\vdots\\ a_{n1}t&a_{n2}t&\cdots&1+a_{nn}t\\ \end{vmatrix}}\\ &\qquad\quad=\sum_{{k_1+k_2+\cdots+k_n=2}\atop{k_j\geq 0, 1\leq j\leq n}}\frac{2!}{k_1!k_2!\ldots k_n!} \begin{vmatrix} (1+a_{11}t)^{(k_1)}&(a_{12}t)^{(k_1)}&\cdots&(a_{1n}t)^{(k_1)}\\ (a_{21}t)^{(k_2)}&(1+a_{22}t)^{(k_2)}&\cdots&(a_{2n}t)^{(k_2)}\\ \vdots&\vdots&&\vdots\\ (a_{n1}t)^{(k_n)}&(a_{n2}t)^{(k_n)}&\cdots&(1+a_{nn}t)^{(k_n)}\\ \end{vmatrix}\\ &\qquad\quad\color{blue}{=2\sum_{{k_1+k_2+\cdots+k_n=2}\atop{0\leq k_j\leq 1, 1\leq j\leq n}}\frac{1}{k_1!k_2!\ldots k_n!} \begin{vmatrix} (1+a_{11}t)^{(k_1)}&(a_{12}t)^{(k_1)}&\cdots&(a_{1n}t)^{(k_1)}\\ (a_{21}t)^{(k_2)}&(1+a_{22}t)^{(k_2)}&\cdots&(a_{2n}t)^{(k_2)}\\ \vdots&\vdots&&\vdots\\ (a_{n1}t)^{(k_n)}&(a_{n2}t)^{(k_n)}&\cdots&(1+a_{nn}t)^{(k_n)}\\ \end{vmatrix}} \end{align*} Observe the simplification of the index range in the last step, where we restrict to $0\leq k_j\leq 1$: if some index had $k_j=2$, the corresponding second derivative would produce zeros throughout the $j$-th row, so that determinant evaluates to $0$.

Example: $n=2$

Let's do the calculation for small $n=2$. We get \begin{align*} \frac{d^{2}}{dt^2} \begin{vmatrix} 1+a_{11}t&a_{12}t\\ a_{21}t&1+a_{22}t\\ \end{vmatrix} &=2\sum_{{k_1+k_2=2}\atop{0\leq k_j\leq 1, 1\leq j\leq 2}}\frac{1}{k_1!k_2!} \begin{vmatrix} (1+a_{11}t)^{(k_1)}&(a_{12}t)^{(k_1)}\\ (a_{21}t)^{(k_2)}&(1+a_{22}t)^{(k_2)}\\ \end{vmatrix}\\ &=2 \begin{vmatrix} (1+a_{11}t)^{(1)}&(a_{12}t)^{(1)}\\ (a_{21}t)^{(1)}&(1+a_{22}t)^{(1)}\\ \end{vmatrix}\\ &=2 \begin{vmatrix} a_{11}&a_{12}\\ a_{21}&a_{22}\\ \end{vmatrix}\\ &=2(a_{11}a_{22}-a_{12}a_{21})\tag{1} \end{align*}

On the other hand we have \begin{align*} \frac{d^{2}}{dt^2} \begin{vmatrix} 1+a_{11}t&a_{12}t\\ a_{21}t&1+a_{22}t\\ \end{vmatrix} &=\frac{d^{2}}{dt^2}\left[(1+a_{11}t)(1+a_{22}t)-a_{12}a_{21}t^2\right]\\ &=\frac{d^{2}}{dt^2}\left[1+a_{11}t+a_{22}t+a_{11}a_{22}t^2-a_{12}a_{21}t^2\right]\\ &=2(a_{11}a_{22}-a_{12}a_{21}) \end{align*} in accordance with (1).

Note the answer of @DavidCUllrich coincides with the result above \begin{align*} \color{blue}{(\text{tr}A)^2-\text{tr}(A^2)} &=(a_{11}+a_{22})^2-\text{tr}\begin{pmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{pmatrix}^2\\ &=(a_{11}+a_{22})^2-\text{tr}\begin{pmatrix}a_{11}^2+a_{12}a_{21}&\ast \\ \ast&a_{12}a_{21}+a_{22}^2\end{pmatrix}\\ &\,\,\color{blue}{=2(a_{11}a_{22}-a_{12}a_{21})} \end{align*}

Botond
Markus Scheuer
3

For small values of $t$, define the following matrix variables and their derivatives $$\eqalign{ \def\qiq{\quad\implies\quad} \def\trace{\operatorname{tr}} B &= (I+tA)^{-1} &\qiq \dot B = -BAB \\ C &= AB &\qiq \dot C = -C^2 \\ }$$ The Jacobi formula can be used to calculate the first derivative of your function $$\eqalign{ f &= \det(I+tA) \qiq \dot f &= f\,\trace(AB) = f\,\trace(C) \\ }$$ After which it is easy to calculate the second derivative $$\eqalign{ \ddot f &= \trace(C)\;\dot f \;+\; f\,\trace(\dot C) \\ &= \trace(C)\;f\,\trace(C) \;-\; f\,\trace(C^2) \\ &= f\:\Big[\trace(C)^2 - \trace(C^2)\Big] \\ }$$ and the third derivative $$\eqalign{ \dddot f &= \dot f\:\Big[\trace(C)^2 - \trace(C^2)\Big] + f\:\Big[2\trace(C)\trace(\dot C) - \trace(\dot CC+C\dot C)\Big] \\ &= f\:\Big[\trace(C)^3 - \trace(C)\trace(C^2)\Big] + f\:\Big[-2\trace(C)\trace(C^2) + 2\,\trace(C^3)\Big] \\ &= f\:\Big[\trace(C)^3 - 3\trace(C)\trace(C^2) + 2\,\trace(C^3)\Big] \\ }$$ Higher derivatives can be obtained by elementary differentiation followed by substituting $\dot f$ and $\,\dot C$ in terms of $f$ and $C$. It always reduces to a single $f$ multiplied by a polynomial involving traces of various powers of $C$.

The only remaining task is to evaluate these expressions at $t=0,\,$ which involves substituting the following two quantities $$\eqalign{ \lim_{t\to 0}\:f &= {\tt1}, \qquad\lim_{t\to 0}\:C &= A \\ }$$
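A quick exact check of the second and third derivative formulas at $t=0$ (where $f=1$ and $C=A$) is sketched below. It assumes a $3\times 3$ matrix, for which $\det(I+tA)=1+e_1t+e_2t^2+\det(A)\,t^3$ with $e_2$ the sum of the principal $2\times 2$ minors, so that $\ddot f(0)=2e_2$ and $\dddot f(0)=6\det(A)$. The matrix and the helper names are illustrative assumptions, not part of the derivation above.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def principal_2x2_minors_sum(M):
    """Sum of the 2x2 principal minors, i.e. the t^2 coefficient of det(I + tM)."""
    n = len(M)
    return sum(M[r][r] * M[s][s] - M[r][s] * M[s][r]
               for r in range(n) for s in range(r + 1, n))


A = [[2, -1, 0], [3, 1, 4], [0, 5, -2]]  # arbitrary integer matrix (an assumption)
A2 = matmul(A, A)
A3 = matmul(A2, A)
t1, t2, t3 = trace(A), trace(A2), trace(A3)

# Trace formulas evaluated at t = 0, where f = 1 and C = A.
second = t1 ** 2 - t2
third = t1 ** 3 - 3 * t1 * t2 + 2 * t3

print(second == 2 * principal_2x2_minors_sum(A), third == 6 * det3(A))
```

Integer entries keep the arithmetic exact, so the two comparisons are genuine equalities rather than floating-point approximations.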

greg
1

Here is another solution based on the known properties of the Wronskian of linear differential equations:


Consider the system \begin{align} \dot{\mathbf{x}}=B(t)\mathbf{x},\qquad \mathbf{x}(0)=\mathbf{x}_0\tag{1}\label{one} \end{align} where $t\mapsto B(t)$ is a map from a neighborhood of $0$ into the space of real $n\times n$ matrices. Let $\phi(t;\mathbf{y})$ denote the solution to $\eqref{one}$ with $\phi(0;\mathbf{y})=\mathbf{y}$. It is known that the Wronskian $W(t)=\operatorname{det}(\partial_{\mathbf{y}}\phi(t;\mathbf{y}))$ satisfies \begin{align} \dot{W}=\operatorname{Trace}(B(t))W,\qquad W(0)=1\tag{2}\label{two} \end{align}


Define $\phi(t;\mathbf{y})=(I+tA)\mathbf{y}$. Then $\phi$ satisfies the differential equation $$ \dot{\mathbf{x}}=B(t)\mathbf{x},\qquad \mathbf{x}(0)=\mathbf{y} $$ in a small neighborhood of $0$ (small enough that $I+tA$ remains invertible), where $B(t)=A(I+tA)^{-1}$; indeed, $\dot{\phi}=A\mathbf{y}=A(I+tA)^{-1}\phi$.

It follows that $W(t)=\operatorname{det}(I+tA)$, and by equation \eqref{two}, \begin{align} \ddot{W}&=\operatorname{Trace}(\dot{B}(t))W + \operatorname{Trace}(B(t))\dot{W}\\ &=\operatorname{Trace}(\dot{B}(t))W + \Big(\operatorname{Trace}(B(t))\Big)^2W \end{align} By an application of the chain rule, and since $A$ commutes with $(I+tA)^{-1}$, $\dot{B}(t)=-A^2\big(I+tA\big)^{-2}$; hence $\dot{B}(0)=-A^2$. Putting things together, we obtain \begin{align} \ddot{W}(0)&=\operatorname{Trace}(\dot{B}(0))W(0) + \Big(\operatorname{Trace}(B(0))\Big)^2W(0)\\ &=- \operatorname{Trace}(A^2) +\Big(\operatorname{Trace}(A)\Big)^2 \end{align}

Mittens
1

Most of the existing answers make this much too complicated. $f(t) = \det (I + At)$ is a polynomial; differentiating polynomials is easy. Writing $f(t) = \sum_{i=0}^n f_i t^i$ we of course have

$$f^{(k)}(0) = k! f_k$$

for any $k$, so it remains to characterize the coefficients $f_k$ of the characteristic polynomial. Since $f(t) = \prod_{i=1}^n (1 + \lambda_i t)$ we have that $f_k$ is the $k^{th}$ elementary symmetric polynomial in the eigenvalues $\lambda_i$, about which much is known. On the other hand the traces $\text{tr}(A^k) = \sum \lambda_i^k$ are the power sum symmetric polynomials in the eigenvalues, and much is known about how to relate these to the elementary symmetric polynomials; see the Wikipedia article on Newton's identities, for example.

In matrix terms $f_k$ can also be expressed as the trace $\text{tr}(\Lambda^k A)$ of the $k^{th}$ exterior power of $A$.
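The route through Newton's identities mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the example matrix is arbitrary, the function names are made up here, and the recurrence used is the standard form $k\,e_k=\sum_{i=1}^{k}(-1)^{i-1}e_{k-i}\,p_i$ with $p_i=\text{tr}(A^i)$ and $e_0=1$, where $e_k$ plays the role of $f_k$.

```python
from fractions import Fraction
from math import factorial


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def power_sums(A):
    """Return [p_1, ..., p_n] with p_k = tr(A^k)."""
    n = len(A)
    sums, P = [], A
    for _ in range(n):
        sums.append(sum(P[i][i] for i in range(n)))
        P = matmul(P, A)
    return sums


def elementary_from_power_sums(p):
    """Newton's identities: k * e_k = sum_{i=1}^{k} (-1)^(i-1) * e_{k-i} * p_i."""
    e = [Fraction(1)]  # e_0 = 1
    for k in range(1, len(p) + 1):
        s = sum((-1) ** (i - 1) * e[k - i] * p[i - 1] for i in range(1, k + 1))
        e.append(Fraction(s) / k)
    return e


A = [[2, -1, 0], [3, 1, 4], [0, 5, -2]]  # arbitrary illustrative matrix (an assumption)
e = elementary_from_power_sums(power_sums(A))

# f^{(k)}(0) = k! * f_k, with f_k = e_k the k-th elementary symmetric polynomial.
derivatives_at_0 = [factorial(k) * e[k] for k in range(len(e))]
print(derivatives_at_0)
```

Only traces of powers of $A$ are needed, so no eigenvalues have to be computed; the entry at index 2 of `derivatives_at_0` is the second derivative asked about in the question.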

Qiaochu Yuan
0

The computation is done in this old arxiv.org preprint:

Igor Rivin