
I'm trying to compute the following matrix derivative: $$\frac{d}{dx}(I + x\Sigma)^{1/2},$$ where $I$ is the identity matrix and $\Sigma$ is a constant (positive definite) matrix that does not depend on $x$.

I tried searching for relevant material, such as Wikipedia and the Matrix Cookbook, but I could not find a formula for the derivative of a matrix raised to the power $1/2$.

When $U(x)$ is a matrix-valued function of $x$, is there a formula for $\frac{d}{dx}U(x)^{1/2}$? If so, how can I derive it? I tried using $\frac{d(UV)}{dx} = \frac{dU}{dx}V + U\frac{dV}{dx}$, where $U,V$ are functions of $x$, but I'm not sure whether this leads to the desired result.

Any help regarding this question would be appreciated. Thank you.

jason 1

4 Answers


Write $F(x) = I+x\Sigma$, so $F'(x) = \Sigma$. The important observation is that $F(x)$ and $F'(x)$ commute. Since they commute, the scalar power rule carries over: $$ \frac{d}{dx} \big(F(x)^s\big) = s F(x)^{s-1} F'(x). $$ With $s=\tfrac12$, this gives $$\frac{d}{dx}(I+x\Sigma)^{1/2} = \tfrac12\,(I+x\Sigma)^{-1/2}\,\Sigma,$$ so in the OP's case one still has to compute $(I+x\Sigma)^{-1/2}$ to finish.
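
For a quick numerical sanity check (my own addition, not part of the answer), here is a small numpy sketch comparing the claimed derivative $\tfrac12 F(x)^{-1/2}\Sigma$ with a finite-difference approximation; the test matrix and the helper `sym_sqrt` are arbitrary choices for illustration:

```python
# Finite-difference check of d/dx (I + x*Sigma)^{1/2} = (1/2)(I + x*Sigma)^{-1/2} Sigma.
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)        # symmetric positive definite test matrix

def sym_sqrt(M):
    """Principal square root of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(w)) @ V.T

x, h = 0.7, 1e-6
I = np.eye(n)

fd = (sym_sqrt(I + (x + h) * Sigma) - sym_sqrt(I + x * Sigma)) / h   # finite difference
exact = 0.5 * np.linalg.inv(sym_sqrt(I + x * Sigma)) @ Sigma         # claimed derivative

print(np.max(np.abs(fd - exact)))      # small, on the order of h
```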

GEdgar

As $\Sigma$ is a symmetric positive definite matrix, we can diagonalize it as $$\Sigma = P^{-1} D P $$ where

  • $D = \mathrm{diag}(d_1,\ldots,d_n)$ is a diagonal matrix with positive entries (as $\Sigma$ is positive definite), and
  • $P\in \mathbb{R}^{n\times n}$ and $P^T = P^{-1}$ (as $\Sigma$ is real symmetric)

Then: $$I + x\Sigma = P^{-1}P + P^{-1}(xD)P = P^{T}(I +xD)P = B^TB = B^2 \tag{1}$$ where $B:= P^T\Lambda P$ with

  • $\Lambda \in \mathbb{R}^{n\times n}$ a diagonal matrix with entries $\sqrt{1+xd_i}$ for $i=1,\ldots,n$, which is well defined (with positive entries) whenever $1+xd_i>0$ for all $i$, i.e. $x > -1/\max_{i} d_i$; in particular for every $x\ge 0$.

Remark: more detail for the last equalities of $(1)$:

  • $B = P^T\Lambda P$ is symmetric, so $B^TB = B^2 = P^T\Lambda^2 P = P^T(I+xD)P$. Moreover $B$ is positive definite, since its eigenvalues are the diagonal entries $\sqrt{1+xd_i}>0$ of $\Lambda$. Under the condition in the next bullet, $I+x\Sigma$ is symmetric positive definite, hence has a unique positive definite square root; therefore $B = (I+x\Sigma)^{1/2}$.
  • The diagonal matrix $(I +xD)$ is positive definite if and only if $1+xd_i>0$ for every $i$, that is $x > -1/\max_{i} d_i$. This condition is necessary and sufficient for $I+x\Sigma$ to be positive definite and thus to admit the positive definite square root $B = P^T\underbrace{(I +xD)^{1/2}}_{=\Lambda}P$.

Finally, $$\frac{d}{dx}\left(I + x\Sigma \right)^{1/2} = \frac{d}{dx}B = P^T\left(\frac{d}{dx}\Lambda \right)P = P^T\,\Gamma\, P$$ where $\Gamma\in \mathbb{R}^{n\times n}$ is a diagonal matrix with entries $\frac{d}{dx}\left(\sqrt{1+xd_i}\right) = \frac{d_i}{2\sqrt{1+xd_i}}$ for $i=1,\ldots,n$, valid on the domain $\{x \mid x > -1/\max_{i} d_i\}$. One can check that $P^T\Gamma P = \frac12\left(I+x\Sigma\right)^{-1/2}\Sigma$, in agreement with the other answers.
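
To make the construction concrete (a sketch I added, not part of the answer), the following numpy snippet builds the derivative from the eigendecomposition $\Sigma = P^T D P$ and checks it against the closed form $\tfrac12 (I + x\Sigma)^{-1/2}\Sigma$; the test matrix and variable names are arbitrary:

```python
# Build d/dx (I + x*Sigma)^{1/2} from the eigendecomposition Sigma = P^T D P
# (here P = V^T with V from np.linalg.eigh) and compare with the closed form.
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)                     # symmetric positive definite
x = 0.3

d, V = np.linalg.eigh(Sigma)                        # Sigma = V diag(d) V^T, i.e. P = V^T
Lambda = np.diag(np.sqrt(1.0 + x * d))              # diagonal entries sqrt(1 + x d_i)
Gamma = np.diag(d / (2.0 * np.sqrt(1.0 + x * d)))   # d/dx sqrt(1 + x d_i)

deriv_eig = V @ Gamma @ V.T                         # P^T Gamma P
B = V @ Lambda @ V.T                                # (I + x Sigma)^{1/2} = P^T Lambda P
deriv_closed = 0.5 * np.linalg.inv(B) @ Sigma       # formula from the other answers

print(np.max(np.abs(deriv_eig - deriv_closed)))     # ~ machine precision
```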

NN2

Proposition. Let $S_{(a,b)}$ be the set of real symmetric matrices $s$ of order $n$ whose eigenvalues $(\lambda_1,\ldots,\lambda_n)$ lie in the interval $(a,b)\subset \mathbb{R}$, and let $f:(a,b)\rightarrow \mathbb{R}$ be continuously differentiable. Finally, let $\tilde{f}: S_{(a,b)}\rightarrow S := S_{\mathbb{R}}$ be defined as follows: if $e=(e_1,\ldots,e_n)$ is an orthonormal basis diagonalizing $s$, namely $[s]_e^e= \mathrm{diag}(\lambda_1,\ldots,\lambda_n)$, then

$$[\tilde{f}(s)]_e^e= \mathrm{diag}(f(\lambda_1),\ldots,f(\lambda_n)).$$ Then $\tilde{f}(s)$ is well defined, in the sense that it does not depend on the particular orthonormal basis $e$. Furthermore, $s\mapsto \tilde{f}(s)$ is differentiable and its differential $h\mapsto \tilde{f}'(s)(h)$ is the endomorphism of $S$ given by $$[\tilde{f}'(s)(h)]_e^e=\big[g(\lambda_i,\lambda_j)\,h_{ij}\big]_{1\leq i,j\leq n},$$ where $[h]_e^e=[h_{ij}]_{1\leq i,j\leq n}$, $g(x,y)=\frac{f(x)-f(y)}{x-y}$ if $x\neq y$, and $g(x,x)=f'(x)$. In particular, if $f(x)=\sqrt{x}$, the differential of $\tilde{f}(s)=\sqrt{s}$ is $$h\mapsto [\tilde{f}'(s)(h)]_e^e=\big[h_{ij}/(\sqrt{\lambda_i}+\sqrt{\lambda_j})\big].$$ For the OP's question, apply this with $s = I + x\Sigma$ and $h = \Sigma = \frac{d}{dx}(I+x\Sigma)$.
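
As an illustration (my addition, not part of the proposition), the following numpy sketch evaluates this divided-difference formula for $f(x)=\sqrt{x}$ in an eigenbasis of an arbitrary symmetric positive definite $s$, and compares it with a finite-difference approximation of $\sqrt{s+th}$:

```python
# Divided-difference formula for the differential of s -> sqrt(s) at a symmetric PD matrix s,
# applied to a symmetric direction h, checked against a finite difference.
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
s = A @ A.T + n * np.eye(n)                    # symmetric positive definite
H = rng.standard_normal((n, n)); H = H + H.T   # symmetric direction h

lam, V = np.linalg.eigh(s)                     # s = V diag(lam) V^T

# differential in the eigenbasis: [ h_ij / (sqrt(l_i) + sqrt(l_j)) ]
H_e = V.T @ H @ V
D_e = H_e / (np.sqrt(lam)[:, None] + np.sqrt(lam)[None, :])
dfs = V @ D_e @ V.T                            # back to the standard basis

def sym_sqrt(M):
    w, U = np.linalg.eigh(M)
    return U @ np.diag(np.sqrt(w)) @ U.T

t = 1e-6
fd = (sym_sqrt(s + t * H) - sym_sqrt(s)) / t   # finite-difference approximation
print(np.max(np.abs(dfs - fd)))                # small, on the order of t
```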


Supplement.

Assume that $I + x \Sigma$ is positive definite, and take $h \ne 0$ such that $I + (x + h) \Sigma$ is also positive definite. We have $$\left[(I + (x + h) \Sigma)^{1/2} + (I + x \Sigma)^{1/2}\right]\left[(I + (x + h) \Sigma)^{1/2} - (I + x \Sigma)^{1/2}\right] = h \Sigma \tag{1}$$ where we use $(I + x \Sigma)^{1/2} (I + (x+h) \Sigma)^{1/2} = (I + (x+h) \Sigma)^{1/2} (I + x \Sigma)^{1/2}$ (this follows from the eigenvalue decomposition $\Sigma = U D U^\top$ and the fact that $(U D_1 U^\top)^{1/2} = UD_1^{1/2}U^\top$ for a positive definite diagonal matrix $D_1$).

From (1), we have $$\frac{1}{h} \left[(I + (x + h) \Sigma)^{1/2} - (I + x \Sigma)^{1/2}\right] = \left[(I + (x + h) \Sigma)^{1/2} + (I + x \Sigma)^{1/2}\right]^{-1}\Sigma . \tag{2}$$

Letting $h \to 0$ in (2), we obtain $$\frac{\mathrm{d} }{\mathrm{d} x} (I + x \Sigma)^{1/2} = \frac12 (I + x \Sigma)^{-1/2} \Sigma.$$
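
A minimal numerical check of identity (1) (my addition, not part of the answer), using an arbitrary positive definite $\Sigma$: the two square roots indeed commute, and the factorization on the left of (1) reproduces $h\Sigma$:

```python
# Check that the square roots commute and that (S1 + S0)(S1 - S0) = h * Sigma.
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)     # symmetric positive definite
x, h = 0.5, 0.2
I = np.eye(n)

def sym_sqrt(M):
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(w)) @ V.T

S1 = sym_sqrt(I + (x + h) * Sigma)
S0 = sym_sqrt(I + x * Sigma)

print(np.max(np.abs(S1 @ S0 - S0 @ S1)))                  # commutation, ~ machine precision
print(np.max(np.abs((S1 + S0) @ (S1 - S0) - h * Sigma)))  # identity (1), ~ machine precision
```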

River Li