
I have an expression involving the inverse of a symmetric matrix, which I want to differentiate with respect to one of the elements of that matrix. I've been working through the problem with help from the Matrix Cookbook (https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf) but am unsure whether my working is correct. I'd be glad of advice on whether the derivation below is right, and in particular whether I have correctly adjusted for the fact that the matrix is symmetric.

(This is for some algebra I am working out in order to implement the move step of an MCMC algorithm.)

I have an expression $\mathbf{A}\mathbf{B}^{-1}$, where $\mathbf{A}$ and $\mathbf{B}$ are both square matrices, and $\mathbf{B}$ is symmetric. I want to evaluate the expression:

$\frac{d\mathbf{A}\mathbf{B}^{-1}}{dB_{gh}}$

where $B_{gh}$ is the element of matrix $\mathbf{B}$ in the $g$th row and $h$th column.

So far I have:

$\frac{d\mathbf{A}\mathbf{B}^{-1}}{dB_{gh}} = \{\frac{d\mathbf{A}}{dB_{gh}}\mathbf{B}^{-1}\}+\{\mathbf{A}\frac{d\mathbf{B}^{-1}}{dB_{gh}}\} $

With $\frac{d\mathbf{A}}{dB_{gh}}=\mathbf{0}$ as $\mathbf{A}$ doesn't contain any elements of $\mathbf{B}$.

Then I use the identity

$\frac{d\mathbf{B}^{-1}}{dB_{gh}}= -\mathbf{B}^{-1}\frac{d\mathbf{B}}{dB_{gh}}\mathbf{B}^{-1}$
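The identity above can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy is available and treating every entry of $\mathbf{B}$ as independent (so $\frac{d\mathbf{B}}{dB_{gh}}$ is the single-entry matrix $\mathbf{J}^{gh}$); the test matrix, the indices and the step size `eps` are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, g, h = 4, 1, 2  # arbitrary size and 0-based indices, for illustration only

# Symmetric positive definite test matrix, so it is safely invertible.
M = rng.standard_normal((n, n))
B = M @ M.T + n * np.eye(n)

# Single-entry matrix J^{gh}: derivative of B w.r.t. B_gh when entries are independent.
J = np.zeros((n, n))
J[g, h] = 1.0

Binv = np.linalg.inv(B)
analytic = -Binv @ J @ Binv

# Central finite difference of B^{-1} along the direction J^{gh}.
eps = 1e-6
numeric = (np.linalg.inv(B + eps * J) - np.linalg.inv(B - eps * J)) / (2 * eps)

max_err = np.max(np.abs(analytic - numeric))
print(max_err)  # expected to be tiny if the identity holds
```

The check only perturbs $B_{gh}$, not $B_{hg}$, so it says nothing yet about the symmetric case.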

Then, since $\mathbf{B}$ is symmetric, we have:

$\frac{d\mathbf{B}}{dB_{gh}}=Tr\{\{\frac{d\mathbf{B}}{d\mathbf{B}}\}^T \frac{d\mathbf{B}}{dB_{gh}}\}$

As $\mathbf{B}$ is symmetric, $\frac{d\mathbf{B}}{dB_{gh}}$ evaluates to $\mathbf{S}^{gh}$, where:

$\mathbf{S}^{gh}=\mathbf{J}^{gh}+\mathbf{J}^{hg}-\mathbf{J}^{gh}\mathbf{J}^{gh}$

Where $\mathbf{J}^{gh}$ is a matrix with a $1$ in the $g$th row and $h$th column, with $0$ elsewhere.
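To make the definition of $\mathbf{S}^{gh}$ concrete, here is a small sketch assuming NumPy and 0-based indices; the helper names `single_entry` and `symmetric_structure` are made up for this illustration. It shows that the product term $\mathbf{J}^{gh}\mathbf{J}^{gh}$ vanishes when $g \neq h$ and equals $\mathbf{J}^{gg}$ when $g = h$, so the diagonal case is not counted twice.

```python
import numpy as np

def single_entry(n, g, h):
    """J^{gh}: n x n matrix with a 1 at (g, h) and 0 elsewhere (0-based)."""
    J = np.zeros((n, n))
    J[g, h] = 1.0
    return J

def symmetric_structure(n, g, h):
    """S^{gh} = J^{gh} + J^{hg} - J^{gh} J^{gh}."""
    J = single_entry(n, g, h)
    return J + J.T - J @ J

# Off-diagonal case: ones at (0, 2) and (2, 0).
S_off = symmetric_structure(3, 0, 2)
# Diagonal case: a single one at (1, 1), not a two.
S_diag = symmetric_structure(3, 1, 1)

print(S_off)
print(S_diag)
```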

However, from this question (How to take the derivative of a matrix with respect to itself?), am I correct in understanding that $\frac{d\mathbf{B}}{d\mathbf{B}}$ evaluates to $\mathbf{B}$?

Which would leave the solution to be the following?

$\frac{d\mathbf{A}\mathbf{B}^{-1}}{dB_{gh}}=-\mathbf{B}^{-1}\{\mathbf{B}\mathbf{S}^{gh}\}\mathbf{B}^{-1}$

mes
  • 91

1 Answer


$\def\v{{\rm vec}}\def\p#1#2{\frac{\partial #1}{\partial #2}}\def\E{{\cal E}}$Let $F=AB^{-1}\,$ denote the matrix-valued function and calculate its differential, using $dB^{-1} = -B^{-1}\,dB\,B^{-1}$. $$\eqalign{ dF &= A\,dB^{-1} \\ &= -AB^{-1}\,dB\,B^{-1} \\ &= -F\,dB\,B^{-1} \\ }$$ At this point you have several choices.

If you are comfortable with tensors, then the fourth order identity tensor $$\E = \p{B}{B} \quad\implies\quad \E_{ijk\ell} = \delta_{ik}\delta_{j\ell}$$ can be used to rearrange the differential and extract the tensor-valued gradient $$\eqalign{ dF &= -F\E B^{-T}:dB \\ \p{F}{B} &= -F\E B^{-T} \;\;\in\,{\mathbb R}^{n\times n\times n\times n} \\ }$$ You can also use a Kronecker product to flatten the matrices into vectors and extract a matrix-valued gradient $$\eqalign{ \v(dF) &= -(B^{-T}\otimes F)\;\v(dB) \\ \p{\,\v(F)}{\,\v(B)} &= -B^{-T}\otimes F \;\;\in\,{\mathbb R}^{n^2\times n^2} \\ }$$ $\big[$NB: These first two approaches produce identical components, but in different shapes.$\big]$

You could also define a set of indexed (single-entry) matrices by $$\eqalign{ J_{ik} &= \p{B}{B_{ik}} \;=\; e_ie_k^T \\ }$$ where $\{e_k\}$ are the standard basis vectors. Then substitute this directly into the differential expression to obtain $$\eqalign{ \p{F}{B_{ik}} &= -FJ_{ik}B^{-1} \;\;\in\,{\mathbb R}^{n\times n} \\ }$$

As for the symmetry constraint, please read this post. Afterward, if you still feel compelled to pursue the idea, then one common (but very misleading) interpretation is $$\eqalign{ S_{ik} &= \p{B}{B_{ik}} \;=\; e_ie_k^T + e_ke_i^T - I\odot e_ie_k^T \\ \p{F}{B_{ik}} &= -FS_{ik}B^{-1} \;\;\in\;{\mathbb R}^{n\times n} \\ }$$ where $\odot$ is the elementwise/Hadamard product and $I$ is the identity matrix.
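A finite-difference check is a cheap way to confirm the signs and shapes in these formulas. The sketch below assumes NumPy, uses 0-based indices, and picks an arbitrary random $A$ and a symmetric positive definite $B$; it compares $-FJ_{ik}B^{-1}$ and $-FS_{ik}B^{-1}$ against central differences (the minus sign comes from $dB^{-1} = -B^{-1}\,dB\,B^{-1}$), and reads the same derivative out of the Kronecker form using column-major ${\rm vec}$. For the symmetric variant the perturbation moves $B_{ik}$ and $B_{ki}$ together, which is exactly the interpretation the formula encodes and nothing more.

```python
import numpy as np

rng = np.random.default_rng(1)
n, i, k = 4, 0, 2  # arbitrary size and 0-based indices, for illustration only

A = rng.standard_normal((n, n))
M = rng.standard_normal((n, n))
B = M @ M.T + n * np.eye(n)  # symmetric positive definite, hence invertible

def F_of(Bmat):
    """F = A B^{-1} as a function of B."""
    return A @ np.linalg.inv(Bmat)

Binv = np.linalg.inv(B)
F = F_of(B)

J = np.zeros((n, n))
J[i, k] = 1.0                   # single-entry matrix e_i e_k^T
S = J + J.T - np.eye(n) * J     # e_i e_k^T + e_k e_i^T - I ⊙ e_i e_k^T

def central_diff(direction, eps=1e-6):
    """Central difference of F along a given matrix direction."""
    return (F_of(B + eps * direction) - F_of(B - eps * direction)) / (2 * eps)

# Unconstrained entries: dF/dB_ik = -F J B^{-1}
err_free = np.max(np.abs(central_diff(J) - (-F @ J @ Binv)))

# Symmetric interpretation: B_ik and B_ki perturbed together, -F S B^{-1}
err_sym = np.max(np.abs(central_diff(S) - (-F @ S @ Binv)))

# Kronecker form: the column of -(B^{-T} ⊗ F) belonging to B_ik,
# reshaped column-major, should reproduce -F J B^{-1}.
K = -np.kron(Binv.T, F)
col = K[:, i + k * n].reshape((n, n), order="F")
err_kron = np.max(np.abs(col - (-F @ J @ Binv)))

print(err_free, err_sym, err_kron)
```

All three discrepancies should be at the level of finite-difference or rounding error; a sign mistake in any of the formulas would show up as an error of the same size as the derivative itself.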

greg
  • 35,825