
I have a function $\Gamma(x): \mathbb{R}^{n} \to \mathbb{R}^{2n \times m}$ defined as $$ \Gamma(x) := \begin{bmatrix} X_{11}(x) & X_{12}(x) \\ X_{21}(x) & X_{22}(x) \end{bmatrix}^{-1} B $$ where $B \in \mathbb{R}^{2n \times m}$ is constant and $X_{ij}(x) \in \mathbb{R}^{n \times n}$. How can I find $\Gamma^{\prime}(x)$? I'm trying to use the chain rule and Cramer's rule for the adjugate but am stuck.

In particular, I define $$ X(x) := \begin{bmatrix} X_{11}(x) & X_{12}(x) \\ X_{21}(x) & X_{22}(x) \end{bmatrix} $$ so that $$\frac{{\rm d}}{{\rm d}X}\Gamma = -(X^{-1}B)^{\top} \otimes X^{-1}$$ but I am confused about pushing the chain rule through for $$\frac{{\rm d}X}{{\rm d}x}.$$

What I'm trying to do is something like: \begin{align} \frac{{\rm d}}{{\rm d} x} \Gamma &= \frac{{\rm d}\Gamma}{{\rm d}X} \cdot \frac{{\rm d}X}{{\rm d}x} \\ &= \left( -(X^{-1}B)^{\top} \otimes X^{-1} \right) \cdot \frac{{\rm d}X}{{\rm d}x} \end{align} but am unsure what sort of product "$\cdot$" denotes here. Differentiating entry-by-entry in $x$ gives $$ {\rm vec} \left( \frac{{\rm d}\Gamma}{{\rm d}x_i}(x) \right) = -(B^{\top} \otimes I_{2n}) \cdot {\rm vec}\left( X^{-1}(x) \cdot \frac{\partial}{\partial x_i}X(x) \cdot X^{-1}(x) \right), $$ and stacking the resulting vectors horizontally gives $\frac{{\rm d}}{{\rm d} x} \Gamma$, but I'm not sure how to represent this object more compactly via tensors / Krons / Frobenius inner products. Is $$ \frac{{\rm d}}{{\rm d} x} \Gamma = \left( -(X^{-1}B)^{\top} \otimes X^{-1} \right) : \frac{{\rm d}X}{{\rm d}x} $$ right? Would it be easier to use tensor notation?
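For concreteness, here is a rough numerical sanity check of the entry-by-entry formula above (a sketch only: the dimensions, the affine choice of $X(x)$, and the finite-difference step are arbitrary, and column-major vec is assumed):

```python
import numpy as np

# Sanity check of  vec(dGamma/dx_i) = -(B^T kron I_{2n}) vec(X^{-1} dX/dx_i X^{-1}).
# The dimensions, the affine map X(x) = A0 + sum_i x_i A_i, and the step size
# are arbitrary illustrative choices; vec is column-major (flatten('F')).
rng = np.random.default_rng(0)
n, m = 3, 2
B = rng.standard_normal((2 * n, m))
A0 = rng.standard_normal((2 * n, 2 * n))        # constant part of X(x)
A = rng.standard_normal((n, 2 * n, 2 * n))      # A[i] = dX/dx_i for this example

def X(x):
    return A0 + np.einsum('i,ijk->jk', x, A)

def Gamma(x):
    return np.linalg.solve(X(x), B)

x0 = 0.01 * rng.standard_normal(n)
Xinv = np.linalg.inv(X(x0))
eps = 1e-6

for i in range(n):
    dXi = A[i]
    analytic = -np.kron(B.T, np.eye(2 * n)) @ (Xinv @ dXi @ Xinv).flatten('F')
    e = np.zeros(n)
    e[i] = eps
    fd = (Gamma(x0 + e) - Gamma(x0 - e)) / (2 * eps)      # central difference
    print(i, np.max(np.abs(analytic - fd.flatten('F'))))  # should be tiny
```

Stacking the `analytic` vectors as columns (one per $x_i$) gives the horizontally stacked object I describe above.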

jjjjjj

2 Answers


If you consider the derivative in vectorized form it becomes clearer. For a scalar parameter $x$, $$ \mathrm{vec}\left(\frac{d\Gamma}{dx}\right)=\mathrm{vec}\left(\frac{d}{dx}\left(X^{-1} B\right)\right) = -\left((X^{-1}B)^\top \otimes X^{-1}\right) \mathrm{vec}\left(X'(x)\right) $$

Now using the identity $$(A\otimes B)\,\mathrm{vec}(C) = \mathrm{vec}(BCA^\top)$$ we get

$$ \frac{d\Gamma}{dx} = - X^{-1}X'(x)\,X^{-1}B = -X^{-1}X'(x)\,\Gamma $$
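As a quick sanity check of that identity, here is a small numerical illustration (a sketch only; the matrix sizes are arbitrary and vec is taken column-major, i.e. NumPy's `flatten('F')`):

```python
import numpy as np

# Numerical illustration of (A kron B) vec(C) = vec(B C A^T), with vec taken
# column-major (NumPy's flatten('F')). All sizes here are arbitrary.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
Bm = rng.standard_normal((5, 6))   # named Bm to avoid clashing with the B in the post
C = rng.standard_normal((6, 3))

lhs = np.kron(A, Bm) @ C.flatten('F')
rhs = (Bm @ C @ A.T).flatten('F')
print(np.max(np.abs(lhs - rhs)))   # zero up to roundoff
```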

Peder
  • thanks, and am I correct that this also assumes $x$ is scalar? – jjjjjj May 02 '19 at 20:24
  • If $x$ is not scalar, the result will no longer be a matrix. If $x$ is a matrix, then the derivative $\Gamma'(x)$ would have the same size as your Kronecker product, and you would have to vec the Kronecker product as well to use the chain rule. – Peder May 02 '19 at 21:42
  • Btw, is $Y$ in your second display $C$? – jjjjjj May 02 '19 at 21:44
  • Would you mind explaining vec-ing the Kronecker product and using the chain rule in more detail in the case where $x$ is a vector? – jjjjjj May 02 '19 at 22:33
  • Sorry, $Y$ should have been $C$. I made a correction. – Peder May 03 '19 at 03:15
  • Why do you start by computing $\mathrm{vec}(\frac{\partial}{\partial X} (\Gamma B))$? Don't we want $\mathrm{vec}(\frac{\partial}{\partial x}(\Gamma))$? – jjjjjj May 03 '19 at 03:58
  • More errors on my part. I corrected it now. – Peder May 03 '19 at 04:35
  • But I think you compute derivative wrt $X$? Whereas I'm seeking wrt $x$ – jjjjjj May 03 '19 at 14:37
  • That's another typo. I wrote d/dX, but calculated d/dx. – Peder May 09 '19 at 14:29

For ease of typing, use a dot to denote the derivative with respect to the scalar parameter, e.g. $$\dot F = \frac{dF}{dx}$$

Now consider the derivative of a matrix inverse:
\begin{align}
I &= XY \quad\implies\quad Y = X^{-1} \\
\dot I &= \dot XY + X\dot Y \\
0 &= \dot XY + X\dot Y \quad\implies\quad \dot Y = -Y\dot XY
\end{align}

Applying this to the current problem yields
\begin{align}
\Gamma &= X^{-1}B = YB \\
\dot\Gamma &= \dot YB = -Y\dot XYB = -X^{-1}\dot X\,\Gamma
\end{align}
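For what it's worth, the last line can be checked numerically with finite differences (a sketch only; the affine curve $X(x) = X_0 + x\,X_1$ and the dimensions are arbitrary illustrative choices):

```python
import numpy as np

# Finite-difference check of  dGamma/dx = -X^{-1} X' Gamma  for scalar x.
# The curve X(x) = X0 + x*X1 is an arbitrary illustrative choice.
rng = np.random.default_rng(2)
n, m = 3, 2
X0 = rng.standard_normal((2 * n, 2 * n))
X1 = rng.standard_normal((2 * n, 2 * n))      # X'(x) for this affine example
B = rng.standard_normal((2 * n, m))

X = lambda x: X0 + x * X1
Gamma = lambda x: np.linalg.solve(X(x), B)

x0, eps = 0.1, 1e-6
analytic = -np.linalg.solve(X(x0), X1 @ Gamma(x0))
fd = (Gamma(x0 + eps) - Gamma(x0 - eps)) / (2 * eps)
print(np.max(np.abs(analytic - fd)))          # should be tiny
```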

greg