
Let $a$ and $b$ be $n \times 1$ vectors and let $A(x)$ be an $n \times n$ matrix that depends on a scalar parameter $x$. I would like to compute the derivative $$\frac{d}{dx}\left(a^TA(x)^{-1}b\right).$$

I know that $$\frac{da^TA^{-1}b}{dA} = - A^{-T}ab^T A^{-T}$$ from here.

But I could not derive the result I want.

user1292919

2 Answers

$ \def\A{A^{-1}} \def\At{A^{-T}} \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\frob#1{\left\| #1 \right\|_F} \def\qiq{\quad\implies\quad} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\Dx#1{\frac{d #1}{dx}} \def\DxLR#1{\LR{\Dx{#1}}} $Take the known gradient of the function $\:\phi = {a^T\A b}$
$$\eqalign{ \grad{\phi}{A} &= -\At ab^T\At \\ }$$
and rewrite it as a differential, using the double-dot (Frobenius) product $M:N = \trace{M^TN}$,
$$\eqalign{ d\phi &= -\LR{\At ab^T\At}: dA \\ }$$
Since $A$ depends only on the scalar $x$, dividing by $dx$ gives the derivative with respect to $x$
$$\eqalign{ \Dx\phi &= -\LR{\At ab^T\At}: \DxLR A \\ }$$
Finally, expand the double-dot product as a trace and use the cyclic property of the trace to reach a simpler form
$$\eqalign{ \Dx\phi &= -\trace{\A ba^T\A\DxLR A} = -a^T\A\DxLR A\A b \\ }$$
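As a sanity check, the closed form above can be compared with a central finite difference. The following is only a minimal sketch in plain Python: the $2 \times 2$ matrix `A(x)`, its derivative `dA(x)`, the vectors `a` and `b`, and the evaluation point `x0` are all made-up values chosen for illustration, not part of the question.

```python
# Hypothetical test problem (assumption): A(x) = [[2 + x, 1], [x^2, 3]].

def A(x):
    return [[2.0 + x, 1.0], [x * x, 3.0]]

def dA(x):
    # Entrywise derivative of A(x) with respect to x.
    return [[1.0, 0.0], [2.0 * x, 0.0]]

def inv2(M):
    # Inverse of a 2x2 matrix via the adjugate formula.
    (p, q), (r, s) = M
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def matvec(M, v):
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def dot(u, v):
    return sum(s * t for s, t in zip(u, v))

a = [1.0, -2.0]  # arbitrary constant vectors (assumption)
b = [0.5, 4.0]

def phi(x):
    # phi(x) = a^T A(x)^{-1} b
    return dot(a, matvec(inv2(A(x)), b))

def dphi(x):
    # Closed form: -a^T A^{-1} (dA/dx) A^{-1} b
    Ainv = inv2(A(x))
    return -dot(a, matvec(Ainv, matvec(dA(x), matvec(Ainv, b))))

x0, h = 0.7, 1e-6
central = (phi(x0 + h) - phi(x0 - h)) / (2 * h)
print(dphi(x0), central)  # the two values should agree to several digits
```

The two printed numbers should agree up to finite-difference error, which is consistent with the formula but of course not a proof of it.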

greg

Such single-variable derivatives can be computed from first principles using dual numbers as an automatic differentiation tool. Write $$\frac{\partial A}{\partial x} = A'$$ so the dual expansion of $A$ is $A + A' \varepsilon$, where $\varepsilon^2 = 0$. Here $\varepsilon$ can be treated as a scalar in matrix and vector multiplication.

To find $(A + A' \varepsilon)^{-1}$, write it as $A^{-1} + B \varepsilon$ and solve $$1 = (A + A' \varepsilon)(A^{-1} + B \varepsilon) = 1 + (A B + A' A^{-1}) \varepsilon$$ for $B$, where $1$ denotes the identity matrix and the $\varepsilon^2$ term vanishes. The $\varepsilon$ coefficient must be zero, so $B = -A^{-1} A' A^{-1}$. This gives $$(A + A' \varepsilon)^{-1} = A^{-1} - A^{-1} A' A^{-1} \varepsilon.$$

Now $a$ and $b$ are constant in $x$, so the dual expansion of your function is $$a^{\mathrm t}(A^{-1} - A^{-1} A' A^{-1} \varepsilon) b = a^{\mathrm t} A^{-1} b - a^{\mathrm t} A^{-1} A' A^{-1} b \, \varepsilon.$$

The $\varepsilon$ coefficient in this expression gives the derivative $$- a^{\mathrm t} A^{-1} A' A^{-1} b.$$
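The same idea can be tried numerically. The sketch below is a minimal illustration rather than a definitive implementation: it defines a small hypothetical `Dual` class and applies it entrywise, which is equivalent to expanding $A + A' \varepsilon$ as a matrix. The $2 \times 2$ matrix `A(x)`, the vectors `a` and `b`, and the point `x0` are made-up values chosen only for illustration.

```python
class Dual:
    """Dual number re + ep*eps with eps**2 == 0 (hypothetical helper class)."""

    def __init__(self, re, ep=0.0):
        self.re, self.ep = re, ep

    @staticmethod
    def lift(v):
        # Promote plain numbers to duals with zero eps part.
        return v if isinstance(v, Dual) else Dual(v)

    def __add__(self, o):
        o = Dual.lift(o)
        return Dual(self.re + o.re, self.ep + o.ep)
    __radd__ = __add__

    def __neg__(self):
        return Dual(-self.re, -self.ep)

    def __sub__(self, o):
        return self + (-Dual.lift(o))

    def __rsub__(self, o):
        return Dual.lift(o) - self

    def __mul__(self, o):
        o = Dual.lift(o)
        return Dual(self.re * o.re, self.re * o.ep + self.ep * o.re)
    __rmul__ = __mul__

    def __truediv__(self, o):
        o = Dual.lift(o)
        return Dual(self.re / o.re,
                    (self.ep * o.re - self.re * o.ep) / (o.re * o.re))

    def __rtruediv__(self, o):
        return Dual.lift(o) / self


def A(x):
    # Hypothetical parameter-dependent matrix (assumption).
    return [[2.0 + x, 1.0], [x * x, 3.0]]

def inv2(M):
    # 2x2 inverse via the adjugate; works for floats and for Dual entries.
    (p, q), (r, s) = M
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def matvec(M, v):
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def dot(u, v):
    return sum(s * t for s, t in zip(u, v))

a = [1.0, -2.0]  # arbitrary constant vectors (assumption)
b = [0.5, 4.0]

def phi(x):
    # phi(x) = a^T A(x)^{-1} b, valid for float or Dual x.
    return dot(a, matvec(inv2(A(x)), b))

x0, h = 0.7, 1e-6
out = phi(Dual(x0, 1.0))  # seed the eps part with dx/dx = 1
central = (phi(x0 + h) - phi(x0 - h)) / (2 * h)
print(out.re, out.ep, central)  # out.ep is the dual-number derivative
```

Here `out.re` is the function value and `out.ep` is the $\varepsilon$ coefficient, which should match the finite-difference estimate `central` to several digits.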

WimC