I'm trying to figure out why the derivative of the solve operation is this in the ChainRules package. After reading this, I started writing the derivation below, but I can't seem to get past what I've written.
I know that we have $$ Ax=b\\ x=A\backslash b\\ x = A^{-1}b $$
Furthermore, $$ \begin{align} \frac{\partial L}{\partial A_{ij}} &= \sum_k\frac{\partial L}{\partial x_k}\frac{\partial x_k}{\partial A_{ij}} \end{align} $$
where $L$ is the loss (a function of $x$). Let $c_k = \frac{\partial L}{\partial x_k}$. Then
$$ \begin{align} \frac{\partial L}{\partial A_{ij}} &= \sum_k c_k\frac{\partial (A^{-1}b)_k}{\partial A_{ij}}\\ &= c^\top \frac{\partial A^{-1}b}{\partial A_{ij}}\\ &= c^\top \left(\frac{\partial A^{-1}}{\partial A_{ij}} b + A^{-1}\frac{\partial b}{\partial A_{ij}}\right) \\ &= c^\top \left(-A^{-1}\frac{\partial A}{\partial A_{ij}}A^{-1} b + A^{-1}\frac{\partial b}{\partial A_{ij}}\right) \\ &= c^\top \left(-A^{-1}1_{ij}x + A^{-1}\frac{\partial b}{\partial A_{ij}}\right) \\ &= c^\top A^{-1}\left(\frac{\partial b}{\partial A_{ij}} - 1_{ij}x\right) \end{align} $$ where $1_{ij}$ is the matrix whose $(i,j)$ entry is one and whose other entries are zero.
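As a sketch, the $1_{ij}x$ term in the last line can be written out in index form. This uses only the definition of $1_{ij}$ above; the Kronecker delta $\delta$ and the shorthand $A^{-\top} = (A^{-1})^\top$ are notation I'm introducing here for convenience.
$$ \begin{align} (1_{ij}x)_k &= \delta_{ki}\,x_j \\ c^\top A^{-1} 1_{ij} x &= \sum_{k,l} c_l\,(A^{-1})_{lk}\,\delta_{ki}\,x_j = \left(A^{-\top}c\right)_i x_j \end{align} $$
So, taken on its own, the $-1_{ij}x$ term contributes $-\left(A^{-\top}c\right)_i x_j$ to $\frac{\partial L}{\partial A_{ij}}$, which is the $(i,j)$ entry of the outer product $-A^{-\top}c\,x^\top$.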
What I'm missing is how to compute $\frac{\partial b}{\partial A_{ij}}$ during backprop, since we can't know it.
--edit--
For the sake of completeness, ChainRules.jl uses
$$
\partial A = -(A') ^{-1} c x'\\
\partial b = A' \backslash c
$$
where $'$ denotes the conjugate transpose.
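To see whether these two expressions are at least numerically plausible, here is a minimal finite-difference sketch. It is plain Python rather than Julia and does not call ChainRules.jl at all. The assumptions are mine: $A$ is real (so $'$ is an ordinary transpose), $b$ is treated as an independent input that does not change when an entry of $A$ is perturbed, and the loss is the arbitrary choice $L = \tfrac12\lVert x\rVert^2$, which gives $c = x$. The helper names (`solve`, `loss`, `analytic_grads`, `numeric_grads`) are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def transpose(A):
    return [list(row) for row in zip(*A)]


def loss(A, b):
    # Assumed loss: L = 0.5 * ||x||^2 with x = A \ b, so dL/dx = x.
    x = solve(A, b)
    return 0.5 * sum(v * v for v in x)


def analytic_grads(A, b):
    """Gradients from the formulas quoted above (real case)."""
    x = solve(A, b)
    c = x                                  # c_k = dL/dx_k for the assumed loss
    grad_b = solve(transpose(A), c)        # A' \ c
    grad_A = [[-g * xj for xj in x] for g in grad_b]  # -(A')^{-1} c x'
    return grad_A, grad_b


def numeric_grads(A, b, h=1e-6):
    """Central finite differences, perturbing one entry at a time."""
    n = len(b)
    grad_A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            Ap = [list(r) for r in A]
            Am = [list(r) for r in A]
            Ap[i][j] += h
            Am[i][j] -= h
            # b is held fixed while A_ij is perturbed.
            grad_A[i][j] = (loss(Ap, b) - loss(Am, b)) / (2 * h)
    grad_b = []
    for i in range(n):
        bp, bm = list(b), list(b)
        bp[i] += h
        bm[i] -= h
        grad_b.append((loss(A, bp) - loss(A, bm)) / (2 * h))
    return grad_A, grad_b


# A small non-symmetric, well-conditioned test case (arbitrary values).
A = [[4.0, 1.0, 2.0], [0.5, 3.0, 1.0], [1.0, -1.0, 5.0]]
b = [1.0, 2.0, 3.0]

gA, gb = analytic_grads(A, b)
nA, nb = numeric_grads(A, b)
err_A = max(abs(gA[i][j] - nA[i][j]) for i in range(3) for j in range(3))
err_b = max(abs(gb[i] - nb[i]) for i in range(3))
print("max |analytic - numeric| for A:", err_A)
print("max |analytic - numeric| for b:", err_b)
```

The matrix is deliberately non-symmetric so that using $A^{-1}$ in place of $(A')^{-1}$ would show up as a mismatch. Note that this check only covers the case where $b$ is held fixed while $A$ varies.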