How can I find the gradient
$$\nabla_{u} \left(x^T \left(A \, \mbox{diag}(u)\, A^T \right)^{-1} x \right)$$
where $x \in \mathbb{R}^n$ and $u \in \mathbb{R}^d$ are vectors and $A \in \mathbb{R}^{n \times d}$ is a matrix?
I referred to The Matrix Cookbook, but I cannot find a standard formula there for this expression, and I also don't know how to apply a matrix chain rule to compute the derivative in multiple steps. I'd appreciate any pointers.
What I tried so far
Let $s = x^T M^{-1} x$, where $M = A \, \mbox{diag}(u)\, A^T$. Then $$\frac{\partial s}{\partial u} = \frac{\partial s}{\partial M^{-1}} \cdot \frac{\partial M^{-1}}{\partial M} \cdot \frac{\partial M}{\partial u}$$
From The Matrix Cookbook, I get $\frac{\partial s}{\partial M^{-1}} = xx^T$ and $\frac{\partial M^{-1}}{\partial M} = -M^{-1}M^{-1}$, but I don't know how to get $\frac{\partial M}{\partial u}$, or how exactly to combine the above expressions. Do I simply multiply them as matrices?
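One way to test any candidate formula is to compare it against finite differences. The sketch below does this for one candidate, obtained by working with differentials rather than chained partial derivatives: $dM = A \, \mbox{diag}(du)\, A^T$ and $d(M^{-1}) = -M^{-1}\, dM\, M^{-1}$ give $ds = -x^T M^{-1} A \, \mbox{diag}(du)\, A^T M^{-1} x$, which suggests $\nabla_u s = -w \circ w$ with $w = A^T M^{-1} x$ (elementwise square). The function names, the random test data, and the assumptions $n \le d$ and $u > 0$ (so that $M$ is invertible) are illustrative choices, not part of the original problem.

```python
import numpy as np

def quad_form(u, A, x):
    """s(u) = x^T (A diag(u) A^T)^{-1} x."""
    M = (A * u) @ A.T                  # A diag(u) A^T: column j of A scaled by u[j]
    return x @ np.linalg.solve(M, x)   # x^T M^{-1} x without forming the inverse

def candidate_grad(u, A, x):
    """Candidate gradient -(A^T M^{-1} x)**2, taken elementwise."""
    M = (A * u) @ A.T
    w = A.T @ np.linalg.solve(M, x)    # w = A^T M^{-1} x, shape (d,)
    return -w**2

def finite_diff_grad(u, A, x, eps=1e-6):
    """Central finite-difference approximation of the gradient of s w.r.t. u."""
    g = np.zeros_like(u)
    for i in range(u.size):
        e = np.zeros_like(u)
        e[i] = eps
        g[i] = (quad_form(u + e, A, x) - quad_form(u - e, A, x)) / (2 * eps)
    return g

# Hypothetical test data: n <= d and u > 0 keep M = A diag(u) A^T invertible.
rng = np.random.default_rng(0)
n, d = 3, 5
A = rng.standard_normal((n, d))
x = rng.standard_normal(n)
u = rng.uniform(0.5, 2.0, size=d)

print(candidate_grad(u, A, x))
print(finite_diff_grad(u, A, x))
```

If the two printed vectors agree to several digits, the candidate is at least consistent with the numerics at that point. This is a sanity check, not a proof.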