What is the derivative $$\frac{\partial\, x^TVx}{\partial V},$$ where $V$ is a matrix and $x$ is a vector? In general, what is the right way to calculate matrix derivatives w.r.t. other matrices or vectors?
Viewed 2,191 times
-1
-
Isn't $x^T V x$ a scalar? – GFauxPas Apr 01 '15 at 19:55
-
Yes, it is a single number. – Morteza Shahriari Nia Apr 01 '15 at 19:57
2 Answers
2
You want to differentiate a scalar quantity $x^TVx$ with respect to matrix $V$, so that the derivative will be a matrix with the same dimension as $V$.
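Now, $x^TVx$ is equal to $\operatorname{Trace}(Vxx^T)$, so using the standard result for the derivative of the trace of a matrix product, $\frac{\partial \operatorname{Trace}(VA)}{\partial V}=A^T$ (see page $3$ here), with the symmetric matrix $A=xx^T$, the result is $$\frac{\partial x^TVx}{\partial V}=\frac{\partial \operatorname{Trace}(Vxx^T)}{\partial V}=xx^T$$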
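As a sanity check on the formula above, here is a minimal NumPy sketch that compares $xx^T$ with a finite-difference estimate of the derivative. The function names `quadratic_form` and `numerical_gradient`, the dimension, the random seed, and the step size `h` are illustrative choices, not part of the original answer.

```python
import numpy as np

def quadratic_form(V, x):
    """Return the scalar x^T V x."""
    return x @ V @ x

def numerical_gradient(V, x, h=1e-6):
    """Central-difference estimate of d(x^T V x)/dV, one entry at a time."""
    grad = np.zeros_like(V)
    for i in range(V.shape[0]):
        for j in range(V.shape[1]):
            E = np.zeros_like(V)
            E[i, j] = h  # perturb only the (i, j) entry of V
            grad[i, j] = (quadratic_form(V + E, x) - quadratic_form(V - E, x)) / (2 * h)
    return grad

# Arbitrary test data (assumed for illustration).
rng = np.random.default_rng(0)
n = 4
x = rng.standard_normal(n)
V = rng.standard_normal((n, n))

trace_form = np.trace(V @ np.outer(x, x))  # Trace(V x x^T)
analytic = np.outer(x, x)                  # claimed derivative x x^T
numeric = numerical_gradient(V, x)

print(np.isclose(quadratic_form(V, x), trace_form))
print(np.allclose(numeric, analytic, atol=1e-6))
```

Because $x^TVx$ is linear in $V$, the central difference has no truncation error here, so the two matrices should agree up to floating-point rounding.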

Alijah Ahmed
- 11,609
-
-
The scalar product is $x^TVx=\sum_{i=1}^n\sum_{j=1}^nx_ix_jv_{i,j}$, where $n$ is the dimension of the vector and matrix. As for the matrix product, the $i$th diagonal element of $Vxx^T$ will be $\sum_{j=1}^nv_{i,j}x_ix_j$, so the trace, which is the sum of the $n$ diagonal elements, will be $\sum_{i=1}^n\sum_{j=1}^nv_{i,j}x_ix_j$, which is $x^TVx$. – Alijah Ahmed Apr 01 '15 at 20:34
-
In general, this is due to the "cyclic property of the trace": $Tr(AB) = Tr(BA)$ for any two matrices $A,B$ such that both $AB$ and $BA$ are well-defined. – Joshua Mundinger Apr 01 '15 at 22:28
-
OK, so I got the trace part. How did you know the result of the last step, the derivative of the trace of the product w.r.t. $V$? Is that a rule, or do you calculate it element by element and then observe the pattern? I'm trying to see whether these are well-known facts (like there is a lookup table for them) or whether you just derive them. – Morteza Shahriari Nia Apr 01 '15 at 23:47
-
@MortezaShahriariNia The derivative of the trace of a product w.r.t. $V$ is a rule that can easily be proven by calculating element by element and differentiating. In terms of matrix differentiation, these are well-known facts, and there are indeed tables that list these results. – Alijah Ahmed Apr 02 '15 at 08:55
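The element-by-element calculation mentioned in these comments can be sketched as follows, writing $v_{k,l}$ for the $(k,l)$ entry of $V$ (the indices $k,l$ are introduced here only for illustration). In the double sum for $x^TVx$, the only term that depends on $v_{k,l}$ is $x_kx_lv_{k,l}$, so $$\frac{\partial}{\partial v_{k,l}}\sum_{i=1}^n\sum_{j=1}^nx_ix_jv_{i,j}=x_kx_l=(xx^T)_{k,l}.$$ Collecting the $n^2$ partial derivatives into a matrix with the same shape as $V$ gives $xx^T$. This assumes the entries of $V$ are treated as independent variables, with no symmetry constraint imposed.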
1
Alternatively, recall that the derivative is the best approximating linear map. The map $$ V \mapsto x^{t}Vx $$ is itself linear in $V$, so we can expect to get back what we started with.
Consider a small perturbation $E$: $$ x^{t} (V+E) x = x^t V x + x^t E x. $$ So the best approximation for local behaviour around $V$ is the map $$ E \mapsto x^{t} E x. $$ Since $x^tEx=\sum_{i,j}x_ix_jE_{i,j}$, this linear map is represented by the matrix $xx^t$.
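The perturbation argument can also be checked numerically. The following is a minimal NumPy sketch under assumed test data (random `x`, `V`, `E`; the helper name `f` is illustrative): it confirms that the change in $x^tVx$ equals $x^tEx$, and that $x^tEx$ is the entrywise inner product of $xx^t$ with $E$.

```python
import numpy as np

# Arbitrary test data (assumed for illustration).
rng = np.random.default_rng(1)
n = 3
x = rng.standard_normal(n)
V = rng.standard_normal((n, n))
E = 1e-3 * rng.standard_normal((n, n))  # small perturbation of V

def f(M):
    """Return the scalar x^t M x for the fixed vector x."""
    return x @ M @ x

change = f(V + E) - f(V)                    # actual change in the function
linear_term = f(E)                          # predicted change x^t E x
inner_product = np.sum(np.outer(x, x) * E)  # sum_ij (x x^t)_ij E_ij

print(np.isclose(change, linear_term))
print(np.isclose(linear_term, inner_product))
```

Since the map is linear, the agreement holds for perturbations of any size, not only small ones; the scale `1e-3` is just a convenient choice.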

Mark Joshi
- 5,604