Is there a formula for $\frac{d \operatorname{tr}(XAX)}{dX}$ when $X$ is symmetric?


I understand that when $X$ is not symmetric, we have $$\frac{d \operatorname{tr}(XAX^T)}{dX} = XA^T+XA,$$ and when $X$ is symmetric, we have $$\frac{d \operatorname{tr}(AX)}{dX} = A+A^T-A\circ I,$$ where $\circ$ denotes the elementwise (Hadamard) product and $I$ is the identity matrix.

However, I have not been able to find $\frac{d \operatorname{tr}(XAX)}{dX}$ when $X$ is constrained to be symmetric. Any hint on how to derive it would be appreciated.

Tan
  • 621
  • Your title and question body are inconsistent: the title mentions the trace, but there is no trace anywhere in your formulas. – user7530 Sep 09 '21 at 19:09
  • @user7530 Thank you! – Tan Sep 09 '21 at 19:10
  • This post might be helpful. In particular, several of the answers and comments refer to a paper by Panda which is worth reading. – greg Sep 09 '21 at 22:25
  • If you've read the Panda paper, then you know that you can take the gradient for the non-symmetric $\big(X\ne X^T\big)$ case, i.e. $G=XA^T+XA$, and symmetrize it to obtain the gradient for the symmetric $\big(X=X^T\big)$ case: $G = \frac 12\big(XA^T+XA + AX+A^TX \big)$ – greg Sep 10 '21 at 22:02

1 Answer

For matrix derivatives I find it far easier to work with differentials. Write $f(X) = \operatorname{tr}(XAX)$ and let $\delta X$ be an arbitrary variation of $X$. By the product rule and linearity of the trace, $$df(X)\,\delta X = \operatorname{tr}(\delta X\, A X + XA\,\delta X).$$ Invariance of the trace under cyclic permutations lets us rewrite this as $$df(X)\,\delta X = \operatorname{tr}(A X\, \delta X + X A\, \delta X) = \operatorname{tr}\big((AX + XA)\,\delta X\big).$$ Finally, $\operatorname{tr}(AB) = A^T : B$, where $:$ is the Frobenius product, and $(AX+XA)^T = X^TA^T + A^TX^T$, which equals $XA^T + A^TX$ when $X$ is symmetric. Hence $$df(X)\,\delta X = (XA^T + A^TX) : \delta X.$$

The notion of a "derivative" of a function with respect to a matrix does not have an entirely standard definition, but if we take it to mean "the matrix whose Frobenius product with $\delta X$ gives the directional derivative in the $\delta X$ direction" (which I assume is the case from your examples in the OP), then we have our answer: the derivative of $f$ is $$XA^T + A^TX.$$
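As a sanity check on the derivation above, here is a minimal NumPy sketch that compares $XA^T + A^TX$ with a central-difference gradient of $\operatorname{tr}(XAX)$ at a symmetric point. The helper names (`f`, `grad_formula`, `grad_numeric`), the matrix size, the seed, and the step `h` are illustrative choices, not part of the original answer. Every entry of $X$ is perturbed independently, which matches the "arbitrary variation $\delta X$" convention used in the derivation.

```python
import numpy as np

def f(X, A):
    """f(X) = tr(X A X)."""
    return np.trace(X @ A @ X)

def grad_formula(X, A):
    """Closed form X A^T + A^T X from the derivation (valid at symmetric X)."""
    return X @ A.T + A.T @ X

def grad_numeric(X, A, h=1e-5):
    """Central differences, perturbing each entry of X independently."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            G[i, j] = (f(X + E, A) - f(X - E, A)) / (2 * h)
    return G

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
n = 4                            # arbitrary size
A = rng.standard_normal((n, n))  # A is a general (non-symmetric) matrix
B = rng.standard_normal((n, n))
X = (B + B.T) / 2                # symmetric evaluation point

# Largest entrywise disagreement between the formula and finite differences.
max_err = np.max(np.abs(grad_formula(X, A) - grad_numeric(X, A)))
print(max_err)
```

Because $f$ is quadratic in $X$, central differences have no truncation error here, so the printed discrepancy should be at the level of floating-point rounding.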

Important but subtle note: this is the derivative of $\operatorname{tr}(XAX)$ at a symmetric matrix $X$. It is not the same as the derivative of $\operatorname{tr}(X^TAX)$ (even if $X$ is assumed symmetric), although the two functions have the same directional derivative in symmetric directions.
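The note above can also be checked numerically. The sketch below, under the same illustrative assumptions (random matrices, arbitrary seed and size, hypothetical helper `frob`), uses the standard gradient $AX + A^TX$ for $\operatorname{tr}(X^TAX)$ and compares it with $XA^T + A^TX$: the two matrices differ, yet their Frobenius products with a symmetric direction coincide.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed
n = 4                            # arbitrary size
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))
X = (B + B.T) / 2                # symmetric evaluation point
S = (C + C.T) / 2                # symmetric direction
N = rng.standard_normal((n, n))  # general (non-symmetric) direction

grad_f = X @ A.T + A.T @ X       # gradient of tr(X A X) at symmetric X
grad_g = A @ X + A.T @ X         # gradient of tr(X^T A X)

def frob(P, Q):
    """Frobenius product P : Q = sum_ij P_ij Q_ij."""
    return np.sum(P * Q)

# The gradients themselves are different matrices...
matrices_differ = not np.allclose(grad_f, grad_g)
# ...but they give the same directional derivative along symmetric S,
sym_gap = abs(frob(grad_f, S) - frob(grad_g, S))
# and generally different ones along a non-symmetric N.
gen_gap = abs(frob(grad_f, N) - frob(grad_g, N))

print(matrices_differ, sym_gap, gen_gap)
```

The reason is that the difference of the two gradients, $XA^T - AX$, is antisymmetric when $X$ is symmetric, and the Frobenius product of an antisymmetric matrix with a symmetric one is zero.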

user7530
  • 49,280
  • So it has the same solution as $\frac{d\operatorname{tr}(XXA)}{dX}$ when $X$ is not assumed symmetric in the derivation? That solution is $\frac{d\operatorname{tr}(XXA)}{dX} = A^TX^T + X^TA^T$. – Tan Sep 10 '21 at 15:55
  • @tan trace is invariant under cyclic permutations, so yes. – user7530 Sep 10 '21 at 21:22