Prove that $\nabla_X \operatorname{tr}(X^TAX) = (A+A^T)X$, where $A \in \mathbb{C}^{m \times m}$ and $X \in \mathbb{R}^{m \times n}$.

1.) Does the same proof hold whether $A \in \mathbb{C}^{m \times m}$ or $A \in \mathbb{R}^{m \times m}$?

2.) What is the simplest way to prove this?

Wanderer
  • @Alex M. yes!! sorry! – Wanderer Apr 04 '15 at 20:28
  • Also, question 1 is not clear at all. Could you reformulate it, please? – Alex M. Apr 04 '15 at 20:29
  • @Alex M. Better now? – Wanderer Apr 04 '15 at 20:42
  • You've got a trace on the left hand side that is absent in the right hand side. As it is right now, the statement is false. – Alex M. Apr 04 '15 at 20:54
  • I am not sure, but there is also a proof here: http://math.stackexchange.com/questions/482742/how-to-calculate-gradient-of-xtax

    The only reason I re-asked is that in my problem $A$ is complex, not real.

    – Wanderer Apr 04 '15 at 21:00
  • It doesn't matter whether it's complex or not; the reasoning and formulae do not change. I was saying that you've got a $tr$ on the left-hand side; this means that the left-hand side must be a number, while the right-hand side is a matrix. Either remove the $tr$ or add it on the right-hand side. Whichever you do, you'll get a correct result (but the two results will have different meanings: one will be about traces, the other about products of matrices). – Alex M. Apr 04 '15 at 21:30
  • I am confused. So the derivative of a trace cannot be a matrix? There are so many identities like that! – Wanderer Apr 04 '15 at 21:58

2 Answers

Perhaps this example, with $m = 2$ and $n = 3$, can help:

$A_{m,m} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22}\end{bmatrix}$, $X_{m,n} = \begin{bmatrix} x_{11} & x_{12} & x_{13}\\ x_{21} & x_{22} & x_{23}\end{bmatrix}$

$tr(X^TAX) = (x_{11}a_{11}+x_{21}a_{21})x_{11}+(x_{11}a_{12}+x_{21}a_{22})x_{21}+(x_{12}a_{11}+x_{22}a_{21})x_{12}+(x_{12}a_{12}+x_{22}a_{22})x_{22}+(x_{13}a_{11}+x_{23}a_{21})x_{13}+(x_{13}a_{12}+x_{23}a_{22})x_{23}$

Taking the partial derivative with respect to each $x_{ij}$:

$\nabla_X tr(X^TAX) = \begin{bmatrix} 2x_{11}a_{11}+(a_{12}+a_{21})x_{21} & 2x_{12}a_{11}+(a_{12}+a_{21})x_{22} & 2x_{13}a_{11}+(a_{12}+a_{21})x_{23}\\ 2x_{21}a_{22} +(a_{12}+a_{21})x_{11} & 2x_{22}a_{22} + (a_{12}+a_{21})x_{12} & 2x_{23}a_{22} + (a_{12}+a_{21})x_{13} \end{bmatrix}$

Now compute the right-hand side, $(A + A^T)X$:

$A + A^T = \begin{bmatrix} 2a_{11} & a_{12}+a_{21} \\ a_{12}+a_{21} & 2a_{22}\end{bmatrix}$

$(A + A^T)X = \begin{bmatrix} 2x_{11}a_{11}+(a_{12}+a_{21})x_{21} & 2x_{12}a_{11}+(a_{12}+a_{21})x_{22} & 2x_{13}a_{11}+(a_{12}+a_{21})x_{23}\\ 2x_{21}a_{22} +(a_{12}+a_{21})x_{11} & 2x_{22}a_{22} + (a_{12}+a_{21})x_{12} & 2x_{23}a_{22} + (a_{12}+a_{21})x_{13} \end{bmatrix}$
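The matrix of partial derivatives and $(A + A^T)X$ agree entry by entry. The same comparison can be done numerically; the following is only a rough sketch, assuming NumPy is available. The helper names `f` and `numerical_gradient`, the random test matrices, and the step size `h` are illustrative choices, not part of the problem. It uses a complex $A$ and a real $X$, and the plain transpose `A.T` (no conjugation).

```python
import numpy as np

def f(A, X):
    # f(X) = tr(X^T A X); X.T is the plain transpose, no conjugation
    return np.trace(X.T @ A @ X)

def numerical_gradient(A, X, h=1e-6):
    # Central differences with respect to each real entry x_ij
    grad = np.zeros(X.shape, dtype=complex)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            grad[i, j] = (f(A, X + E) - f(A, X - E)) / (2 * h)
    return grad

rng = np.random.default_rng(0)
m, n = 2, 3
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))  # complex A
X = rng.standard_normal((m, n))                                     # real X

analytic = (A + A.T) @ X              # claimed gradient
numeric = numerical_gradient(A, X)    # finite-difference estimate
max_error = np.max(np.abs(analytic - numeric))
print(max_error)
```

Since $f$ is quadratic in $X$, the central difference should match the formula up to rounding error, so `max_error` is expected to be tiny.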

guille_NP

As a first step, let's express the function in terms of the Frobenius product, $A:B = {\rm tr}(A^TB)$, and find its differential
$$\eqalign { f &= {\rm tr}(X^TAX) \cr &= AX:X \cr &= A:XX^T \cr \cr df &= A:(dX\,X^T + X\,dX^T) \cr &= (A:dX\,X^T) + (A:X\,dX^T) \cr &= (AX:dX) + (X^TA:dX^T) \cr &= (AX:dX) + (A^TX:dX) \cr &= (A + A^T)X:dX \cr }$$
Since $df = \frac {\partial f} {\partial X}:dX$, you can identify the derivative from the last line as
$$ \eqalign { \frac {\partial f} {\partial X} &= (A + A^T)X \cr }$$
It doesn't matter whether the elements of $A$ are complex or real, since the derivation above only makes use of transpositions, not Hermitian conjugations.
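As a cross-check, the same result can be sketched in index notation. Here $\delta$ is the Kronecker delta and $p, q$ are hypothetical labels for the entry of $X$ being differentiated; neither appears in the question.
$$\eqalign { f &= \sum_{i,j,k} x_{ij}\,a_{ik}\,x_{kj} \cr \frac {\partial f} {\partial x_{pq}} &= \sum_{i,j,k} \left( \delta_{ip}\delta_{jq}\,a_{ik}\,x_{kj} + x_{ij}\,a_{ik}\,\delta_{kp}\delta_{jq} \right) \cr &= \sum_{k} a_{pk}\,x_{kq} + \sum_{i} a_{ip}\,x_{iq} \cr &= (AX)_{pq} + (A^TX)_{pq} \cr }$$
Only the product rule and plain transposes are used, so nothing here depends on whether the $a_{ik}$ are real or complex.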

greg