
I am trying to calculate the following derivative, involving $X$ and $Y$ matrices:

$$ \frac{\partial}{\partial X}X^TY^TYX $$

I tried an approach similar to the one in "Vector derivation of $x^Tx$" and ended up with something like:

$$ X^TY^TYX + \mathbf{X^TY^TYZ + Z^TY^TYX + Z^TY^TYZ} $$

I don't really know if I can consider $Z=I$ as in the link above.

Is there an easy way of calculating this derivative?

Ideally something easier than computing the per-entry derivative of the final matrix and reconstructing it with matrix operations.

JC1
  • The resulting derivative is a fourth-rank tensor, so it is hard to represent using standard matrices. And no, $Z$ is not $I$. – N74 Jan 13 '16 at 08:57

2 Answers


I find it easiest to solve this kind of question explicitly in components. Let us adopt the Einstein summation convention: indices that appear twice are summed over. Then $$(X^T Y^T Y X)_{ij} = (X^T)_{ik} (Y^T)_{kl} Y_{lm} X_{mj} =X_{ki} Y_{lk} Y_{lm} X_{mj}. $$ The elementary rule for taking the derivative with respect to the matrix entry $X_{ab}$ is $$\frac{\partial}{\partial X_{ab}} X_{ij} = \delta_{ai} \delta_{bj}$$ with $\delta$ the Kronecker symbol.

So, by the product rule, we have $$\begin{aligned} \frac{\partial}{\partial X_{ab}} (X^T Y^T Y X)_{ij} &= \frac{\partial}{\partial X_{ab}} X_{ki} Y_{lk} Y_{lm} X_{mj} \\ &= \delta_{ak} \delta_{bi} Y_{lk} Y_{lm} X_{mj} + \delta_{am} \delta_{bj} X_{ki} Y_{lk} Y_{lm} \\ &= \delta_{bi} Y_{la} Y_{lm} X_{mj}+ \delta_{bj} X_{ki} Y_{lk} Y_{la}. \end{aligned}$$

Written more compactly, we have $$\frac{\partial}{\partial X_{ab}} (X^T Y^T Y X)_{ij} = (Y^T Y X)_{aj} \delta_{bi} + (X^T Y^T Y)_{ia} \delta_{bj}. $$
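As a sanity check, the component formula can be compared against finite differences. The following is only a minimal sketch in Python with NumPy; the matrix sizes, the random test matrices and the names `D`, `D_fd` and `max_err` are my own assumptions for illustration, not part of the derivation.

```python
import numpy as np

# Assumed sizes for illustration: X is n x p, Y is q x n.
rng = np.random.default_rng(0)
n, p, q = 4, 3, 5
X = rng.standard_normal((n, p))
Y = rng.standard_normal((q, n))
A = Y.T @ Y                      # n x n, symmetric


def f(X):
    """The map X -> X^T Y^T Y X (a p x p matrix)."""
    return X.T @ A @ X


# Closed form: D[a, b, i, j] = d f_ij / d X_ab
#   = (Y^T Y X)_{aj} delta_{bi} + (X^T Y^T Y)_{ia} delta_{bj}
I = np.eye(p)
D = (np.einsum("aj,bi->abij", A @ X, I)
     + np.einsum("ia,bj->abij", X.T @ A, I))

# Central finite differences, one entry X_ab at a time.
eps = 1e-6
D_fd = np.zeros_like(D)
for a in range(n):
    for b in range(p):
        E = np.zeros_like(X)
        E[a, b] = eps
        D_fd[a, b] = (f(X + E) - f(X - E)) / (2 * eps)

max_err = np.max(np.abs(D - D_fd))
print("largest deviation from finite differences:", max_err)
```

Because the map is quadratic in $X$, the central difference has no truncation error, so the deviation should be at the level of floating-point rounding. Note that `D` carries four indices, which is the fourth-rank tensor mentioned in the comment on the question.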

Fabian

Let $A = Y^TY$ and let $H$ be a "small" matrix. Then $$ (X+H)^TA(X+H) - X^TAX = (X^T+H^T)A(X+H) - X^TAX = (X^TAH + H^TAX) + H^TAH. $$ The part that is linear in $H$ is $X^TAH + H^TAX = X^TY^TYH + H^TY^TYX$, while the remaining term $H^TAH$ is quadratic in $H$. So the differential of $$X\mapsto X^TY^TYX$$ at $X = X_0$ is the linear map $$H\mapsto X_0^TY^TYH + H^TY^TYX_0.$$
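The claim that the leftover term is exactly $H^TAH$ is easy to test numerically. Below is a rough sketch in Python with NumPy, under assumed matrix sizes and random data; the helper names `f` and `differential` are hypothetical and only serve the illustration.

```python
import numpy as np

# Assumed sizes for illustration: X0 and H are n x p, Y is q x n.
rng = np.random.default_rng(1)
n, p, q = 4, 3, 5
X0 = rng.standard_normal((n, p))
Y = rng.standard_normal((q, n))
H = rng.standard_normal((n, p))
A = Y.T @ Y


def f(X):
    """The map X -> X^T A X with A = Y^T Y."""
    return X.T @ A @ X


def differential(X0, H):
    """The linear map H -> X0^T A H + H^T A X0."""
    return X0.T @ A @ H + H.T @ A @ X0


remainders = []
matches_quadratic_term = []
for t in (1e-1, 1e-2, 1e-3):
    tH = t * H
    # What is left after subtracting the linear part.
    r = f(X0 + tH) - f(X0) - differential(X0, tH)
    remainders.append(np.linalg.norm(r))
    # Compare the leftover with the quadratic term (tH)^T A (tH).
    matches_quadratic_term.append(
        np.allclose(r, tH.T @ A @ tH, rtol=1e-6, atol=1e-10)
    )

print(remainders)
print(matches_quadratic_term)
```

Shrinking $H$ by a factor of 10 should shrink the remainder by a factor of about 100, which is what makes $H\mapsto X_0^TAH + H^TAX_0$ the first-order part.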