11

Could someone please explain how to take the derivative of a matrix with respect to itself?

$$\frac{\partial \textbf{X}}{\partial \textbf{X}}$$

where $\textbf{X}$ is an $M \times N$ matrix.

3 Answers

6

$\def\p#1#2{\frac{\partial #1}{\partial #2}}$A matrix/matrix gradient produces a 4th order tensor, which is easily evaluated in index notation $$\eqalign{ {\cal E} &= \p{X}{X} \quad\implies\quad {\cal E}_{ijk\ell} &= \p{X_{ij}}{X_{k\ell}} &= \delta_{ik}\delta_{j\ell} \\ }$$ where the Kronecker delta symbol is defined as $$\eqalign{ \delta_{ik} &= \begin{cases} {\tt1}\quad {\rm if}\;i=k \\ 0\quad {\rm otherwise} \end{cases} }$$ In words:   If $X_{ij}$ and $X_{k\ell}$ refer to different elements then the derivative is $0$, otherwise it's $\tt1$.

This is analogous to the vector/vector derivative, which produces the identity matrix $$\eqalign{ I &= \p{x}{x} \quad\implies\quad {I}_{ij} &= \p{x_i}{x_j} &= \delta_{ij} \\ }$$
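As a quick numerical illustration of the index formula, here is a minimal NumPy sketch. The sizes `M = 2`, `N = 3` and the helper name `matrix_self_derivative` are assumptions chosen for the example, not part of the answer above. It builds the tensor ${\cal E}_{ijk\ell} = \delta_{ik}\delta_{j\ell}$ as an outer product of two identity matrices.

```python
import numpy as np

def matrix_self_derivative(M, N):
    """Return E with E[i, j, k, l] = delta_ik * delta_jl, shape (M, N, M, N)."""
    # np.eye(M)[i, k] is delta_ik and np.eye(N)[j, l] is delta_jl;
    # einsum multiplies them entrywise into a fourth-order array.
    return np.einsum("ik,jl->ijkl", np.eye(M), np.eye(N))

M, N = 2, 3  # illustrative sizes (assumed)
E = matrix_self_derivative(M, N)

# Same element (i, j) == (k, l): derivative is 1
same_element = E[0, 1, 0, 1]
# Different elements: derivative is 0
different_element = E[0, 1, 1, 1]

# Flattening the index pairs (i, j) and (k, l) gives the MN x MN identity,
# which mirrors the vector/vector case.
E_flat = E.reshape(M * N, M * N)
```

Flattening works here because both index pairs are flattened in the same (row-major) order, so matching pairs land on the diagonal.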

greg
2

The underlying mapping is $$ f(X)=X, $$ the identity mapping on the vector space $V$ of all $M \times N$ matrices. It is linear, hence its derivative at $X$ in direction $\delta X$ is $$ f'(X)\delta X=\delta X, $$ that is, $$ f'(X) = f. $$ Note that both $f$ and $f'(X)$ are linear mappings from $V$ to $V$. The mapping $f'$ is a mapping from $V$ to $L(V,V)$.
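The directional-derivative statement can be checked numerically. The following is a rough sketch, assuming NumPy, arbitrary sizes `M = 2`, `N = 3`, a fixed random seed, and a step size `t` picked for illustration. It compares a finite-difference quotient of $f(X)=X$ with $\delta X$, and also applies the identity map on $V$, written as a four-index array, to $\delta X$.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen only for reproducibility
M, N = 2, 3                      # illustrative sizes (assumed)
X = rng.standard_normal((M, N))
dX = rng.standard_normal((M, N))  # the direction, delta X

def f(X):
    # The identity mapping on the space of M x N matrices
    return X

# Finite-difference approximation of the derivative at X in direction dX
t = 1e-6
finite_difference = (f(X + t * dX) - f(X)) / t

# The derivative as a linear map V -> V, stored as a fourth-order array
# with entries delta_ik * delta_jl, applied to dX by contracting (k, l).
E = np.einsum("ik,jl->ijkl", np.eye(M), np.eye(N))
applied = np.einsum("ijkl,kl->ij", E, dX)

fd_matches = np.allclose(finite_difference, dX, atol=1e-6)
map_matches = np.allclose(applied, dX)
```

Because $f$ is linear, the finite-difference quotient agrees with $\delta X$ up to floating-point rounding for any step size, not only in the limit $t \to 0$.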

daw
0

The answer is a lot easier than the previous posters are indicating. Since $\frac{d}{dX}(A*X)=A$, define $A=I_m$, where $m$ is the number of rows of $X$. Because $I_m*X=X$ is an identity, you get your derivative $= I_m$.

Keith