Derivative of transpose of a matrix

Question

Suppose there is a function that takes the transpose of a matrix, such as $g(x) = x^t$, where $x$ is a square matrix.

What would then be the derivative of the function, $\frac{dg}{dx}$?

I think I posted an answer to this same question here two or three years ago. $\qquad$ — Michael Hardy, Mar 07 '16 at 17:14
@AnotherTest : I don't think that answers it. The question seems to be about the derivative of the transposition operator itself. $\qquad$ — Michael Hardy, Mar 07 '16 at 17:25
Here's a similar question that I answered: http://math.stackexchange.com/questions/704773/what-is-the-derivative-of-a-vector-with-respect-to-its-transpose $\qquad$ — Michael Hardy, Mar 07 '16 at 17:26
The derivative of a (continuous) linear function is the function itself. Hence $Dg(x)(h) = g(h) = h^T$. — copper.hat, Mar 07 '16 at 17:42

greg · Answer 1 · 2022-11-06T15:34:34.637

$ \def\d{\delta} \def\o{{\tt1}}\def\p{\partial} \def\H{{\cal H}} \def\LR#1{\left(#1\right)} \def\BR#1{\Big(#1\Big)} \def\op#1{\operatorname{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\qiq{\quad\implies\quad} \def\qif{\quad\iff\quad} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\cas#1{\begin{cases} #1\end{cases}} \def\c#1{\color{red}{#1}} \def\CLR#1{\c{\LR{#1}}} \def\gradLR#1#2{\LR{\grad{#1}{#2}}} $
Define a fourth-order tensor $\H$ with components defined in terms of Kronecker deltas
$$\eqalign{ \H_{ijkl} &= \d_{il}\d_{jk} = \cas{ \o \qquad {\rm if}\:\LR{i=l}\:{\rm and}\:\LR{j=k} \\ 0 \qquad {\rm otherwise}\\ } \\ }$$
and consider its double contraction product with an arbitrary matrix $X$
$$\eqalign{ \sum_{k=1}^n\sum_{l=1}^n \H_{ijkl} X_{kl} \;=\; \sum_{k=1}^n\sum_{l=1}^n \d_{il} \d_{jk} X_{kl} \;=\; X_{ji} \\ }$$
This is often written without the Sigmas using a double-dot product
$$\eqalign{ \H:X = X^T \\ }$$
This is one way of writing your $g$ function: $\;\;G=g(X)=\H:X$
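The double contraction above can be checked numerically. The following is only a minimal sketch in plain Python, assuming a $3\times 3$ integer matrix; the helper names `transpose_tensor` and `double_dot` are made up for this illustration and do not come from any library.

```python
def transpose_tensor(n):
    """Hypothetical helper: H[i][j][k][l] = 1 if i == l and j == k, else 0."""
    return [[[[1 if (i == l and j == k) else 0
               for l in range(n)]
              for k in range(n)]
             for j in range(n)]
            for i in range(n)]


def double_dot(H, X):
    """Hypothetical helper: (H:X)[i][j] = sum over k, l of H[i][j][k][l] * X[k][l]."""
    n = len(X)
    return [[sum(H[i][j][k][l] * X[k][l]
                 for k in range(n) for l in range(n))
             for j in range(n)]
            for i in range(n)]


# Example matrix chosen arbitrarily for the check.
X = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

H = transpose_tensor(3)
G = double_dot(H, X)

# Reference transpose built directly from the rows of X.
X_T = [list(col) for col in zip(*X)]

print(G == X_T)  # True
```

The nested loops mirror the double sum term by term, so this is meant for reading rather than speed.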

Since $\H$ is constant the differential and gradient are easy to calculate $$\eqalign{ dG &= \H:dX \qif \c{\grad GX = \H} \\ }$$ The gradient is tensor-valued, which is expected for a matrix-by-matrix gradient.
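One way to sanity-check the claim that the gradient is the constant tensor is to perturb one entry of $X$ at a time and record how every entry of $G$ changes. The sketch below assumes a $3\times 3$ integer matrix and uses a unit step, which gives the exact derivative here only because $g$ is linear; the function name `jacobian_by_differences` is a hypothetical helper for this illustration.

```python
def g(X):
    """The map from the question: return the transpose of a square matrix."""
    return [list(col) for col in zip(*X)]


def jacobian_by_differences(f, X):
    """Hypothetical helper: J[i][j][k][l] = f(X + E_kl)[i][j] - f(X)[i][j].

    A unit step is used; for a linear f this difference is the exact derivative.
    """
    n = len(X)
    base = f(X)
    J = [[[[0] * n for _ in range(n)] for _ in range(n)] for _ in range(n)]
    for k in range(n):
        for l in range(n):
            bumped = [row[:] for row in X]
            bumped[k][l] += 1  # add the single-entry matrix E_kl
            out = f(bumped)
            for i in range(n):
                for j in range(n):
                    J[i][j][k][l] = out[i][j] - base[i][j]
    return J


n = 3
X = [[2, 7, 1],
     [8, 2, 8],
     [1, 8, 2]]  # arbitrary example values

J = jacobian_by_differences(g, X)

# The tensor from the answer: 1 where i == l and j == k, 0 elsewhere.
H = [[[[1 if (i == l and j == k) else 0 for l in range(n)]
       for k in range(n)]
      for j in range(n)]
     for i in range(n)]

print(J == H)  # True
```

The result does not depend on the values in `X`, which is what a constant gradient means.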

This result can also be written using index notation $$\eqalign{ \grad{G_{ij}}{X_{kl}} = \H_{ijkl} \qif \grad{X_{ji}}{X_{kl}} = \d_{jk}\d_{il} \\ }$$

An alternative to dealing with tensors is to compute a matrix-valued gradient with respect to a single component of $X$ $$\eqalign{ \grad G{X_{kl}} &= \H:\gradLR{X}{X_{kl}} = \H:\BR{E_{kl}} = E_{kl}^T = E_{lk} \\ }$$ where $E_{lk}$ is a matrix whose elements are all zero except for the $(l,k)$ element, which equals $\o$.
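As a concrete check of the component formula, take $n=2$ (a size assumed here only for illustration). Writing out $G=X^T$ and differentiating with respect to the single entry $X_{12}$ gives
$$G = \begin{pmatrix} X_{11} & X_{21} \\ X_{12} & X_{22} \end{pmatrix}, \qquad \frac{\partial G}{\partial X_{12}} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = E_{21} = E_{12}^T$$
which agrees with $E_{kl}^T = E_{lk}$ for $(k,l)=(1,2)$.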
