
From the definition of the Fréchet derivative and the linearity of the transpose it’s clear that the derivative of the vector transpose is the vector transpose itself.

$$f(x + h) = f(x) + D(x)h + o(\|h\|)$$

$$(x + h)^T = x^T + h^T + 0 = x^T + h^T + o(\|h\|)$$

According to an answer to a related question, the vector transpose is a linear transformation whose matrix representation is the identity matrix $I$; this matrix is constant, so $D(x) = I$.

But $$(x+h)^T \ne x^T + Ih + o(\|h\|)$$

What am I missing?

  • You can get proper double norm bars by using \| instead of ||. – joriki Mar 18 '24 at 04:56
  • The derivative of a vector cannot be a vector! More importantly, are you taking the derivative of the vector transpose w.r.t. the vector transpose or w.r.t. the vector? By the way $(x+h)^T=x^T+Ih^T + o(\|h\|)$ – Ted Black Mar 18 '24 at 05:17
  • Derivative of the vector transpose wrt the vector. – Tomek Dobrzynski Mar 18 '24 at 11:39
  • For the derivative of the vector transpose w.r.t. the vector you are limited by using matrix notation. Using tensor notation, if $x^i$ denotes the transpose and $x_i$ denotes the vector, then $dx^i = \delta^{ij} dx_j$, and so the derivative of the vector transpose w.r.t. the vector is $\delta^{ij}$. This is the metric tensor in Euclidean space and has the unique property of mapping a vector to its transpose. – Ted Black Mar 18 '24 at 13:21

1 Answer


I think I’ve realized what’s going on. The notation makes sense when the vector $h$ and the matrix $I$ are written out in terms of basis vectors.

$$ h = \alpha_1 e_1 + \dots + \alpha_n e_n$$

The transpose takes $e_i$ to $e_i^T$, and the entries of the matrix representation of a linear transformation are the inner products between the target basis vectors and the images of the source basis vectors under the transformation, so $$ I = \begin{bmatrix} e_1^T \cdot e_1^T & \dots & e_1^T \cdot e_n^T \\ \vdots & \ddots & \vdots \\ e_n^T \cdot e_1^T & \dots & e_n^T \cdot e_n^T \end{bmatrix}$$

So, for the transpose $$ Ih = I(\alpha_1 e_1 + \dots + \alpha_n e_n) = \alpha_1 e_1^T + \dots + \alpha_n e_n^T = h^T$$
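This reading can be sanity-checked numerically (a small sketch assuming NumPy; the names `alphas`, `E`, and `hT` are illustrative, not from the question): expand $h$ in the standard basis, apply $I$ to the coefficients, and reassemble against the transposed basis vectors $e_i^T$ — the result is $h^T$.

```python
import numpy as np

n = 3
alphas = np.array([2.0, -1.0, 0.5])   # coefficients of h
E = np.eye(n)                         # columns are the basis vectors e_1, ..., e_n
h = E @ alphas[:, None]               # h = alpha_1 e_1 + ... + alpha_n e_n (column vector)
I = np.eye(n)                         # matrix of the transpose map in these bases
coeffs = I @ alphas                   # coefficients w.r.t. the target basis e_1^T, ..., e_n^T
hT = coeffs[None, :] @ E              # sum_i alpha_i e_i^T, assembled as a row vector
print(np.allclose(hT, h.T))           # the reassembled row vector equals h^T
```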

It would probably be less confusing to label $I$ as $I_{T}$ to stress that the action of this matrix transposes the basis vectors.
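With $I$ read this way, the Fréchet definition from the question checks out: the remainder $(x+h)^T - x^T - h^T$ is identically zero, which is certainly $o(\|h\|)$. A quick numerical sketch (assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 1))   # a column vector
h = rng.standard_normal((4, 1))   # an arbitrary perturbation

# f(x) = x^T; the candidate derivative at x is the linear map h -> h^T,
# so the Frechet remainder below should vanish identically.
remainder = (x + h).T - x.T - h.T
print(np.linalg.norm(remainder))  # zero up to floating-point rounding
```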