
From the definition of the Fréchet derivative and the linearity of the transpose it’s clear that the derivative of the vector transpose is the vector transpose itself.

$$f(x + h) = f(x) + D(x)h + o(\|h\|)$$

$$(x + h)^T = x^T + h^T + 0 = x^T + h^T + o(\|h\|)$$

According to an answer to a related question, the vector transpose is a linear transformation whose matrix representation is the identity matrix $I$; this matrix is constant, so $D(x) = I$.

But $$(x+h)^T \ne x^T + Ih + o(\|h\|)$$

What am I missing?

  • You can get proper double norm bars by using \| instead of ||. – joriki Mar 18 '24 at 04:56
  • The derivative of a vector cannot be a vector! More importantly, are you taking the derivative of the vector transpose w.r.t. the vector transpose or w.r.t. the vector? By the way $(x+h)^T=x^T+Ih^T + o(\|h\|)$ – Ted Black Mar 18 '24 at 05:17
  • Derivative of the vector transpose wrt the vector. – Tomek Dobrzynski Mar 18 '24 at 11:39
  • For the derivative of the vector transpose w.r.t. the vector you are limited by using matrix notation. Using tensor notation, if $x^i$ denotes the transpose and $x_i$ denotes the vector, then $dx^i = \delta^{ij} dx_j$, and so the derivative of the vector transpose w.r.t. the vector is $\delta^{ij}$. This is the metric tensor in Euclidean space and has the unique property of mapping a vector to its transpose. – Ted Black Mar 18 '24 at 13:21

1 Answer


I think I’ve realized what’s going on. The notation makes sense when the vector $h$ and the matrix $I$ are written out in terms of basis vectors.

$$ h = \alpha_1 e_1 + \dots + \alpha_n e_n$$

The transpose takes $e_i$ to $e_i^T$, and the entries of the matrix representation of a linear transformation are the inner products between the target basis vectors and the images of the source basis vectors under the transformation, so $$ I = \begin{bmatrix} e_1^T \cdot e_1^T & \dots & e_1^T \cdot e_n^T \\ \vdots & \ddots & \vdots \\ e_n^T \cdot e_1^T & \dots & e_n^T \cdot e_n^T \end{bmatrix}$$

So, for the transpose $$ Ih = I(\alpha_1 e_1 + \dots + \alpha_n e_n) = \alpha_1 e_1^T + \dots + \alpha_n e_n^T = h^T$$
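This reading can be sanity-checked numerically (a small sketch assuming NumPy; the names `alphas`, `E`, and `hT` are illustrative, not from the question): expand $h$ in the standard basis, apply $I$ to the coefficients, and reassemble against the transposed basis vectors $e_i^T$ — the result is $h^T$.

```python
import numpy as np

n = 3
alphas = np.array([2.0, -1.0, 0.5])   # coefficients of h
E = np.eye(n)                         # columns are the basis vectors e_1, ..., e_n
h = E @ alphas[:, None]               # h = alpha_1 e_1 + ... + alpha_n e_n (column vector)
I = np.eye(n)                         # matrix of the transpose map in these bases
coeffs = I @ alphas                   # coefficients w.r.t. the target basis e_1^T, ..., e_n^T
hT = coeffs[None, :] @ E              # sum_i alpha_i e_i^T, assembled as a row vector
print(np.allclose(hT, h.T))           # the reassembled row vector equals h^T
```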

It would probably be less confusing to label $I$ as $I_{T}$ to stress that the action of this matrix transposes the basis vectors.
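With $I$ read this way, the Fréchet definition from the question checks out: the remainder $(x+h)^T - x^T - h^T$ is identically zero, which is certainly $o(\|h\|)$. A quick numerical sketch (assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 1))   # a column vector
h = rng.standard_normal((4, 1))   # an arbitrary perturbation

# f(x) = x^T; the candidate derivative at x is the linear map h -> h^T,
# so the Frechet remainder below should vanish identically.
remainder = (x + h).T - x.T - h.T
print(np.linalg.norm(remainder))  # zero up to floating-point rounding
```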