7

I've already looked at "Vector derivative w.r.t its transpose $\frac{d(Ax)}{d(x^T)}$", but I couldn't find a direct answer to my question there. What is the value of $$\frac{d}{dx} x^T\text{ ?}$$ My initial intuition is that it is $1$, but I'm not sure why that would be so.

3 Answers

17

What sort of object can be the derivative of a vector-valued function whose values are row vectors and whose arguments are column vectors? Generally, what kind of object can be the derivative of a function whose values are members of one vector space $W$ and whose arguments are members of another vector space $V$?

$$ f: V\to W $$

The answer is that the value of such a derivative at any point in $V$ is a linear transformation from $V$ into $W$, and it may be a different linear transformation at each point in $V$. But if $f$ is itself linear, then it's the same linear transformation at each point in $V$: it's $f$ itself.

Transposition is linear. Therefore the value of its derivative at each point in its domain is itself.

Often one represents a linear transformation by a matrix. What would be the matrix in this case? No matter what basis you pick for the domain $V$, it seems natural to pick as a basis of $W$ the set of transposes of the basis vectors you chose for $V$. In that case, the matrix would be the identity matrix.
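As a rough numerical illustration of the claim above, the following sketch estimates the directional derivative of transposition by a forward difference and compares it with the transposed direction. The helper names `transpose` and `directional_derivative`, the step size `h`, and the sample vectors are assumptions made for this example only; column vectors are modelled as nested lists.

```python
def transpose(col):
    """Map a column vector [[a], [b], ...] to the row vector [[a, b, ...]]."""
    return [[entry[0] for entry in col]]


def directional_derivative(f, x, k, h=1e-6):
    """Forward-difference estimate of (f(x + h*k) - f(x)) / h.

    x and k are column vectors; f is assumed to return a row vector.
    """
    shifted = [[xi[0] + h * ki[0]] for xi, ki in zip(x, k)]
    fx, fs = f(x), f(shifted)
    return [[(s - a) / h for s, a in zip(fs[0], fx[0])]]


# Hypothetical sample data: two different base points, one direction.
x = [[1.0], [2.0], [3.0]]
x_elsewhere = [[-7.0], [0.25], [100.0]]
k = [[0.5], [-1.0], [4.0]]

estimate = directional_derivative(transpose, x, k)
estimate_elsewhere = directional_derivative(transpose, x_elsewhere, k)

print(estimate)            # approximately k transposed
print(estimate_elsewhere)  # approximately the same row vector
```

Up to floating-point error, both estimates agree with $k^T$ regardless of the base point, which is what one expects when the derivative of a linear map is the map itself.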

6

The answer is relatively straightforward. Before going into the details, here it is: $I_{N\times N}$.

The derivative of a vector-valued function with respect to a vector is called the Jacobian $\textbf{J}$, whose entries are \begin{equation} \textbf{J}_{ij}:=\left(\frac{d \textbf{f}(\textbf{x})}{d\textbf{x}}\right)_{ij} = \frac{d\textbf{f}_i(\textbf{x})}{d\textbf{x}_j}. \end{equation}

For the question of interest, $\textbf{f}(\textbf{x})=\textbf{x}^T$, whose components are $\textbf{f}_i(\textbf{x})=\textbf{x}_i$. The remaining task is to use the property \begin{equation} \frac{d\textbf{x}_i}{d\textbf{x}_j}=\delta_{ij}, \end{equation} and the $\delta_{ij}$ are exactly the entries of $I_{N\times N}$.
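The entrywise definition of the Jacobian can be checked numerically. The sketch below builds a finite-difference Jacobian for the component map of $\textbf{x}\mapsto\textbf{x}^T$; the function names `jacobian` and `transpose_components`, the step size, and the sample point are illustrative assumptions, and vectors are stored as flat lists of components.

```python
def jacobian(f, x, h=1e-6):
    """Finite-difference Jacobian: J[i][j] approximates d f_i / d x_j at x."""
    fx = f(x)
    rows, cols = len(fx), len(x)
    J = [[0.0] * cols for _ in range(rows)]
    for j in range(cols):
        shifted = list(x)
        shifted[j] += h  # perturb only the j-th coordinate
        fs = f(shifted)
        for i in range(rows):
            J[i][j] = (fs[i] - fx[i]) / h
    return J


def transpose_components(x):
    """Components of x^T: the same numbers as the components of x."""
    return list(x)


# Hypothetical sample point in R^3.
J = jacobian(transpose_components, [2.0, -1.0, 0.5])
for row in J:
    print([round(v, 6) for v in row])
```

Up to rounding, the printed rows are those of the $3\times 3$ identity matrix, matching $\delta_{ij}$.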

-3

That depends on how you define the vector derivative. There are generally two ways. One is to apply abstract index notation; then $$\frac{d}{dx}x^T=\left(\frac{dx_i}{dx^j}\right)=(\delta_{ij})=(e_1\otimes\cdots\otimes e_n)^T$$ where the $e_i$ are unit vectors whose $i$th component is one and whose other components are zero.

Another way to look at it is to regard it as a directional derivative; then $$\frac{d}{dx}x^T=\lim_{h\to0}\frac{(x+hx)^T-x^T}{h}=x^T$$

Shuchang
  • 1
    Who gave a point to this answer? @Shuchang, your first answer is false: indeed $e_1\otimes \cdots \otimes e_n$ is a vector in $\mathbb{C}^{n^n}$. In fact $\delta_{i,j}=I_n$; cf. the last line of Michael's answer. Your second answer is false because you consider only the derivative in the direction of $x$. You should write, for every vector $y$, $\lim\limits_{h\to0}\dfrac{(x+hy)^T-x^T}{h}=y^T$. –  Mar 11 '14 at 20:30
  • @loupblanc 1. $\delta_{ij}$ is not the identity, otherwise it is expected to have $x=Ix=(\delta_{ij}x^j)=(x_i)=x^T$. Hardy's answer gives another explanation of vector derivative. 2. Vector $y$ is not specified and we can only consider that the derivative is taken along $x$ direction. – Shuchang Mar 12 '14 at 01:02
  • Firstly, let $f:x\in\mathbb{R}^n \rightarrow x^T$. Then $D_xf:k\in\mathbb{R}^n\rightarrow k^T$. We choose as bases the canonical basis $e_1,\cdots,e_n$ and its transposes $e_1^T,\cdots,e_n^T$. The matrix associated with $D_xf$ is $I_n$. Secondly, your so-called "not specified $y$" plays the role of the vector $k$ above. Obviously $k$ is not necessarily parallel to $x$. –  Mar 13 '14 at 02:28
  • EDIT: of course $Ik=k^T$ because, by definition, $I(\sum k_ie_i)=\sum k_ie_i^T$. –  Mar 13 '14 at 02:40