I generalize from this question that $\nabla_x(x^TA) = \nabla_x(A^Tx)=A^T$.
However, I'm having trouble with $\nabla_{x^T}(x^TA)$. What does it mean to take the gradient of a transpose of a vector?
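There are some issues with the formula you wrote: if $A$ is a matrix, $x^TA$ is a row vector while $A^Tx$ is a column vector, so the two are transposes of each other rather than the same object.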
Still, I don't want to be all negative, so here are a couple of properties of derivatives w.r.t. a vector.
Say you have two column vectors $x,y\in\mathbb{R}^{n}$ and a scalar function $f$. Then the derivative $\frac{\partial f}{\partial x}$ is a row vector, and the derivative $\frac{\partial f}{\partial x^{T}}$ is a column vector.
For the scalar $x^{T}y = y^{T}x$ you have $$\frac{\partial x^{T}}{\partial x}y = \frac{\partial x^{T}y}{\partial x} = \frac{\partial y^{T}x}{\partial x} = y^{T}\frac{\partial x}{\partial x} = y^{T}$$ $$y^{T}\frac{\partial x}{\partial x^{T}} = \frac{\partial y^{T}x}{\partial x^{T}} = \frac{\partial x^{T}y}{\partial x^{T}} = \frac{\partial x^{T}}{\partial x^{T}}y = y$$
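The scalar identities above can be sanity-checked numerically. The following is only a minimal sketch in plain Python: the helper names `dot` and `numerical_gradient` and the sample vectors are made up for illustration, and the check uses central differences. It confirms that the partial derivatives of $x^{T}y$ w.r.t. the components of $x$ are the components of $y$; whether you arrange them as the row $y^{T}$ or the column $y$ is purely the layout convention described above.

```python
# Finite-difference check that d(x^T y)/dx_i = y_i.
# All names and sample values here are illustrative assumptions.

def dot(u, v):
    """Inner product u^T v of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def numerical_gradient(f, x, h=1e-6):
    """Central-difference partial derivatives of a scalar function f at x."""
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

x = [1.0, -2.0, 0.5]
y = [3.0, 4.0, -1.0]

# Partial derivatives of the scalar x^T y with respect to each x_i.
grad = numerical_gradient(lambda v: dot(v, y), x)
```

Here `grad` holds the same numbers as `y` up to floating-point error; the list itself carries no row/column orientation, which is exactly the part that the notation $\partial/\partial x$ versus $\partial/\partial x^{T}$ fixes.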
For the derivative of a vector w.r.t. another vector, however, there are no nice general formulas beyond the obvious linear cases: $$\frac{\partial Ax}{\partial x} = A$$ $$\frac{\partial x^{T}A}{\partial x^{T}} = A$$
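The two linear cases can also be checked entry by entry. The sketch below is plain Python with hypothetical helpers (`matvec`, `vecmat`, `partials`) and an arbitrary $3\times 2$ matrix. It assumes the layout implied above: for $\partial (Ax)/\partial x$ the rows follow the components of $Ax$ and the columns follow the components of $x$, while for $\partial (x^{T}A)/\partial x^{T}$ the rows follow the components of $x$ and the columns follow the components of $x^{T}A$.

```python
# Finite-difference check of d(Ax)/dx = A and d(x^T A)/d(x^T) = A.
# Helper names, the matrix A, and the layout choices are illustrative assumptions.

def matvec(A, x):
    """Column vector A x, returned as a list."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def vecmat(x, A):
    """Row vector x^T A, returned as a list."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]

def partials(f, x, h=1e-6):
    """Return P with P[i][k] = d f_k / d x_i, using central differences."""
    P = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        fp, fm = f(xp), f(xm)
        P.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return P

A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 x 2

# d(x^T A)/d(x^T): x has 3 components; rows index x_i, columns index outputs.
x3 = [0.3, -1.2, 2.0]
D_row = partials(lambda v: vecmat(v, A), x3)  # 3 x 2

# d(Ax)/dx: x has 2 components; partials() indexes rows by x_j,
# so transpose to get rows indexed by the outputs (Ax)_i.
x2 = [0.7, -0.4]
P = partials(lambda v: matvec(A, v), x2)  # 2 x 3
D_col = [[P[j][i] for j in range(len(x2))] for i in range(len(A))]  # 3 x 2
```

Under these layout assumptions both `D_row` and `D_col` reproduce the entries of `A` up to rounding, since $(x^{T}A)_j=\sum_i x_iA_{ij}$ and $(Ax)_i=\sum_j A_{ij}x_j$ are linear in $x$.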