I'm having trouble understanding this: let $\phi(x)=x^{T}Ax$, where $A$ is a matrix of constants. Then the differential of $\phi$ is $\mathrm{d} \phi=(\mathrm{d} x)^{T} A x+x^{T} A \mathrm{d} x=x^{T} A^{T} \mathrm{d} x+x^{T} A \mathrm{d} x=x^{T}\left(A+A^{T}\right) \mathrm{d} x$.
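To convince myself that the final formula is at least plausible, I tried a small check in plain Python. This is only a sketch: the matrix `A`, the point `x`, and the increment `h` (standing in for $\mathrm{d}x$) are made-up values, and the helper functions are my own. It compares the exact change $\phi(x+h)-\phi(x)$ with the linear part $x^{T}(A+A^{T})h$ plus the second-order remainder $h^{T}Ah$.

```python
def matvec(M, v):
    # Matrix-vector product for a matrix stored as a list of rows.
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

def dot(u, v):
    # Inner product u^T v of two vectors.
    return sum(ui * vi for ui, vi in zip(u, v))

def phi(A, x):
    # Quadratic form phi(x) = x^T A x.
    return dot(x, matvec(A, x))

# Hypothetical example values; A is deliberately not symmetric.
A = [[1, 2], [3, 4]]
x = [5, 6]
h = [1, 2]  # plays the role of dx

n = len(A)
A_plus_At = [[A[i][j] + A[j][i] for j in range(n)] for i in range(n)]

x_plus_h = [xi + hi for xi, hi in zip(x, h)]
exact_change = phi(A, x_plus_h) - phi(A, x)
linear_part = dot(x, matvec(A_plus_At, h))   # x^T (A + A^T) h
second_order = dot(h, matvec(A, h))          # h^T A h

print(exact_change, linear_part, second_order)  # 213 186 27
```

Here the exact change equals the linear part plus the second-order term, so the formula for $\mathrm{d}\phi$ does seem to capture the part that is linear in the increment.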
I'm having trouble understanding this step: why is $(\mathrm{d} x)^{T} A x=x^{T} A^{T} \mathrm{d} x$?
I find it interesting because, based on the rule $$ (AB)^{T}=B^{T}A^{T} $$ we have $$ (\mathrm{d} x)^{T} (A x)=[(Ax)^{T} \mathrm{d} x]^{T} $$ Therefore $(\mathrm{d} x)^{T} A x=x^{T} A^{T} \mathrm{d} x$ is essentially saying that $[(Ax)^{T} \mathrm{d} x]^{T}=(Ax)^{T} \mathrm{d} x$.
Seeing why this is the case would help me understand this a lot.
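For what it's worth, the identity does hold numerically when I try it. Below is a minimal sketch in plain Python; the values of `A`, `x`, and `dx` are arbitrary choices of mine, and the helpers are hypothetical. It evaluates both $(\mathrm{d}x)^{T}Ax$ and $x^{T}A^{T}\mathrm{d}x$ for a non-symmetric $A$.

```python
def matvec(M, v):
    # Matrix-vector product for a matrix stored as a list of rows.
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

def dot(u, v):
    # Inner product u^T v; the result is a single number.
    return sum(ui * vi for ui, vi in zip(u, v))

def transpose(M):
    # Swap rows and columns.
    return [list(col) for col in zip(*M)]

# Hypothetical example values; A is deliberately not symmetric.
A = [[1, 2], [3, 4]]
x = [5, 6]
dx = [7, 8]

lhs = dot(dx, matvec(A, x))             # (dx)^T A x
rhs = dot(x, matvec(transpose(A), dx))  # x^T A^T dx

print(lhs, rhs)  # 431 431
```

Both sides come out as the same single number, so the check agrees with the claimed step, but it does not tell me why transposing leaves the expression unchanged.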
I do know this post, but I don't understand why we need the chain rule here, because the chain rule is: Let $x=x(t)$ and $y=y(t)$ be differentiable at $t$ and suppose that $z=f(x, y)$ is differentiable at the point $(x(t), y(t))$. Then $z=f(x(t), y(t))$ is differentiable at $t$ and $$ \frac{d z}{d t}=\frac{\partial z}{\partial x} \frac{d x}{d t}+\frac{\partial z}{\partial y} \frac{d y}{d t} $$
In my case there is no other variable $t$ at a lower level than $x$ and $y$. There is just one level here, namely $x$.
Thank you all in advance for your answers.