Let $f: \mathbb{R}^n \rightarrow \mathbb{R}$ with $f(x) = x^T A x$ for some $A \in \mathbb{R}^{n \times n}$. I will write $\nabla_x$ or $\nabla$ for the gradient with respect to a vector-valued variable and $\nabla^2$ or $H$ for the Hessian.
The lecturer stated that $\nabla f(x) = 2 A x$, and that $\nabla^2 f(x) = 2A$.
It's not immediately clear to me that this is true. My only reasoning was that $\nabla f(x)$ always yields a column vector, and that therefore $\nabla f(x) = 2 A x$. But this feels more like a trick for remembering the result than a proof.
How does one derive $\nabla_x x^T A x = 2 A x$? Why can't it be $2 (x^T A)^T = 2 A^T x$?
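
One way to probe the two candidates is a finite-difference check. The sketch below is only an illustration: the helper names (`quad_form`, `numerical_gradient`, `close`), the $2 \times 2$ matrix, and the test point are assumptions chosen for the example, not anything from the lecture. It uses plain Python lists so that it runs without extra libraries.

```python
def quad_form(A, x):
    """f(x) = x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def numerical_gradient(A, x, eps=1e-6):
    """Central finite-difference approximation of the gradient of x^T A x."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        grad.append((quad_form(A, x_plus) - quad_form(A, x_minus)) / (2 * eps))
    return grad

def close(u, v, tol=1e-6):
    """True if two vectors agree entry by entry up to tol."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))

# Hypothetical example data: A is deliberately NOT symmetric.
A = [[1.0, 2.0],
     [0.0, 3.0]]
x = [1.0, -2.0]
At = transpose(A)

g = numerical_gradient(A, x)
two_Ax = [2 * v for v in matvec(A, x)]                      # candidate 2 A x
two_Atx = [2 * v for v in matvec(At, x)]                    # candidate 2 A^T x
sum_x = [a + b for a, b in zip(matvec(A, x), matvec(At, x))]  # (A + A^T) x

matches_2Ax = close(g, two_Ax)
matches_2Atx = close(g, two_Atx)
matches_sum = close(g, sum_x)

# Symmetric part S = (A + A^T) / 2 defines the same quadratic form.
S = [[(A[i][j] + At[i][j]) / 2 for j in range(2)] for i in range(2)]
g_sym = numerical_gradient(S, x)
matches_sym = close(g_sym, [2 * v for v in matvec(S, x)])

print(matches_2Ax, matches_2Atx, matches_sum, matches_sym)
```

For this non-symmetric example, neither $2Ax$ nor $2A^T x$ matches the finite-difference gradient, while $(A + A^T)x$ does. When $A$ is symmetric the three expressions coincide, which suggests the lecturer's formula implicitly assumes $A = A^T$.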