With $A \in \mathbb{R}^{n \times n}$ and $f : \mathbb{R}^n \rightarrow \mathbb{R}$ given by
\begin{equation}
f(x) = x^T A x,
\end{equation}
our target is the gradient of $f$. Fix $x \in \mathbb{R}^n$, let $h \in \mathbb{R}^n$ be an arbitrary vector, and consider $\phi : \mathbb{R} \rightarrow \mathbb{R}$ given by
\begin{equation}
\phi(t) = f(x + th).
\end{equation}
Then
\begin{multline}
\phi(t) = (x+th)^TA(x+th) = x^TAx + x^TAth + th^TAx + th^TAth \\ = \phi(0) + t x^T Ah + t x^T A^T h + t^2 h^T A h = \phi(0) + t x^T(A + A^T)h + t^2 h^T A h,
\end{multline}
which implies
\begin{equation}
\frac{\phi(t) - \phi(0)}{t - 0} = x^T (A + A^T) h + t h^T Ah \rightarrow x^T (A + A^T)h, \quad t \rightarrow 0, \quad t \not = 0.
\end{equation}
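This limit is the directional derivative of $f$ at $x$ along $h$. Because it is linear in $h$ and the remainder $h^T A h$ is bounded by $\|A\| \, \|h\|^2$, $f$ is differentiable at $x$ and its derivative at $x$ is the linear map given by
\begin{equation}
\nabla_x f(h) = x^T(A+A^T) h.
\end{equation}
Equivalently, the gradient vector is $(A + A^T)x$, which reduces to $2Ax$ when $A$ is symmetric.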
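The formula can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy and randomly generated $A$, $x$ and $h$; the names `f`, `grad_f` and `directional_quotient` are illustrative and not part of the derivation. It compares the difference quotient $(\phi(t) - \phi(0))/t$ with $x^T(A+A^T)h$ for shrinking $t$.

```python
import numpy as np


def f(x, A):
    """Quadratic form f(x) = x^T A x."""
    return x @ A @ x


def grad_f(x, A):
    """Gradient vector (A + A^T) x, so that grad_f(x, A) @ h = x^T (A + A^T) h."""
    return (A + A.T) @ x


def directional_quotient(x, h, A, t):
    """Difference quotient (phi(t) - phi(0)) / t with phi(t) = f(x + t h)."""
    return (f(x + t * h, A) - f(x, A)) / t


# Arbitrary test data (an assumption for illustration); A is deliberately not symmetric.
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
h = rng.standard_normal(n)

predicted = grad_f(x, A) @ h  # x^T (A + A^T) h

for t in (1e-1, 1e-3, 1e-5):
    q = directional_quotient(x, h, A, t)
    # According to the derivation, q - predicted equals t * h^T A h.
    print(f"t = {t:.0e}  quotient = {q:.8f}  error = {abs(q - predicted):.2e}")
```

The printed error should shrink roughly in proportion to $t$, matching the term $t\,h^TAh$ in the difference quotient.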
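It is worth stressing that, since $h^T A x$ is a scalar and therefore equals its own transpose,
\begin{equation}
h^T A x = h^T (Ax) = (Ax)^T h = (x^T A^T) h = x^T A^T h,
\end{equation}
as this identity is what allows the two cross terms to be combined into $t \, x^T (A + A^T) h$.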
http://www4.ncsu.edu/~pfackler/MatCalc.pdf
– RuiQi May 20 '16 at 09:06