4

I am trying to find the derivative of the expression below using the product rule, but I cannot get the expected result. Below is my attempt.

$ \frac{d}{dx}(x^TAx) \\= \frac{d}{dx}x^T(Ax) + (x^T)\frac{d}{dx}Ax \\ = Ax +x^TA$

I'm unable to get $x^T(A+A^T)$. Any help is appreciated, thanks.

RuiQi

2 Answers

4

With $f : \mathbb{R}^n \rightarrow \mathbb{R}$ given by \begin{equation} f(x) = x^T A x, \end{equation} our target is the gradient of $f$. Let $x \in \mathbb{R}^n$ be fixed, let $h \in \mathbb{R}^n$ be any vector, and consider $\phi : \mathbb{R} \rightarrow \mathbb{R}$ given by \begin{equation} \phi(t) = f(x + th). \end{equation} Then \begin{multline} \phi(t) = (x+th)^TA(x+th) = x^TAx + x^TAth + th^TAx + th^TAth \\ = \phi(0) + t x^T Ah + t x^T A^T h + t^2 h^T A h = \phi(0) + t x^T(A + A^T)h + t^2 h^T A h, \end{multline} which implies \begin{equation} \frac{\phi(t) - \phi(0)}{t - 0} = x^T (A + A^T) h + t h^T Ah \rightarrow x^T (A + A^T)h, \quad t \rightarrow 0, \quad t \neq 0. \end{equation} By definition, this shows that $f$ is differentiable at $x$ and the gradient at $x$ is the linear map given by \begin{equation} \nabla_x f(h) = x^T(A+A^T) h. \end{equation} It is worth stressing that, because $h^T A x$ is a scalar and therefore equal to its own transpose, \begin{equation} h^T A x = h^T (Ax) = (Ax)^T h = (x^T A^T) h = x^T A^T h, \end{equation} as this transformation plays a critical role in the argument.
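The limit above can be checked numerically. The following is only a minimal sketch in plain Python: the matrix `A`, the vectors `x` and `h`, the step `t`, and all helper names are illustrative assumptions, not part of the question. It compares the difference quotient $(\phi(t) - \phi(0))/t$ for a small $t$ with the closed form $x^T(A + A^T)h$.

```python
# Finite-difference check of the directional derivative x^T (A + A^T) h.
# A, x, h and t are arbitrary illustrative values (assumptions).

def matvec(M, v):
    """Matrix-vector product M v for a list-of-rows matrix."""
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

def dot(u, v):
    """Euclidean inner product u^T v."""
    return sum(a * b for a, b in zip(u, v))

def transpose(M):
    return [list(col) for col in zip(*M)]

def f(A, x):
    """The quadratic form f(x) = x^T A x."""
    return dot(x, matvec(A, x))

def directional_derivative(A, x, h):
    """x^T (A + A^T) h, written as (A^T x)^T h + (A x)^T h."""
    return dot(matvec(transpose(A), x), h) + dot(matvec(A, x), h)

A = [[1.0, 2.0], [3.0, 4.0]]   # deliberately non-symmetric
x = [1.0, -1.0]
h = [0.5, 2.0]
t = 1e-6

x_plus = [xi + t * hi for xi, hi in zip(x, h)]
quotient = (f(A, x_plus) - f(A, x)) / t   # (phi(t) - phi(0)) / t
exact = directional_derivative(A, x, h)

print(quotient, exact)
```

The two printed numbers should differ only by the remainder term $t\,h^TAh$ plus rounding error, so the gap shrinks as `t` is reduced (until floating-point cancellation takes over).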

Carl Christian
1

There is a problem in your last equality. $Ax$ is a column vector and $x^T A$ is a row vector, so their sum is not defined. I think the issue comes from the derivative $\frac{d}{dx} x^T$. Check that step again; once you find the right derivative, your answer should drop out.
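As a sketch of how that check might go, the product can be written in components. The indices $i, j, k$ and the convention that the derivative is collected into a row vector are assumptions chosen here to match the target $x^T(A+A^T)$: \begin{equation} x^T A x = \sum_{i,j} x_i A_{ij} x_j, \qquad \frac{\partial}{\partial x_k} \sum_{i,j} x_i A_{ij} x_j = \sum_{j} A_{kj} x_j + \sum_{i} x_i A_{ik} = (Ax)_k + (x^T A)_k. \end{equation} The first term is the $k$-th entry of the column vector $Ax$, so it has to be transposed before it can be added to the row vector $x^T A$: \begin{equation} \frac{d}{dx}(x^T A x) = (Ax)^T + x^T A = x^T A^T + x^T A = x^T (A + A^T). \end{equation}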

Josh R
  • 1
    Hello, I managed to solve it using the product rule that states $D[f(x)^T g(x)] = g(x)^T f'(x) + f(x)^T g'(x)$, but why is the transpose included? I thought the product rule should be $D[f(x)g(x)] = g(x)f'(x) + f(x)g'(x)$.

    http://www4.ncsu.edu/~pfackler/MatCalc.pdf

    – RuiQi May 20 '16 at 09:06
  • 1
    The problem is that you are differentiating vector-valued functions. Expressions such as "$g(x) f'(x)$" don't even make sense unless you mean the dot product of the two vectors, and (as far as I am aware) there is no such rule. – Josh R May 20 '16 at 09:14
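
For reference, here is a sketch of how the rule quoted in the first comment applies to this problem. The choice $f(x) = x$ and $g(x) = Ax$ is an assumption made for illustration, with Jacobians $f'(x) = I$ and $g'(x) = A$: \begin{equation} D[x^T (Ax)] = g(x)^T f'(x) + f(x)^T g'(x) = (Ax)^T I + x^T A = x^T A^T + x^T A = x^T (A + A^T). \end{equation} The transpose is what makes each term a row vector times an $n \times n$ Jacobian; without it, $g(x) f'(x)$ would be a column vector multiplied on the right by a matrix, which is not defined for $n > 1$.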