
I am struggling with a basic matrix calculus question.

Suppose you had $f(\textbf{x}) = (\textbf{a} - \textbf{Bx})^T(\textbf{a} - \textbf{Bx})$

where $\textbf{a}$ and $\textbf{B}$ are a vector and matrix of constants respectively. I am interested in finding the gradient of $f$ with respect to $\textbf{x}$. I would appreciate it if someone could write down the steps needed to compute this gradient.

I know that $d(\textbf{a} - \textbf{Bx})/d\textbf{x} = -\textbf{B}^T$, but I don't know how to apply the derivative operator to the product of $(\textbf{a} - \textbf{Bx})$ with itself.

EDIT: the gradient and the derivative are not the same thing. I am interested only in the gradient.

Sam

1 Answer


Recall that transposing a scalar leaves it unchanged.
Then consider the scalar function
$$\eqalign{ f &= y^Tz \\ df &= dy^Tz + y^Tdz \quad &({\rm product\, rule}) \\ &= z^Tdy + y^Tdz \quad &({\rm transposed\, first\, term}) \\ }$$
Now set $z=y$ to obtain
$$df = 2y^Tdy$$
Now let $y=(Bx-a)$, so that $dy=B\,dx$. (This $y$ is the negative of $a-Bx$, but $f=y^Ty$ is unchanged by the sign flip.) Then
$$df = 2y^TB\,dx = (2B^Ty)^T\,dx$$
So the gradient is
$$\frac{\partial f}{\partial x} = 2B^Ty = 2B^T(Bx-a)$$
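If you want to sanity-check the result $2B^T(Bx-a)$, one option is to compare it against a central-difference approximation. The following is only a minimal sketch in plain Python; the helper names (`residual`, `f`, `grad_f`, `numeric_grad`), the sizes `m` and `n`, and the random test data are all assumptions made for illustration and are not part of the question.

```python
import random

def residual(B, x, a):
    """Return y = Bx - a as a list (B is a list of rows)."""
    return [sum(B[i][j] * x[j] for j in range(len(x))) - a[i]
            for i in range(len(a))]

def f(B, x, a):
    """f(x) = (a - Bx)^T (a - Bx), i.e. the squared norm of the residual."""
    return sum(r * r for r in residual(B, x, a))

def grad_f(B, x, a):
    """Closed-form gradient 2 B^T (Bx - a) from the derivation above."""
    y = residual(B, x, a)
    return [2.0 * sum(B[i][j] * y[i] for i in range(len(a)))
            for j in range(len(x))]

def numeric_grad(B, x, a, h=1e-6):
    """Central-difference approximation of the gradient, one coordinate at a time."""
    g = []
    for j in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        g.append((f(B, xp, a) - f(B, xm, a)) / (2 * h))
    return g

# Hypothetical test data: a random 4x3 matrix B, and random vectors a and x.
random.seed(0)
m, n = 4, 3
B = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]
a = [random.uniform(-1, 1) for _ in range(m)]
x = [random.uniform(-1, 1) for _ in range(n)]

analytic = grad_f(B, x, a)
numeric = numeric_grad(B, x, a)
max_err = max(abs(p - q) for p, q in zip(analytic, numeric))
print("largest componentwise difference:", max_err)
```

Because $f$ is quadratic in $x$, the central difference agrees with the exact gradient up to floating-point rounding, so the printed difference should be tiny.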

greg