
Let the scalar field $f : \mathbb{R}^n \to \mathbb{R}$ be given by $$f(x) = x^TA^TAx - \lambda( x^T x - 1)$$ where $A$ is an $n \times n$ matrix and $\lambda$ is a scalar. How can I compute the gradient $\nabla f$?

I know that it would be the Jacobian matrix (or gradient), but is there a faster way to compute it rather than writing out everything in terms of components and taking partial derivatives?

The reason I ask is that when we considered this function in class, my classmate was able to compute the derivative as $A^T A x - \lambda x$ rather quickly.

Kuhndog

1 Answer


We write $f$ in the form $$f(x)=\langle Ax,Ax\rangle-\lambda(\langle x,x\rangle-1).$$

Since $$g: \mathbb R^n\rightarrow \mathbb R^n,\quad v\mapsto Av $$ is a linear map, its derivative at every point $v$ is the map itself: $$Dg(v)x=Ax.$$

Since $$h:\mathbb R^n\times \mathbb R^n\rightarrow \mathbb R,\quad (u,v)\mapsto \langle u,v\rangle$$ is a bilinear map, its derivative is $$Dh(u,v)(x,y)=\langle u,y\rangle+\langle x,v\rangle.$$

Using $$f(x)=h(g(x),g(x))-\lambda(h(x,x)-1)$$ and the chain rule, we find $$\begin{aligned}Df(x)(v)&=Dh(g(x),g(x))(Av,Av)-\lambda Dh(x,x)(v,v)\\&=(\langle Ax,Av\rangle+\langle Av,Ax\rangle)-\lambda(\langle x,v\rangle+\langle v,x\rangle)\\&=2\langle Ax,Av\rangle-2\lambda\langle x,v\rangle\\&=2\langle A^TAx-\lambda x,v\rangle,\end{aligned}$$ where the last step uses $\langle Ax,Av\rangle=\langle A^TAx,v\rangle$.

Since $Df(x)(v)=\langle\nabla f(x),v\rangle$ for every $v$, we finally have $$\nabla f(x)=2(A^TAx-\lambda x).$$ This is twice the expression $A^TAx-\lambda x$ quoted in the question; the factor of $2$ does not change where the gradient vanishes.
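A quick numerical sanity check of the formula $2(A^TAx-\lambda x)$ can be reassuring. The following is a minimal, dependency-free sketch that compares it against central finite differences. The helper names (`matvec`, `dot`, `numerical_grad`), the random test data, the value of `lam`, and the step size `h` are all illustrative assumptions, not part of the question.

```python
import random

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def dot(u, v):
    """Euclidean inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))

def transpose(M):
    return [list(col) for col in zip(*M)]

def f(x, A, lam):
    """f(x) = <Ax, Ax> - lam * (<x, x> - 1)."""
    Ax = matvec(A, x)
    return dot(Ax, Ax) - lam * (dot(x, x) - 1.0)

def grad_f(x, A, lam):
    """Closed-form gradient 2 * (A^T A x - lam * x)."""
    AtAx = matvec(transpose(A), matvec(A, x))
    return [2.0 * (a - lam * xi) for a, xi in zip(AtAx, x)]

def numerical_grad(func, x, h=1e-5):
    """Central-difference approximation of the gradient (assumed step h)."""
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((func(xp) - func(xm)) / (2.0 * h))
    return grad

# Arbitrary test data, chosen only for illustration.
random.seed(0)
n = 4
A = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
x = [random.gauss(0.0, 1.0) for _ in range(n)]
lam = 0.7

analytic = grad_f(x, A, lam)
numeric = numerical_grad(lambda y: f(y, A, lam), x)
max_abs_diff = max(abs(a - b) for a, b in zip(analytic, numeric))
print(max_abs_diff)
```

Because $f$ is quadratic, the central difference has no truncation error, so the printed discrepancy should be at the level of floating-point rounding.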