I'm trying to calculate the gradient with respect to $x$ of the function $$g(x) = x^TP^TPx + q^Tx + r$$ with $x, q \in \mathbb{R}^n$, $P \in \mathbb{R}^{n \times n}$, and $P$ full rank.

I've been following this answer, as it's fairly similar: https://math.stackexchange.com/a/20712/525513. Please forgive me, as I am not very experienced with vector calculus.

I create intermediate variables $u = P^TPx$ and $v = x^TP^TP$, both vectors. Next I try to use the chain rule in a similar way to the answer linked above, with $x^Tu$ and $vx$: $$\frac{dg}{dx} = \begin{bmatrix} \frac{\partial g}{\partial x^T} & \frac{\partial g}{\partial x} \end{bmatrix} \begin{bmatrix} \frac{\partial x^T}{\partial x^T} \\ \frac{\partial x}{\partial x} \end{bmatrix} = \frac{\partial g}{\partial x^T} + \frac{\partial g}{\partial x} $$

Now $\frac{\partial g}{\partial x^T} = \frac{\partial}{\partial x^T}[x^Tu] = u$

and $\frac{\partial g}{\partial x} = \frac{\partial}{\partial x}[vx] = v$

and $\frac{\partial}{\partial x}[q^Tx] = q^T$

So I have $\frac{dg}{dx} = u + v + q^T = P^TPx + x^TP^TP + q^T$

Is this correct?

gary69

1 Answer

Note that a scalar of the form $y^TAx$, where $A$ is a real matrix and $x,y$ are real column vectors, equals its own transpose, i.e. $$g = y^TAx \;=\; g^T = x^TA^Ty$$ Consequently its differential can be written as $$dg = y^TA\,dx + x^TA^Tdy$$ Setting $\,y=x,\;A=P^TP\;$ (so that $A^T=A$) and adding the $\,q^Tx\,$ term recovers the original problem; the constant $r$ contributes nothing to the differential. $$\eqalign{ dg &= x^TP^TP\,dx + x^TP^TP\,dx + q^Tdx \\ &= \Big(2x^TP^TP+q^T\Big)\,dx \\ }$$ Therefore the gradient must be the term in parentheses, $2x^TP^TP+q^T$ (a row vector), or its transpose, $2P^TPx+q$ (a column vector), depending upon your preferred layout convention.
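As a sanity check, the column-vector form of the result, $2P^TPx + q$, can be compared against a central finite-difference approximation of $g$. The following is only a minimal sketch in plain Python: the helper names (`matvec`, `transpose`, `dot`, `grad_g`, `numerical_grad`), the dimension `n = 4`, the value `r = 0.5`, and the random test data are all assumptions made for illustration, not part of the original problem.

```python
import random

def matvec(M, v):
    """Matrix-vector product M v for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    """Transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*M)]

def dot(a, b):
    """Euclidean inner product a^T b."""
    return sum(x * y for x, y in zip(a, b))

def g(x, P, q, r):
    """g(x) = x^T P^T P x + q^T x + r, using x^T P^T P x = (Px)^T (Px)."""
    Px = matvec(P, x)
    return dot(Px, Px) + dot(q, x) + r

def grad_g(x, P, q):
    """Closed-form gradient as a column vector: 2 P^T P x + q."""
    PtPx = matvec(transpose(P), matvec(P, x))
    return [2.0 * a + b for a, b in zip(PtPx, q)]

def numerical_grad(x, P, q, r, h=1e-6):
    """Central finite differences, one coordinate at a time."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((g(x_plus, P, q, r) - g(x_minus, P, q, r)) / (2.0 * h))
    return grad

# Illustrative random data (assumed values, chosen only for the check).
random.seed(0)
n = 4
P = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
q = [random.uniform(-1.0, 1.0) for _ in range(n)]
x = [random.uniform(-1.0, 1.0) for _ in range(n)]
r = 0.5

analytic = grad_g(x, P, q)
numeric = numerical_grad(x, P, q, r)
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
print("max |analytic - numeric| =", max_err)
```

Because $g$ is quadratic, the central difference has no truncation error, so any remaining discrepancy should come from floating-point rounding alone. The check does not rely on $P$ being full rank.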

greg