Gradient of the form $(\textbf{x}-\textbf{x}_k)^TA(\textbf{x}-\textbf{x}_k)$

Question

In the context of a convex optimization problem I came across the following function:

$$f_1(\textbf{x})=(\textbf{x}-\textbf{x}_k)^T\textbf{A}(\textbf{x}-\textbf{x}_k) - t^2$$

EDIT
$f_1$ is a real-valued function and $t$ is a positive real number. Also $\textbf{A}$ is positive definite.

I am trying to take its gradient following the rules I found here. Starting from the given form I get:
\begin{eqnarray*}
f_1(\textbf{x})&=&\textbf{x}^T\textbf{A}\textbf{x}-\textbf{x}^T\textbf{A}\textbf{x}_k-\textbf{x}_k^T\textbf{A}\textbf{x}+\textbf{x}_k^T\textbf{A}\textbf{x}_k-t^2\\
&=&\textbf{x}^T\textbf{A}\textbf{x}-2\textbf{x}_k^T\textbf{A}\textbf{x}+\textbf{x}_k^T\textbf{A}\textbf{x}_k-t^2\\
&=&\textbf{x}^T\textbf{A}\textbf{x}-2\textbf{q}^T\textbf{x}+\textbf{x}_k^T\textbf{A}\textbf{x}_k-t^2
\end{eqnarray*}

where in the last line I have defined the row vector $\textbf{q}^T=\textbf{x}_k^T\textbf{A}$.

So what I get is: $$\nabla f_1(\textbf{x})=\textbf{A}\textbf{x}-2\textbf{q}=\textbf{A}\textbf{x}-2\textbf{A}\textbf{x}_k=\textbf{A}(\textbf{x}-2\textbf{x}_k)$$

But looking at my notes I see that the result is: $$\nabla f_1(\textbf{x})=2\textbf{A}(\textbf{x}-\textbf{x}_k)$$

Am I doing something wrong? Also, does it matter if $\textbf{A}$ is positive definite or not?

@mathcounterexamples.net Please see the edit. Thanks for pointing this out. — mgus, Nov 21 '15 at 14:41

Empiricist · Accepted Answer · 2015-11-21T15:04:32.877


Assume $A$ is positive definite, i.e. $A+A^T$ is SPD. Expanding the quadratic form, we have

$$f_1(\mathbf{x}) = \mathbf{x}^TA\mathbf{x} - \mathbf{x}_k^T(A+A^T)\mathbf{x} + \mathbf{x}_k^T A \mathbf{x}_k - t^2.$$

The second-last term should have a positive sign although it has no effect in calculating the gradient.
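For completeness, here is a short sketch of why the two cross terms in the expansion combine into a single term involving $A+A^T$. It uses only the fact that a scalar equals its own transpose, so no symmetry of $A$ is assumed:

$$\mathbf{x}^TA\mathbf{x}_k = (\mathbf{x}^TA\mathbf{x}_k)^T = \mathbf{x}_k^TA^T\mathbf{x}, \qquad\text{hence}\qquad -\mathbf{x}^TA\mathbf{x}_k - \mathbf{x}_k^TA\mathbf{x} = -\mathbf{x}_k^T(A+A^T)\mathbf{x}.$$

Since this term is linear in $\mathbf{x}$, its gradient is $-(A+A^T)^T\mathbf{x}_k = -(A+A^T)\mathbf{x}_k$.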

Using the formula there for the gradient of $\mathbf{x}^TA\mathbf{x}$, we have $$\nabla \mathbf{x}^TA\mathbf{x} = (A+A^T)\mathbf{x}, $$ and therefore $$\nabla f_1 = (A+A^T)\mathbf{x} - (A+A^T)\mathbf{x}_k = (A+A^T)(\mathbf{x}-\mathbf{x}_k).$$

In particular, if $A$ is symmetric (for instance, SPD), then $A+A^T=2A$, so $$ \nabla \mathbf{x}^TA\mathbf{x} = 2A\mathbf{x}, $$ and $$\nabla f_1 = 2A(\mathbf{x}-\mathbf{x}_k).$$
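The two gradient formulas can be sanity-checked numerically. The sketch below is not part of the original answer: it assumes NumPy, and the names `f1`, `grad_f1` and `numerical_grad` are made up for illustration. It compares a central finite-difference gradient against $(A+A^T)(\mathbf{x}-\mathbf{x}_k)$ for a random non-symmetric $A$, and against $2A(\mathbf{x}-\mathbf{x}_k)$ for a symmetric positive definite matrix.

```python
import numpy as np

def f1(x, A, x_k, t):
    """Evaluate (x - x_k)^T A (x - x_k) - t^2."""
    d = x - x_k
    return d @ A @ d - t**2

def grad_f1(x, A, x_k):
    """General closed-form gradient, valid for any square A."""
    return (A + A.T) @ (x - x_k)

def numerical_grad(f, x, h=1e-5):
    """Central finite differences, one coordinate at a time."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))      # generic matrix, not symmetric
x = rng.standard_normal(n)
x_k = rng.standard_normal(n)
t = 1.5                              # arbitrary positive constant

# Non-symmetric case: only the (A + A^T) formula should match.
num = numerical_grad(lambda z: f1(z, A, x_k, t), x)
general_ok = np.allclose(num, grad_f1(x, A, x_k), atol=1e-6)
naive_ok = np.allclose(num, 2 * A @ (x - x_k), atol=1e-6)

# Symmetric positive definite case: the gradient reduces to 2 S (x - x_k).
S = A @ A.T + n * np.eye(n)
num_S = numerical_grad(lambda z: f1(z, S, x_k, t), x)
sym_ok = np.allclose(num_S, 2 * S @ (x - x_k), atol=1e-6)

print(general_ok, naive_ok, sym_ok)
```

For a generic non-symmetric matrix the two formulas differ by $(A^T-A)(\mathbf{x}-\mathbf{x}_k)$, which is why the $2A$ form only agrees with the numerical gradient in the symmetric case.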

edited Nov 21 '15 at 15:04

answered Nov 21 '15 at 14:43


The remark after "in particular" should come first, because the term $-2 \mathbf{q}^T \mathbf{x}$ is obtained using $A = A^T$. – A.P. Nov 21 '15 at 14:49

Thanks for the reminder. I got the order wrong because I thought SPD was given as a condition when I first wrote the answer. – Empiricist Nov 21 '15 at 14:56
@S.W.Cheung So what you are saying is that I also need to require $\textbf{A}$ to be symmetric? – mgus Nov 21 '15 at 15:01
@koursaros Yes. In fact, as pointed out by A.P., you imposed the condition that $A$ is symmetric when you simplified the cross terms in the second line of the expansion of the quadratic form to $-2\mathbf{x}_k^T A \mathbf{x}$. Please read the edited answer for a more detailed explanation. – Empiricist Nov 21 '15 at 15:06
