0

A scalar-valued function is defined as $f(x)=x^TAx+b^Tx+c$, where $A$ is a symmetric positive definite matrix of dimension $n\times n$, $b$ and $x$ are vectors of dimension $n\times 1$, and $c$ is a scalar. Show that the minimum value of $f(x)$ occurs when $x$ equals $-\frac{A^{-1}b}{2}$.

I found an answer to the same question here, but I am unable to work out the partial derivatives. Could anyone please explain the solution?


nmasanta
  • 9,222
  • The derivative of $x^TAx$ w.r.t. $x$ is $2Ax$ and the derivative of $b^Tx$ w.r.t. $x$ is $b$. Thus the equation is $2Ax+b=0$. It remains to solve for $x$. For the derivatives see here on page 10. – callculus42 Dec 24 '21 at 17:20
  • 1
    @callculus42 You're actually confusing derivative and gradient. – Ted Shifrin Dec 24 '21 at 17:58
  • @TedShifrin I don't see any difference. Anyway, you've posted an answer. Btw, Merry Christmas. – callculus42 Dec 24 '21 at 18:38
  • 1
    One is the transpose of the other. The derivative of a map $f\colon\Bbb R^n\to\Bbb R$ is a linear map from $\Bbb R^n$ to $\Bbb R$, hence a $1\times n$ matrix; the gradient is a usual vector. Not a big deal, but this confuses lots of people. Happy holidays to you too. – Ted Shifrin Dec 24 '21 at 18:39

2 Answers

4

Here is a solution with no calculus at all. It's the same process of completing the square that we learn in the first algebra course in high school with some linear algebra thrown in.

Let $q=\frac12 A^{-1}b$. Note that $$f(x) = (x+q)^\top A (x+q) + (c-q^\top Aq).$$ Notice that we use symmetry of $A$ to get $q^\top Ax= \frac12 b^\top (A^{-1}Ax) = \frac12 b^\top x$ and similarly for $x^\top Aq$. Since $A$ is positive definite, $y^\top Ay\ge 0$, with equality holding if and only if $y=0$. Thus, $f$ attains its minimum when $x+q=0$, i.e., when $x=-q=-\frac12 A^{-1}b$.
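
To see where this choice of $q$ comes from, one can leave $q$ undetermined and match terms. The following is only a sketch of that bookkeeping, using nothing beyond the symmetry of $A$:

$$(x+q)^\top A(x+q) = x^\top Ax + 2q^\top Ax + q^\top Aq.$$

Requiring $2q^\top Ax = b^\top x$ for every $x$ forces $2Aq=b$, i.e. $q=\frac12 A^{-1}b$, and the leftover constant must then be $c-q^\top Aq$. With this $q$ we have $q^\top Aq=\frac14 b^\top A^{-1}b$, so the minimum value itself is

$$f\left(-\tfrac12 A^{-1}b\right) = c-\tfrac14\, b^\top A^{-1}b.$$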

Ted Shifrin
  • 115,160
  • No doubt your way of solving the problem is unique. But the main problem (as I see it in this case) is this: if the term $\frac 12 A^{-1}b$ were not mentioned in the question, how would we know to take $q=\frac 12 A^{-1}b$? Is there any other clue? – nmasanta Dec 25 '21 at 01:49
  • No, far from unique. Very standard. You want to pick $q$ so that the cross-terms in the quadratic give precisely the linear term. Just like in high school algebra. – Ted Shifrin Dec 25 '21 at 01:55
  • This is a nice solution @Ted Shifrin. Not super standard but maybe it should be! nmasanta, if I were trying to do it this way I would first write $q$ as an unknown and set $(x+q)^T A (x+q) + Q = f(x)$, where $Q$ must be independent of $x$. That's sort of how I work through completing the square in practice - I never remember the formula! – idl Dec 26 '21 at 04:17
0

At the minimum the derivative will be zero. The derivative of $c$ is zero, because it is a constant. The derivative of $x$ is $I_n$, so the derivative of $b^Tx$ is $b^T$ (numerator layout) or $b$ (denominator layout).

The only tricky bit is the quadratic term.

If $y = Ax$, then $\frac{\partial y}{\partial x} = A$, so, by the product rule, $\frac{\partial x^TAx}{\partial x} = \frac{\partial x^T}{\partial x}Ax+x^TA$.

$\frac{\partial x^T}{\partial x}Ax=x^TA^T$ (see, e.g. your Wikipedia link), so collecting the terms in $x^T$, $\frac{\partial x^TAx}{\partial x} = x^T(A+A^T)=2x^TA$ (since $A$ is symmetric, $A=A^T$)

For the whole derivative to vanish, $2x^TA+b^T=0 \Rightarrow x^T=-\frac{b^TA^{-1}}{2}$, i.e. $x=-\frac{A^{-1}b}{2}$ (using $A^{-T}=A^{-1}$, since $A$ is symmetric).
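
As a quick numerical sanity check, here is a minimal pure-Python sketch. The $2\times 2$ matrix `A`, the vector `b`, the constant `c`, and the helper names `f` and `gradient` are all made up for illustration. It evaluates the gradient $2Ax+b$ at $x=-\frac12 A^{-1}b$ and compares $f$ there against randomly perturbed points.

```python
import random

def f(A, b, c, x):
    """Evaluate f(x) = x^T A x + b^T x + c for a list-of-lists A and lists b, x."""
    n = len(x)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    lin = sum(b[i] * x[i] for i in range(n))
    return quad + lin + c

def gradient(A, b, x):
    """Gradient 2 A x + b; this form relies on A being symmetric."""
    n = len(x)
    return [2 * sum(A[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]

# Hypothetical example data: A is symmetric with A[0][0] > 0 and det = 11 > 0,
# so it is positive definite.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, -2.0]
c = 5.0

# Closed-form candidate x* = -A^{-1} b / 2, using the explicit 2x2 inverse.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_inv = [[A[1][1] / det, -A[0][1] / det],
         [-A[1][0] / det, A[0][0] / det]]
x_star = [-0.5 * (A_inv[i][0] * b[0] + A_inv[i][1] * b[1]) for i in range(2)]

# The gradient should vanish at x* up to rounding error.
grad_at_min = gradient(A, b, x_star)

# No randomly perturbed point should give a smaller value of f.
random.seed(0)
f_min = f(A, b, c, x_star)
no_lower_value = all(
    f(A, b, c, [x_star[0] + random.uniform(-1, 1),
                x_star[1] + random.uniform(-1, 1)]) >= f_min - 1e-12
    for _ in range(1000)
)

print(x_star, grad_at_min, no_lower_value)
```

Random sampling is of course not a proof. It only illustrates that the stationary point found above behaves like a minimum for this particular positive definite $A$.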

AlDante
  • 302
  • 1
  • 10
  • Note that this is exactly analogous to the scalar case: $\frac{d}{dx} (ax^2 + bx + c) = 2ax + b$, which vanishes at $x=\frac{-b}{2a}=\frac{-ba^{-1}}{2}$, a minimum when $a>0$. – AlDante Dec 24 '21 at 21:01
  • 2
    The derivative of $x$ is $I_n$. The derivative of $x^T$ is $x \mapsto x^T$. – Trevor Gunn Dec 24 '21 at 21:16
  • @TrevorGunn Thank you - I have corrected the first and am trying to simplify the explanation of the second. – AlDante Dec 24 '21 at 22:19