
Let $$V(x):=a+b^{T}x+\frac{1}{2}x^{T}Cx$$ for some $a \in \mathbb{R}$, $b \in \mathbb{R}^{n}$, $C \in \mathbb{R}^{n \times n}$. Show that for $V$ to have a strict unique minimum it is necessary that $C>0$, i.e., that $C$ be positive definite.


I have attempted to solve this multiple times and I am very confused about the proof. Please help me solve it.

2 Answers


We know that:

  1. A twice differentiable function of several variables is strictly convex on a convex set if its Hessian matrix is positive definite on the interior of the convex set.

  2. Any local minimum of a convex function is also a global minimum.

  3. A strictly convex function will have at most one global minimum.

So, basically, to guarantee that $V$ has a unique minimum, we need its Hessian to be positive definite.
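For concreteness, here is a small sketch of my own (not part of the original answer; NumPy assumed) showing how positive definiteness of a symmetric matrix can be checked numerically through its eigenvalues:

```python
import numpy as np

def is_positive_definite(C, tol=1e-12):
    """A matrix is positive definite iff it is symmetric with all eigenvalues > 0."""
    if not np.allclose(C, C.T):
        return False
    return bool(np.all(np.linalg.eigvalsh(C) > tol))

print(is_positive_definite(np.array([[2.0, 1.0], [1.0, 2.0]])))  # True  (eigenvalues 1, 3)
print(is_positive_definite(np.array([[1.0, 2.0], [2.0, 1.0]])))  # False (eigenvalues -1, 3)
```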

We have that $x = \left( {{x_1}, \ldots ,{x_n}} \right) \in {\mathbb{R}^n}$, so $V = V\left( x \right) = V\left( {{x_1}, \ldots ,{x_n}} \right).$

$$V\left( x \right) = a + {b^T}x + \frac{1}{2}{x^T}Cx = a + \sum\limits_{i = 1}^n {{b_i}{x_i}} + \frac{1}{2}\sum\limits_{i = 1}^n {\sum\limits_{j = 1}^n {{c_{ij}}{x_i}{x_j}} } $$

$$\frac{\partial }{{\partial {x_k}}}V\left( x \right) = {b_k} + \frac{1}{2} \cdot 2\sum\limits_{i = 1}^n {{c_{ki}}{x_i}} = {b_k} + \sum\limits_{i = 1}^n {{c_{ki}}{x_i}}$$

(here we take $C$ to be symmetric, $c_{ik} = c_{ki}$; this loses no generality, since only the symmetric part of $C$ enters the quadratic form).

$$\frac{\partial }{{\partial {x_l}}}\left( {\frac{\partial }{{\partial {x_k}}}V\left( x \right)} \right) = {c_{kl}}$$

Thus, the Hessian of $V$, which by definition has entries $${\left( {{H_V}\left( x \right)} \right)_{i,j}} = \frac{\partial }{{\partial {x_j}}}\left( {\frac{\partial }{{\partial {x_i}}}V\left( x \right)} \right)$$ is $${H_V}\left( x \right) = C.$$
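As a quick symbolic check of this computation (my own sketch, not part of the original answer; SymPy assumed, $n = 2$, symmetric $C$):

```python
import sympy as sp

# V(x) = a + b^T x + (1/2) x^T C x for n = 2, with C symmetric
x1, x2, a, b1, b2, c11, c12, c22 = sp.symbols('x1 x2 a b1 b2 c11 c12 c22')
x = sp.Matrix([x1, x2])
b = sp.Matrix([b1, b2])
C = sp.Matrix([[c11, c12], [c12, c22]])  # c21 = c12

V = a + (b.T * x)[0, 0] + sp.Rational(1, 2) * (x.T * C * x)[0, 0]

# The Hessian of V should be exactly C
H = sp.hessian(V, (x1, x2))
print(sp.simplify(H - C))  # Matrix([[0, 0], [0, 0]])
```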

Hence, for $V$ to have a unique global minimum, $C$ has to be positive definite.
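To connect this with the stationarity condition discussed in the comments below (a numerical sketch with made-up data; NumPy assumed): for positive definite $C$, setting the gradient $b + Cx$ to zero gives the unique minimizer $x^{*} = -C^{-1}b$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# Made-up problem data: A^T A + I is always symmetric positive definite.
A = rng.standard_normal((n, n))
C = A.T @ A + np.eye(n)
a = 1.0
b = rng.standard_normal(n)

def V(x):
    return a + b @ x + 0.5 * x @ C @ x

# Zero gradient: b + C x = 0  =>  x* = -C^{-1} b
x_star = -np.linalg.solve(C, b)

# V is strictly larger at nearby perturbed points, as expected when C > 0.
for _ in range(5):
    d = rng.standard_normal(n)
    assert V(x_star + 1e-3 * d) > V(x_star)
print("x* =", x_star, " V(x*) =", V(x_star))
```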

  • Thank you very much. I was wondering what you meant by $\frac{\partial}{\partial x_k}$ and by $b_k$? Why do we take a second derivative? Can't we take one derivative and conclude there is a minimum when it equals $0$? So $\frac{\partial }{{\partial {x_k}}}V(x)=b_{k} + \frac{1}{2}\cdot 2\sum\nolimits_i c_{ki}x_i = 0$, i.e. $\nabla V(x) = b + Cx = 0$. Then we can say that this occurs when $x=-C^{-1}b$ and plug this back into the first equation? I am just a bit confused about how to continue from there. – TorqueNoFriction Feb 01 '14 at 22:30
  • By $\frac{\partial }{{\partial {x_k}}}$ I mean the partial derivative with respect to $x_k$. Bear in mind that $V$ is a function of $x$, and $x =({x_1}, \ldots ,{x_n}) \in {\mathbb{R}^n}$ is a vector, so $V$ is a function of several variables! $\nabla V( x ) = \left( \frac{\partial V}{\partial x_1}, \frac{\partial V}{\partial x_2}, \ldots, \frac{\partial V}{\partial x_n} \right) = (0,0, \ldots ,0)$ is just a necessary condition, that is: if you know there is a minimum then this necessarily happens, but the fact that it happens is not enough to guarantee that a minimum exists. – etothepitimesi Feb 01 '14 at 22:47
  • Thank you. I am still confused about what the second derivative is proving. – TorqueNoFriction Feb 01 '14 at 23:04
  • Do you know what the Hessian of a function of several variables is? Perhaps you should read more about it: http://en.wikipedia.org/wiki/Hessian_matrix

    The partial derivatives of second order are used to compute the Hessian. If the Hessian of a function is positive definite on a convex set, then the function is strictly convex; and when we know that a function is strictly convex then, if it has a minimum, that minimum is unique.

    – etothepitimesi Feb 01 '14 at 23:11
  • Do you need the second result? "Any local minimum of a convex function is also a global minimum" – user1868607 May 18 '19 at 09:23
  • @etothepitimesi sorry for reviving this discussion but you say "A twice differentiable function of several variables is strictly convex on a convex set if and only if its Hessian matrix is positive definite on the interior of the convex set." THIS IS FALSE! Equivalence holds only for convex functions; it does not hold for strictly convex functions: the strict convexity of a function $f$ does not imply that its Hessian is everywhere positive definite. As an example, consider the function $f : \mathbb{R} \to \mathbb{R}$, $f(x) = x^4$. This function is strictly convex, but $f''(0)=0$. – wessi Mar 06 '22 at 14:55
  • @jacques99, you are correct. I'm not sure why I wrote condition 1 as being necessary and sufficient, back in 2014. I have amended my answer. thank you! – etothepitimesi Mar 11 '22 at 09:35
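A quick check of the counterexample from the last exchange above (my own sketch; SymPy assumed): $f(x) = x^4$ is strictly convex on $\mathbb{R}$, yet its second derivative vanishes at the origin, so strict convexity does not force a positive definite Hessian everywhere.

```python
import sympy as sp

x = sp.symbols('x', real=True)
f = x**4

f2 = sp.diff(f, x, 2)     # 12*x**2: nonnegative, so f is convex
print(f2, f2.subs(x, 0))  # prints: 12*x**2 0
# f''(0) = 0, even though f is strictly convex on all of R.
```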

I'll start with the assumption that $C$ is symmetric, so that it has an orthonormal basis $\{ v_{1},\cdots,v_{n}\}$ of eigenvectors with corresponding eigenvalues $\{\lambda_{1},\cdots,\lambda_{n}\}$. Then, using this basis, $$ V(\alpha_{1}v_{1}+\cdots+\alpha_{n}v_{n})=a+\sum_{j=1}^{n}\beta_{j}\alpha_{j}+\frac{1}{2}\sum_{j=1}^{n}\lambda_{j}\alpha_{j}^{2}, \; \mbox{ where } \beta_{j}=b^{T}v_{j}. $$

If one eigenvalue $\lambda_{k}$ is strictly negative, then you cannot have a minimum, because the linear terms become negligible in the following limit: $$ \lim_{\alpha_{k}\rightarrow\infty}V(\alpha_{k}v_{k})=-\infty .$$ So it is necessary that $C \ge 0$ in order to have a minimum.

If $\lambda_{k}=0$ for some $k$ and $\beta_{k}\ne 0$, then $V$ has neither a maximum nor a minimum because, in such a case, $(V-a)=\beta_{k}\alpha_{k}$ is linear in $\alpha_{k}$ while the other $\alpha_{j}$ are kept fixed at $0$. If $\lambda_{k}=0$ and $\beta_{k}=0$, then you cannot have a unique minimum because varying $\alpha_{k}$ won't affect the expression for $V$ at all. So, a unique absolute minimum requires $C > 0$.

And you can show that such an absolute minimum exists in that case: since $\lambda_{j}\ne 0$ for all $j$, completing the square allows you to write $$ V(\alpha_1 v_1+\cdots+\alpha_n v_n)=\sum_{j=1}^{n}\frac{\lambda_j}{2}\left(\alpha_j+\frac{\beta_{j}}{\lambda_{j}}\right)^{2}+K, $$ where $K$ is a constant which does not depend on the $\alpha_{j}$. Clearly the above has a unique absolute minimum if $\lambda_{j} > 0$ for all $j$.
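As a numerical illustration of the first case above (my own sketch with a made-up $C$; NumPy assumed): if some eigenvalue is negative, $V$ is unbounded below along the corresponding eigenvector, so no minimum can exist.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up symmetric C with eigenvalues (2, 1, -0.5), built as C = Q diag(lam) Q^T.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # orthonormal columns
lam = np.array([2.0, 1.0, -0.5])
C = Q @ np.diag(lam) @ Q.T

a = 0.0
b = rng.standard_normal(3)

def V(x):
    return a + b @ x + 0.5 * x @ C @ x

# Along the eigenvector with lambda < 0, the term (1/2) * lambda * t^2
# dominates the linear term, so V(t v) -> -infinity as t grows.
v = Q[:, 2]
for t in [1e1, 1e3, 1e5]:
    print(t, V(t * v))
```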

– Disintegrating By Parts