
(This is from page 474 of Boyd & Vandenberghe's Convex Optimization, on algorithms for unconstrained minimization)

Assumptions

The function $f : \mathbb{R}^N \to \mathbb{R}$ is convex and twice differentiable, and there exists an optimal point $x^*$ such that $f(x^*) \leq f(x)$ for all $x \in \text{dom}(f)$. Moreover, for the starting point $x_0$ of our algorithm, the set $S := \{ x \in \text{dom}(f) \mid f(x) \leq f(x_0) \}$ is closed. Finally, it is assumed that $f$ is strongly convex on $S$, which means that there exists an $m > 0$ such that \begin{equation} \nabla^2 f(x) \succeq mI \quad \text{for all } x \in S. \end{equation}

Claim

Because of strong convexity, we have for $x,y \in S$ \begin{equation} f(y) \geq f(x) + \nabla f(x)'(y-x) + \frac{m}{2} \| y-x \|^2_2, \end{equation} and this inequality implies that the sublevel sets contained in $S$ are bounded. I do not understand where this final claim on boundedness comes from. (To be clear, I understand the inequality itself, just not the implication of bounded sublevel sets.)
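For concreteness, the strong convexity inequality can be checked numerically on a specific function. The quadratic below is my own illustrative choice (not from the book): $f(x_1,x_2) = x_1^2 + 2x_2^2$ has Hessian $\text{diag}(2,4) \succeq 2I$, so $m = 2$ works.

```python
import math
import random

# Illustrative strongly convex function (my own example, not from the book):
# f(x1, x2) = x1^2 + 2*x2^2, with Hessian diag(2, 4), so m = 2.
M = 2.0

def f(x):
    return x[0] ** 2 + 2 * x[1] ** 2

def grad_f(x):
    return (2 * x[0], 4 * x[1])

def lower_bound(x, y):
    """Right-hand side f(x) + grad f(x)'(y - x) + (m/2) ||y - x||_2^2."""
    dx, dy = y[0] - x[0], y[1] - x[1]
    g = grad_f(x)
    return f(x) + g[0] * dx + g[1] * dy + (M / 2) * (dx ** 2 + dy ** 2)

# The strong convexity inequality f(y) >= lower_bound(x, y) should hold
# for every pair of points; check it on random samples.
random.seed(0)
for _ in range(1000):
    x = (random.uniform(-5, 5), random.uniform(-5, 5))
    y = (random.uniform(-5, 5), random.uniform(-5, 5))
    assert f(y) >= lower_bound(x, y) - 1e-9
```

For this quadratic the gap $f(y) - \text{lower\_bound}(x,y)$ equals $(y_2 - x_2)^2 \geq 0$ exactly, since only the second Hessian eigenvalue exceeds $m$.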

My attempt

Take a sublevel set $S' = \{y \mid f(y) \leq f(x) \} \subset S$. Then for $y \in S'$ the given inequality implies \begin{equation} 0 \geq f(y) - f(x) \geq \nabla f(x)'(y-x) + \frac{m}{2} \| y-x \|^2_2, \end{equation} and then somehow this should show that $\| y - x \|_2$ is bounded. Any help would be appreciated; I must be overlooking something.

2 Answers


You have

\begin{equation} f(y) \geq f(x) + \nabla f(x)'(y-x) + \frac{m}{2} \| y-x \|^2_2, \: \forall x,y \in S. \end{equation}

In particular, for $x=x^*$ this gives

\begin{equation} f(y) \geq f(x^*) + \nabla f(x^*)'(y-x^*) + \frac{m}{2} \| y-x^* \|^2_2, \: \forall y \in S. \end{equation} Since $x^*$ is a global minimizer of $f$, we have $\nabla f(x^*)=0.$ That is,

\begin{equation} f(y) \geq f(x^*) + \frac{m}{2} \| y-x^* \|^2_2, \: \forall y \in S. \end{equation}

Now, by definition of $S,$ one gets

\begin{equation} f(x_0)\ge f(y) \geq f(x^*) + \frac{m}{2} \| y-x^* \|^2_2, \: \forall y \in S, \end{equation} from which

\begin{equation} \| y-x^* \|^2_2\le \frac{2}{m}(f(x_0)- f(x^*)), \: \forall y \in S, \end{equation} which gives us the boundedness of $S.$
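As a numerical sanity check of this radius bound (again my own quadratic example, not from the book): for $f(x_1,x_2) = x_1^2 + 2x_2^2$ with $m = 2$, $x^* = (0,0)$, and $x_0 = (1,1)$, every point of $S$ should lie within distance $\sqrt{\tfrac{2}{m}(f(x_0)-f(x^*))} = \sqrt{3}$ of $x^*$.

```python
import math
import random

# Illustrative example: f(x1, x2) = x1^2 + 2*x2^2 is strongly convex with
# m = 2, minimizer x* = (0, 0), f(x*) = 0.  Starting point x0 = (1, 1).
M = 2.0

def f(x):
    return x[0] ** 2 + 2 * x[1] ** 2

x0 = (1.0, 1.0)
radius = math.sqrt((2 / M) * (f(x0) - 0.0))  # = sqrt(3) for this example

# Every sampled point of S = {y : f(y) <= f(x0)} must satisfy
# ||y - x*||_2 <= radius.
random.seed(0)
for _ in range(10000):
    y = (random.uniform(-3, 3), random.uniform(-3, 3))
    if f(y) <= f(x0):
        assert math.hypot(y[0], y[1]) <= radius + 1e-9
```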

mfl

To complete your attempted proof, you just need to apply the Cauchy-Schwarz inequality. Recall that for any vectors $u,v$, we have $|u' v|^2 \le \|u\|_2^2 \|v\|_2^2$. Taking the square root implies $-\|u\|_2 \|v\|_2 \le u'v \le \|u\|_2 \|v\|_2$. In particular: $$ u'v \ge -\|u\|_2 \|v\|_2. $$ Applying this to your inequality with $u=\nabla f(x)$ and $v= y-x$, we get $$ \begin{aligned} 0 \ &\ge \nabla f(x)'(y-x) + \frac{m}{2} \| y-x \|^2_2\\ & \ge -\| \nabla f(x)\|_2 \|y-x\|_2 + \frac{m}{2} \| y-x \|^2_2. \end{aligned} $$ Then either $\|y-x\|_2$ is zero or else we can divide by it to get $$ \|y-x\|_2 \le \frac{2}{m} \| \nabla f(x)\|_2, $$ which implies that $y$ is bounded.
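This gradient-based bound can also be checked numerically (same illustrative quadratic as in the question's setup, my own choice rather than the book's): for $f(x_1,x_2) = x_1^2 + 2x_2^2$ with $m = 2$, every $y$ in the sublevel set of $x$ should satisfy $\|y-x\|_2 \le \tfrac{2}{m}\|\nabla f(x)\|_2$.

```python
import math
import random

# Illustrative strongly convex f(x1, x2) = x1^2 + 2*x2^2, so m = 2 and
# grad f(x) = (2*x1, 4*x2).
M = 2.0

def f(x):
    return x[0] ** 2 + 2 * x[1] ** 2

def grad_norm(x):
    return math.hypot(2 * x[0], 4 * x[1])

x = (1.0, 1.0)
bound = (2 / M) * grad_norm(x)  # (2/m) ||grad f(x)||_2 = sqrt(20) here

# Every y with f(y) <= f(x) must lie within `bound` of x.
random.seed(0)
for _ in range(10000):
    y = (random.uniform(-4, 4), random.uniform(-4, 4))
    if f(y) <= f(x):
        assert math.hypot(y[0] - x[0], y[1] - x[1]) <= bound + 1e-9
```

Note the bound is loose: here it gives $\sqrt{20} \approx 4.47$, while the actual sublevel set $\{f \le 3\}$ has diameter well under that.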

Daniel Fischer
p.s.
  • What if the norm of the gradient is unbounded? – badmax Apr 01 '18 at 01:37
  • You just need a single $x$ with finite gradient to prove the bound for all $y$. – p.s. Apr 01 '18 at 16:42
  • I think it would be a bit clearer if we took $x=x_0$. Since $f$ is differentiable, the gradient at $x_0$ is just some vector in $\mathbb R^N$. The norm of any vector in $\mathbb R^N$ is finite and then the inequality shows that the initial sublevel set is bounded. – Cm7F7Bb Oct 20 '20 at 20:30