
I asked how to solve this optimization problem here. I found this approach by combining @Royi's idea from his answer with the KKT conditions. Personally, I feel my formulation is clearer and easier to understand.

Could you please verify whether my proof is correct or contains a logical mistake? Thank you so much!

Let $[\![ p ]\!]:=\{1,\ldots,p\}$ and let $(x_1,\ldots,x_p) \in \mathbb R^p$ be given. Solve the constrained optimization problem in the variable $y = (y_1,\ldots,y_p) \in \mathbb R^p$: $$\begin{align*} \min_y &\quad \frac{1}{2}\sum_{i=1}^p (y_i-x_i)^2 \\ \text{s.t.} &\quad \sum_{i=1}^p y_i - 1 = 0\\ &\quad\forall i \in [\![ p ]\!]: -y_i \le 0 \end{align*}$$

$\textbf{My attempt}$ Define $$\begin{aligned} f(y) &= \frac{1}{2}\sum_{i=1}^p (y_i-x_i)^2 \\ h(y) &= \sum_{i=1}^p y_i - 1 \\ \forall i \in [\![ p ]\!]: g_i(y) &= -y_i \end{aligned}$$ Here $f$ and each $g_i$ are convex and $h$ is affine. Let $a =(1/p, \ldots, 1/p)$. Then $h(a)=0$ and $g_i(a) <0$ for all $i \in [\![ p ]\!]$. It follows that Slater's condition is satisfied. By the Karush-Kuhn-Tucker conditions, we have $$\begin{aligned} \begin{cases} \forall i \in [\![ p ]\!]:\mu_i &\ge 0 \\ \forall i \in [\![ p ]\!]: g_i(y) &\le 0\\ h(y) &=0 \\ \forall i \in [\![ p ]\!]:\mu_i g_i(y)&=0 \\ \nabla f (y)- \lambda\nabla h (y)+ \sum_{i=1}^p \mu_i \nabla g_i (y) &=0 \end{cases} &\iff \begin{cases} \forall i \in [\![ p ]\!]:\mu_i &\ge 0 \\ \forall i \in [\![ p ]\!]:-y_i &\le 0\\ \sum_{i=1}^p y_i - 1&=0 \\ \forall i \in [\![ p ]\!]: -\mu_i y_i &=0 \\ \forall i \in [\![ p ]\!]: (y_i - x_i) -\lambda - \mu_i &= 0 \end{cases} \end{aligned}$$ where the last line of the right-hand system follows from $\partial f/\partial y_i = y_i - x_i$, $\nabla h(y) = (1,\ldots,1)$, and $\nabla g_i(y) = -e_i$.
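For reference, the stationarity condition above can be packaged as the gradient of a Lagrangian (with the sign convention on $\lambda$ that matches the system above): $$\mathcal L(y,\lambda,\mu) = \frac{1}{2}\sum_{i=1}^p (y_i-x_i)^2 - \lambda\Big(\sum_{i=1}^p y_i - 1\Big) - \sum_{i=1}^p \mu_i y_i, \qquad \frac{\partial \mathcal L}{\partial y_i} = (y_i - x_i) - \lambda - \mu_i.$$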

Stationarity gives $y_i = x_i + \lambda + \mu_i$. If $x_i+\lambda = 0$, then $y_i = \mu_i$, and complementary slackness $\mu_i y_i = 0$ forces $y_i = \mu_i = 0$, so $y_i = (x_i+\lambda)_+$. If $x_i+\lambda > 0$, then $y_i = x_i + \lambda + \mu_i \ge x_i + \lambda > 0$, hence $\mu_i = 0$ and $y_i = x_i + \lambda = (x_i+\lambda)_+$. If $x_i+\lambda < 0$, then $\mu_i = 0$ would give $y_i = x_i + \lambda < 0$, contradicting $y_i \ge 0$; hence $\mu_i > 0$, and complementary slackness gives $y_i = 0 = (x_i+\lambda)_+$. In all cases, $y_i = (x_i+\lambda)_+$.

Then $\sum_{i=1}^p y_i - 1=0 \iff \sum_{i=1}^p (x_i+\lambda)_+ - 1=0$. Notice that $(x_i+\lambda)_+ = \max \{x_i+\lambda,0\}$ is continuous in $\lambda$ for each $i \in [\![ p ]\!]$, so $\psi(\lambda) = \sum_{i=1}^p (x_i+\lambda)_+ - 1$ is continuous in $\lambda$. Let $\alpha = -\max_{i \in [\![ p ]\!]}|x_i|$ and $\beta =1+ \max_{i \in [\![ p ]\!]}|x_i|$. Then every term $(x_i+\alpha)_+$ vanishes, so $\psi(\alpha)=-1<0\le\psi(\beta)$. By the Intermediate Value Theorem, the equation $\psi(\lambda)=0$ has a solution in $[\alpha,\beta]$. Moreover, $\psi$ is nondecreasing, and at any root $\sum_{i=1}^p (x_i+\lambda)_+ = 1 > 0$, so at least one term is active there and increases strictly; hence the root is unique. Numerically, we can find it by bisection on the interval $[\alpha , \beta]$.
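To make the last step concrete, here is a minimal numerical sketch of this bisection in Python (the function name and tolerance are my own choices, not part of the proof):

```python
import numpy as np

def project_simplex_bisection(x, tol=1e-10):
    """Project x onto the probability simplex by solving
    psi(lam) = sum_i (x_i + lam)_+ - 1 = 0 with bisection,
    following the proof above."""
    x = np.asarray(x, dtype=float)
    psi = lambda lam: np.maximum(x + lam, 0.0).sum() - 1.0
    alpha = -np.abs(x).max()        # psi(alpha) = -1 < 0
    beta = 1.0 + np.abs(x).max()    # psi(beta) >= 0
    while beta - alpha > tol:
        mid = 0.5 * (alpha + beta)
        if psi(mid) < 0.0:
            alpha = mid             # root lies to the right of mid
        else:
            beta = mid              # root lies at or to the left of mid
    lam = 0.5 * (alpha + beta)
    return np.maximum(x + lam, 0.0)  # y_i = (x_i + lambda)_+

x = np.array([0.5, -1.0, 2.0])
y = project_simplex_bisection(x)
print(y, y.sum())  # expect [0, 0, 1] and sum ~ 1
```

For example, with $x = (0.5, -1, 2)$ the root is $\lambda = -1$, giving $y = (0, 0, 1)$.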

Akira
  • This optimisation problem can be solved in at most $n$ steps. You don't need Newton's method. The key is that projection onto a convex set $C$ is the same as projecting onto the affine hull first and then projecting onto $C$. When you project onto the affine hull you can drop inactive constraints (at least one, if not optimal) and repeat. (A sketch of this idea appears after these comments.) – copper.hat Feb 28 '20 at 17:57
  • Thank you @copper.hat! I got your point about the complexity of the algorithm. Do you feel my proof is fine? – Akira Feb 28 '20 at 18:02
  • I am not exactly sure what you are proving? – copper.hat Feb 28 '20 at 18:08
  • Ah @copper.hat I meant my solution/approach to solve the system of equations from KKT. – Akira Feb 28 '20 at 18:14
  • Sorry, I might be slow this morning, but I am not seeing a method above. The existence of a solution is known because the cost is continuous and the feasible set is compact. – copper.hat Feb 28 '20 at 18:26
  • @copper.hat I meant am I correct that we solve the minimization problem by solving $\sum_{i=1}^p (x_i+\lambda)_+ - 1=0$ ;) – Akira Feb 28 '20 at 18:34
  • Hi @Royi, in this question, I ask for proof verification. It's not the same as the other one. – Akira Feb 28 '20 at 19:06
  • But your solution is exactly what I did in https://math.stackexchange.com/questions/2402504. So I am not sure what you're doing. – Royi Feb 28 '20 at 19:07
  • Honestly, I'm unable to follow your logic in that thread @Royi. I cannot understand "The trick is to leave the non-negativity constraint implicit" in my understanding of the KKT theorem. I meant your solution does not match how I understand KKT. – Akira Feb 28 '20 at 19:12
  • But this is exactly what you do here :-). – Royi Feb 28 '20 at 19:13
  • @Royi but I don't leave any constraint implicit ^^. I just wrote down the KKT system of equations and solved it. Of course, how I solved it is inspired by your expression $(x_i+\lambda)_+$ ;) – Akira Feb 28 '20 at 19:20
  • 1
    I need more time to look at it, your equation may be correct. However it is not differentiable, so you cannot use Newton's method. Also, there is a simpler way, but I need some time to resurrect some old memories after I finish work. – copper.hat Feb 28 '20 at 19:31
  • Ah @copper.hat, I meant to use the Intermediate Value Theorem (bisection) to solve the equation $\psi(\lambda)=0$ on the interval $[\alpha , \beta]$. – Akira Feb 28 '20 at 19:40
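copper.hat's comments point to a finite, Newton-free procedure. A minimal sketch of one standard realization of that idea, the sorting-based projection onto the probability simplex (this is my reading of the comment, not code from this thread; the function name is mine):

```python
import numpy as np

def project_simplex_sort(x):
    """Exact projection of x onto the probability simplex in O(p log p).
    Sort descending, find the largest k whose shifted top-k entries stay
    positive, then shift everything by lambda and clip at zero. This is
    one concrete version of 'project onto the affine hull, then drop the
    coordinates that come out negative, and repeat'."""
    u = np.sort(np.asarray(x, dtype=float))[::-1]   # descending order
    css = np.cumsum(u)
    k = np.arange(1, len(u) + 1)
    rho = k[u + (1.0 - css) / k > 0.0][-1]          # largest valid k
    lam = (1.0 - css[rho - 1]) / rho                # the multiplier lambda
    return np.maximum(x + lam, 0.0)                 # y_i = (x_i + lambda)_+
```

Both routes recover the same $\lambda$; this one finds it exactly in finitely many steps instead of by bisection.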
