
I'm trying to minimize $\frac{1}{2}||x - d||^2 + \lambda ||x||$ with respect to $x$ where the norm concerned is the $L_2$ norm, and $x$ and $d$ are vectors.

I think the answer I should be arriving at is $[1 - \frac{\lambda}{||d||}]_+ d$.

EDIT: In an attempt to answer my own question after reading up on subgradients: the optimality condition is $$0 \in x - d + \lambda \partial ||x|| $$ where $\partial$ denotes the subdifferential. The analysis branches into two scenarios:

1) If $x=0$, then the optimality condition becomes $$0 \in -d + \lambda \{g : ||g||\leq 1 \}$$ Rearranging the terms yields that $||d|| \leq \lambda $. Thus, the minimizer in this case is $\hat{x} = 0$ when $||d|| \leq \lambda$.

2) If $x \neq 0$, then the optimality condition becomes $$ 0 = x - d + \lambda \frac{x}{||x||}$$ which implies $x = d - \lambda \frac{x}{||x||}$. The next step is $$ x = d - \lambda \frac{x}{||x||} \iff \hat{x} = d - \lambda \frac{d}{||d||} \tag{*}$$ How does one arrive at and intuit step $(*)$? I can verify that it is true, but do not know how I would have derived it had I not known the answer I'm supposed to get to. I can finish the rest, but I would really appreciate help on step $(*)$! Thanks in advance.
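
As a quick numerical sanity check of the claimed closed form (just an illustrative sketch, not part of the derivation; the helper name `block_soft_threshold` is made up here), one can compare it against a generic nonsmooth minimizer:

```python
import numpy as np
from scipy.optimize import minimize

def block_soft_threshold(d, lam):
    """Claimed minimizer of 0.5*||x - d||^2 + lam*||x||_2."""
    nd = np.linalg.norm(d)
    return np.zeros_like(d) if nd <= lam else (1.0 - lam / nd) * d

rng = np.random.default_rng(0)
d = rng.normal(size=3)
lam = 0.7

objective = lambda x: 0.5 * np.sum((x - d) ** 2) + lam * np.linalg.norm(x)

x_closed = block_soft_threshold(d, lam)
# Nelder-Mead copes with the nonsmooth term at x = 0 in low dimension.
x_num = minimize(objective, x0=d, method="Nelder-Mead",
                 options={"xatol": 1e-10, "fatol": 1e-12}).x

print(objective(x_closed) <= objective(x_num) + 1e-8)   # closed form is no worse
print(np.allclose(x_closed, x_num, atol=1e-3))          # and the minimizers agree
```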


2 Answers


To establish $x = d - \lambda \frac{x}{\|x\|}$, rearrange the stationarity condition as $d = \left(1 + \frac{\lambda}{\|x\|}\right) x$. Since $1 + \frac{\lambda}{\|x\|} > 0$, the vector $d$ is a positive multiple of $x$, i.e. the two vectors point in the same direction. Consequently $\frac{x}{\|x\|} = \frac{d}{\|d\|}$, and the two unit vectors are interchangeable in step $(*)$.
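
As a small numerical illustration of this collinearity (a sketch under the assumption $\|d\| > \lambda$, so that the minimizer is nonzero):

```python
import numpy as np

rng = np.random.default_rng(1)
d = rng.normal(size=4)
lam = 0.5 * np.linalg.norm(d)            # ensures ||d|| > lambda, i.e. x != 0

x_hat = d - lam * d / np.linalg.norm(d)  # step (*)

# Stationarity: x - d + lam * x / ||x|| vanishes at x_hat.
residual = x_hat - d + lam * x_hat / np.linalg.norm(x_hat)
print(np.linalg.norm(residual))          # ~0 (up to floating point)

# Collinearity: x / ||x|| and d / ||d|| are the same unit vector.
print(np.allclose(x_hat / np.linalg.norm(x_hat), d / np.linalg.norm(d)))  # True
```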


One could see that the Support Function of the Unit Ball of $ {\ell}_{2} $ is given by:

$$ {\sigma}_{C} \left( x \right) = {\left\| x \right\|}_{2}, \; C = {B}_{{\left\| \cdot \right\|}_{2}} \left[0, 1\right] $$
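
As a quick numerical illustration of this Support Function identity (an added sketch, not part of the original argument):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=4)

# The supremum of <g, x> over ||g||_2 <= 1 is attained at g = x / ||x||_2.
best = np.dot(x / np.linalg.norm(x), x)
print(np.isclose(best, np.linalg.norm(x)))   # True: sigma_C(x) = ||x||_2

# Random feasible directions never do better (Cauchy-Schwarz).
g = rng.normal(size=(1000, 4))
g /= np.maximum(np.linalg.norm(g, axis=1, keepdims=True), 1.0)  # push into the unit ball
print(np.all(g @ x <= best + 1e-12))         # True
```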

The Fenchel Conjugate (Dual Function) of $ {\sigma}_{C} \left( x \right) $ is the Indicator Function:

$$ {\sigma}_{C}^{\ast} \left( x \right) = {\delta}_{C} \left( x \right) $$

Now, using Moreau Decomposition (Someone needs to create a Wikipedia page for that) $ x = \operatorname{Prox}_{\lambda f \left( \cdot \right)} \left( x \right) + \lambda \operatorname{Prox}_{ \frac{{f}^{\ast} \left( \cdot \right)}{\lambda} } \left( \frac{x}{\lambda} \right) $ one could see that:

$$ \operatorname{Prox}_{\lambda {\left\| \cdot \right\|}_{2}} \left( x \right) = \operatorname{Prox}_{\lambda {\sigma}_{C} \left( \cdot \right)} \left( x \right) = x - \lambda \operatorname{Prox}_{ \frac{{\delta}_{C} \left( \cdot \right)}{\lambda} } \left( \frac{x}{\lambda} \right) $$
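
As a quick sanity check of the decomposition itself, here is a scalar sketch with $ f = \left| \cdot \right| $, whose Prox is the well known Soft Threshold and whose Conjugate is the Indicator of $ \left[ -1, 1 \right] $ (an added illustration, not part of the derivation):

```python
import numpy as np

def soft_threshold(x, lam):
    """prox of lam * |.| in 1D (standard soft threshold)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

lam = 0.8
x = np.linspace(-3.0, 3.0, 13)

# f = |.|  =>  f* = indicator of [-1, 1]; its prox at any scale is the clip / projection.
rhs = soft_threshold(x, lam) + lam * np.clip(x / lam, -1.0, 1.0)

print(np.allclose(x, rhs))   # True: x = prox_{lam f}(x) + lam * prox_{f*/lam}(x / lam)
```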

It is known that $ \operatorname{Prox}_{ {\delta}_{C} \left( \cdot \right) } \left( x \right) = \operatorname{Proj}_{C} \left( x \right) $, namely the Orthogonal Projection onto the set. Note also that $ \frac{{\delta}_{C} \left( \cdot \right)}{\lambda} = {\delta}_{C} \left( \cdot \right) $ for any $ \lambda > 0 $, so the scaling by $ \frac{1}{\lambda} $ can be dropped inside the Prox.

In the case above, where $ C = {B}_{{\left\| \cdot \right\|}_{2}} \left[0, 1\right] $, the projection is given by:

$$ \operatorname{Proj}_{C} \left( x \right) = \frac{x}{\max \left( \left\| x \right\|, 1 \right)} $$
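
In code, this projection is simply a rescaling whenever the point lies outside the ball (an illustrative sketch; `proj_unit_ball` is a name chosen here):

```python
import numpy as np

def proj_unit_ball(x):
    """Orthogonal projection onto the closed L2 unit ball."""
    return x / max(np.linalg.norm(x), 1.0)

print(proj_unit_ball(np.array([3.0, 4.0])))   # [0.6, 0.8], rescaled onto the sphere
print(proj_unit_ball(np.array([0.3, 0.4])))   # unchanged, already inside the ball
```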

Which yields:

$$ \begin{align} \operatorname{Prox}_{\lambda {\left\| \cdot \right\|}_{2}} \left( x \right) & = \operatorname{Prox}_{\lambda {\sigma}_{C} \left( \cdot \right)} \left( x \right) = x - \lambda \operatorname{Prox}_{ \frac{{\delta}_{C} \left( \cdot \right)}{\lambda} } \left( \frac{x}{\lambda} \right) \\ & = x - \lambda \operatorname{Prox}_{ {\delta}_{C} \left( \cdot \right) } \left( \frac{x}{\lambda} \right) \\ & = x - \lambda \operatorname{Proj}_{C} \left( \frac{x}{\lambda} \right) \\ & = x - \lambda \frac{x / \lambda}{ \max \left( {\left\| \frac{x}{\lambda} \right\|}_{2} , 1 \right) } = x \left( 1 - \frac{\lambda}{\max \left( {\left\| x \right\|}_{2} , \lambda \right)} \right) \end{align} $$
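
Putting the pieces together in code (an added sketch; the helper names are chosen here, and the $ x $ below plays the role of the question's $ d $), the Moreau Decomposition route and the final closed form agree:

```python
import numpy as np

def proj_unit_ball(v):
    """Orthogonal projection onto the closed L2 unit ball."""
    return v / max(np.linalg.norm(v), 1.0)

def prox_l2_via_moreau(x, lam):
    """prox_{lam * ||.||_2}(x) computed as x - lam * Proj_C(x / lam)."""
    return x - lam * proj_unit_ball(x / lam)

def prox_l2_closed_form(x, lam):
    """The final expression: x * (1 - lam / max(||x||_2, lam))."""
    return x * (1.0 - lam / max(np.linalg.norm(x), lam))

rng = np.random.default_rng(2)
x = rng.normal(size=5)
for lam in (0.1, 1.0, 10.0):   # the largest value triggers the prox = 0 branch
    print(np.allclose(prox_l2_via_moreau(x, lam), prox_l2_closed_form(x, lam)))  # True
```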

Remark
Copied from a related answer of mine at Closed Form Solution of $ \arg \min_{x} {\left\| x - y \right\|}_{2}^{2} + \lambda {\left\|x \right\|}_{2} $ - Tikhonov Regularized Least Squares.
