
I'm trying to minimize $\frac{1}{2}||x - d||^2 + \lambda ||x||$ with respect to $x$ where the norm concerned is the $L_2$ norm, and $x$ and $d$ are vectors.

I think the answer I should be arriving at is $[1 - \frac{\lambda}{||d||}]_+ d$.

EDIT: In an attempt to answer my own question after reading up on subgradients: the optimality condition is $$0 \in x - d + \lambda \partial ||x|| $$ where $\partial$ denotes the subdifferential. The analysis branches into two scenarios:

1) If $x=0$, then the optimality condition becomes $$0 \in -d + \lambda \{g : ||g||\leq 1 \}$$ Rearranging the terms yields that $||d|| \leq \lambda $. Thus, the minimizer in this case is $\hat{x} = 0$ when $||d|| \leq \lambda$.

2) If $x \neq 0$, then the optimality condition becomes $$ 0 = x - d + \lambda \frac{x}{||x||}$$ which implies $x = d - \lambda \frac{x}{||x||}$. The next step is $$ x = d - \lambda \frac{x}{||x||} \iff \hat{x} = d - \lambda \frac{d}{||d||} \tag{*}$$ How does one arrive at and intuit step $(*)$? I can verify that it is true, but do not know how I would have derived it had I not known the answer I'm supposed to get to. I can finish the rest, but I would really appreciate help on step $(*)$! Thanks in advance.
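
As a quick numerical sanity check of the claimed closed form (just an illustrative sketch, not part of the derivation; the helper name `block_soft_threshold` is made up here), one can compare it against a generic nonsmooth minimizer:

```python
import numpy as np
from scipy.optimize import minimize

def block_soft_threshold(d, lam):
    """Claimed minimizer of 0.5*||x - d||^2 + lam*||x||_2."""
    nd = np.linalg.norm(d)
    return np.zeros_like(d) if nd <= lam else (1.0 - lam / nd) * d

rng = np.random.default_rng(0)
d = rng.normal(size=3)
lam = 0.7

objective = lambda x: 0.5 * np.sum((x - d) ** 2) + lam * np.linalg.norm(x)

x_closed = block_soft_threshold(d, lam)
# Nelder-Mead copes with the nonsmooth term at x = 0 in low dimension.
x_num = minimize(objective, x0=d, method="Nelder-Mead",
                 options={"xatol": 1e-10, "fatol": 1e-12}).x

print(objective(x_closed) <= objective(x_num) + 1e-8)   # closed form is no worse
print(np.allclose(x_closed, x_num, atol=1e-3))          # and the minimizers agree
```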


2 Answers


To establish $x = d - \lambda \frac{x}{\|x\|}$, rearrange the stationarity condition as $d = \left(1 + \frac{\lambda}{\|x\|}\right) x$. Since $1 + \frac{\lambda}{\|x\|} > 0$, the vector $d$ is a positive multiple of $x$, i.e. the two vectors point in the same direction. Consequently $\frac{x}{\|x\|} = \frac{d}{\|d\|}$, and the two unit vectors are interchangeable in step $(*)$.
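
As a small numerical illustration of this collinearity (a sketch under the assumption $\|d\| > \lambda$, so that the minimizer is nonzero):

```python
import numpy as np

rng = np.random.default_rng(1)
d = rng.normal(size=4)
lam = 0.5 * np.linalg.norm(d)            # ensures ||d|| > lambda, i.e. x != 0

x_hat = d - lam * d / np.linalg.norm(d)  # step (*)

# Stationarity: x - d + lam * x / ||x|| vanishes at x_hat.
residual = x_hat - d + lam * x_hat / np.linalg.norm(x_hat)
print(np.linalg.norm(residual))          # ~0 (up to floating point)

# Collinearity: x / ||x|| and d / ||d|| are the same unit vector.
print(np.allclose(x_hat / np.linalg.norm(x_hat), d / np.linalg.norm(d)))  # True
```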


One could see that the Support Function of the Unit Ball of $ {\ell}_{2} $ is given by:

$$ {\sigma}_{C} \left( x \right) = {\left\| x \right\|}_{2}, \; C = {B}_{{\left\| \cdot \right\|}_{2}} \left[0, 1\right] $$
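
As a quick numerical illustration of this Support Function identity (an added sketch, not part of the original argument):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=4)

# The supremum of <g, x> over ||g||_2 <= 1 is attained at g = x / ||x||_2.
best = np.dot(x / np.linalg.norm(x), x)
print(np.isclose(best, np.linalg.norm(x)))   # True: sigma_C(x) = ||x||_2

# Random feasible directions never do better (Cauchy-Schwarz).
g = rng.normal(size=(1000, 4))
g /= np.maximum(np.linalg.norm(g, axis=1, keepdims=True), 1.0)  # push into the unit ball
print(np.all(g @ x <= best + 1e-12))         # True
```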

The Fenchel Conjugate (Dual Function) of $ {\sigma}_{C} \left( x \right) $ is the Indicator Function:

$$ {\sigma}_{C}^{\ast} \left( x \right) = {\delta}_{C} \left( x \right) $$

Now, using Moreau Decomposition (Someone needs to create a Wikipedia page for that) $ x = \operatorname{Prox}_{\lambda f \left( \cdot \right)} \left( x \right) + \lambda \operatorname{Prox}_{ \frac{{f}^{\ast} \left( \cdot \right)}{\lambda} } \left( \frac{x}{\lambda} \right) $ one could see that:

$$ \operatorname{Prox}_{\lambda {\left\| \cdot \right\|}_{2}} \left( x \right) = \operatorname{Prox}_{\lambda {\sigma}_{C} \left( \cdot \right)} \left( x \right) = x - \lambda \operatorname{Prox}_{ \frac{{\delta}_{C} \left( \cdot \right)}{\lambda} } \left( \frac{x}{\lambda} \right) $$
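
As a quick sanity check of the decomposition itself, here is a scalar sketch with $ f = \left| \cdot \right| $, whose Prox is the well known Soft Threshold and whose Conjugate is the Indicator of $ \left[ -1, 1 \right] $ (an added illustration, not part of the derivation):

```python
import numpy as np

def soft_threshold(x, lam):
    """prox of lam * |.| in 1D (standard soft threshold)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

lam = 0.8
x = np.linspace(-3.0, 3.0, 13)

# f = |.|  =>  f* = indicator of [-1, 1]; its prox at any scale is the clip / projection.
rhs = soft_threshold(x, lam) + lam * np.clip(x / lam, -1.0, 1.0)

print(np.allclose(x, rhs))   # True: x = prox_{lam f}(x) + lam * prox_{f*/lam}(x / lam)
```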

It is known that $ \operatorname{Prox}_{ {\delta}_{C} \left( \cdot \right) } \left( x \right) = \operatorname{Proj}_{C} \left( x \right) $, namely the Orthogonal Projection onto the set. Note also that $ \frac{{\delta}_{C} \left( \cdot \right)}{\lambda} = {\delta}_{C} \left( \cdot \right) $ for any $ \lambda > 0 $, so the scaling by $ \frac{1}{\lambda} $ can be dropped inside the Prox.

In the case above, where $ C = {B}_{{\left\| \cdot \right\|}_{2}} \left[0, 1\right] $, the projection is given by:

$$ \operatorname{Proj}_{C} \left( x \right) = \frac{x}{\max \left( \left\| x \right\|, 1 \right)} $$
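
In code, this projection is simply a rescaling whenever the point lies outside the ball (an illustrative sketch; `proj_unit_ball` is a name chosen here):

```python
import numpy as np

def proj_unit_ball(x):
    """Orthogonal projection onto the closed L2 unit ball."""
    return x / max(np.linalg.norm(x), 1.0)

print(proj_unit_ball(np.array([3.0, 4.0])))   # [0.6, 0.8], rescaled onto the sphere
print(proj_unit_ball(np.array([0.3, 0.4])))   # unchanged, already inside the ball
```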

Which yields:

$$ \begin{align} \operatorname{Prox}_{\lambda {\left\| \cdot \right\|}_{2}} \left( x \right) & = \operatorname{Prox}_{\lambda {\sigma}_{C} \left( \cdot \right)} \left( x \right) = x - \lambda \operatorname{Prox}_{ \frac{{\delta}_{C} \left( \cdot \right)}{\lambda} } \left( \frac{x}{\lambda} \right) \\ & = x - \lambda \operatorname{Prox}_{ {\delta}_{C} \left( \cdot \right) } \left( \frac{x}{\lambda} \right) \\ & = x - \lambda \operatorname{Proj}_{C} \left( \frac{x}{\lambda} \right) \\ & = x - \lambda \frac{x / \lambda}{ \max \left( {\left\| \frac{x}{\lambda} \right\|}_{2} , 1 \right) } = x \left( 1 - \frac{\lambda}{\max \left( {\left\| x \right\|}_{2} , \lambda \right)} \right) \end{align} $$
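
Putting the pieces together in code (an added sketch; the helper names are chosen here, and the $ x $ below plays the role of the question's $ d $), the Moreau Decomposition route and the final closed form agree:

```python
import numpy as np

def proj_unit_ball(v):
    """Orthogonal projection onto the closed L2 unit ball."""
    return v / max(np.linalg.norm(v), 1.0)

def prox_l2_via_moreau(x, lam):
    """prox_{lam * ||.||_2}(x) computed as x - lam * Proj_C(x / lam)."""
    return x - lam * proj_unit_ball(x / lam)

def prox_l2_closed_form(x, lam):
    """The final expression: x * (1 - lam / max(||x||_2, lam))."""
    return x * (1.0 - lam / max(np.linalg.norm(x), lam))

rng = np.random.default_rng(2)
x = rng.normal(size=5)
for lam in (0.1, 1.0, 10.0):   # the largest value triggers the prox = 0 branch
    print(np.allclose(prox_l2_via_moreau(x, lam), prox_l2_closed_form(x, lam)))  # True
```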

Remark
Copied from a related answer of mine at Closed Form Solution of $ \arg \min_{x} {\left\| x - y \right\|}_{2}^{2} + \lambda {\left\|x \right\|}_{2} $ - Tikhonov Regularized Least Squares.
