
The problem is given by:

$$ \arg \min_{x} \frac{1}{2} {\left\| x - y \right\|}_{2}^{2} + \lambda {\left\|x \right\|}_{2} $$

where $y$ and $x$ are vectors and $\|\cdot\|_2$ is the Euclidean norm. In the paper Convex Sparse Matrix Factorizations, they say the closed form solution is $x=\max\{y-\lambda \frac{y}{\|y\|_2}, 0\}$. I don't understand why $x$ needs to be non-negative. I think it may come from $\|x\|_2=\sqrt{x^Tx}$, but I cannot derive it. Please help.

The statement appears in line 2 of the last paragraph on page 5 of the paper.

  • If $x$ is a vector, what do you mean by "needs to be non-negative"? Likewise, what would $\max\{u,v\}$ mean if $u$ and $v$ are vectors? Finally, what is your vector space? – sinbadh Mar 03 '16 at 17:16
  • There is certainly no reason why $x$ should be forced to be non-negative. You're either misreading the paper or the paper is wrong. – Michael Grant Mar 03 '16 at 17:22
  • @MichaelGrant I added the link of the paper. Can you take a look at it? – E.J. Mar 03 '16 at 17:51
  • @MichaelGrant The notation here "$x$" denotes the magnitude of the vector $\vec x$, which is certainly non-negative. ;-)) – Mark Viola Mar 03 '16 at 17:54

2 Answers


That's not what the referenced paper says. It gives an expression which is equivalent to the proximal operator of the $\ell_2$ norm:

$$ \DeclareMathOperator*{\argmin}{arg\,min} \argmin_x \frac{1}{2}\|x-y\|^2 + \lambda\|x\| = \max(\|y\|-\lambda,0)\frac{y}{\|y\|} $$ Note that the vector $y$ is not inside the maximum; the $\max$ is taken over the scalars $\|y\|-\lambda$ and $0$.

I'll sketch a proof. We can decompose $x$ as the sum of two components, one parallel to $y$ and one orthogonal to $y$. That is, let $ x = t \frac{y}{ \| y\| } + z $ where $y^T z=0$. Then the objective reduces to:

$$\frac{1}{2}\|x-y\|^2 + \lambda\|x\| = \frac{1}{2}\|z\|^2 + \frac{1}{2}(t-\|y\|)^2 + \lambda \sqrt{t^2 + \|z\|^2}$$ Both terms involving $z$ are nondecreasing in $\|z\|$, so the expression is minimized when $z=0$, and the problem reduces to a 1-dimensional one: $$ \min_t \frac{1}{2}(t-\|y\|)^2 + \lambda |t| $$ It is then a basic exercise in calculus to show that the objective is minimized at $t=\max(\|y\|-\lambda,0)$.
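As a quick numerical sanity check (my addition, not part of the original answer), the sketch below compares the closed form above against a general-purpose solver; it assumes NumPy and SciPy are available, and the particular $y$ and $\lambda$ are arbitrary illustrative choices.

```python
# Check prox_{lambda ||.||_2}(y) = max(||y||_2 - lambda, 0) * y / ||y||_2 numerically.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
y = rng.normal(size=5)   # arbitrary test vector
lam = 0.7                # arbitrary regularization weight

def objective(x):
    # (1/2) ||x - y||_2^2 + lambda ||x||_2
    return 0.5 * np.sum((x - y) ** 2) + lam * np.linalg.norm(x)

# Closed form: the max acts on the scalar ||y||_2 - lambda, not on the vector y.
x_closed = max(np.linalg.norm(y) - lam, 0.0) * y / np.linalg.norm(y)

# Generic smooth solver started away from the nonsmooth point x = 0, for comparison.
x_solver = minimize(objective, x0=y.copy()).x

print(np.allclose(x_closed, x_solver, atol=1e-4))  # expected: True
```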

  • Why can't we just take the derivative, set it to 0, and get the result? – E.J. Mar 07 '16 at 17:16
  • You more or less can do that, but you need to mind the fact that the derivative doesn't exist at $x=0$. Also, how else do you solve $x-y+\lambda x/\|x\| = 0$ for $x$ other than by proving $x/\|x\|=y/\|y\|$? – p.s. Mar 08 '16 at 02:13
  • I see. So only your approach yields the closed form solution. Thanks so much :) – E.J. Mar 08 '16 at 02:31
  • Really nice trick there! – Royi Jan 11 '17 at 22:56
  • @p.s., any thoughts on the solution of https://math.stackexchange.com/questions/2263447. I think something is wrong there. See my remark to the answer. – Royi Jul 31 '19 at 04:12

One could see that the Support Function of the Unit Ball of $ {\ell}_{2} $ is given by:

$$ {\sigma}_{C} \left( x \right) = {\left\| x \right\|}_{2}, \; C = {B}_{{\left\| \cdot \right\|}_{2}} \left[0, 1\right] $$

The Fenchel Dual Function (conjugate) of $ {\sigma}_{C} \left( x \right) $ is given by the Indicator Function:

$$ {\sigma}_{C}^{\ast} \left( x \right) = {\delta}_{C} \left( x \right) $$

Now, using Moreau Decomposition (Someone needs to create a Wikipedia page for that) $ x = \operatorname{Prox}_{\lambda f \left( \cdot \right)} \left( x \right) + \lambda \operatorname{Prox}_{ \frac{{f}^{\ast} \left( \cdot \right)}{\lambda} } \left( \frac{x}{\lambda} \right) $ one could see that:

$$ \operatorname{Prox}_{\lambda {\left\| \cdot \right\|}_{2}} \left( x \right) = \operatorname{Prox}_{\lambda {\sigma}_{C} \left( \cdot \right)} \left( x \right) = x - \lambda \operatorname{Prox}_{ \frac{{\delta}_{C} \left( \cdot \right)}{\lambda} } \left( \frac{x}{\lambda} \right) $$

It is known that $ \operatorname{Prox}_{ {\delta}_{C} \left( \cdot \right) } \left( x \right) = \operatorname{Proj}_{C} \left( x \right) $, namely the Orthogonal Projection onto the set.

In the case above, where $ C = {B}_{{\left\| \cdot \right\|}_{2}} \left[0, 1\right] $, it is given by:

$$ \operatorname{Proj}_{C} \left( x \right) = \frac{x}{\max \left( \left\| x \right\|, 1 \right)} $$

Which yields:

$$ \begin{align} \operatorname{Prox}_{\lambda {\left\| \cdot \right\|}_{2}} \left( x \right) & = \operatorname{Prox}_{\lambda {\sigma}_{C} \left( \cdot \right)} \left( x \right) = x - \lambda \operatorname{Prox}_{ \frac{{\delta}_{C} \left( \cdot \right)}{\lambda} } \left( \frac{x}{\lambda} \right) \\ & = x - \lambda \operatorname{Prox}_{ {\delta}_{C} \left( \cdot \right) } \left( \frac{x}{\lambda} \right) \\ & = x - \lambda \operatorname{Proj}_{C} \left( \frac{x}{\lambda} \right) \\ & = x - \lambda \frac{x / \lambda}{ \max \left( {\left\| \frac{x}{\lambda} \right\|}_{2} , 1 \right) } = x \left( 1 - \frac{\lambda}{\max \left( {\left\| x \right\|}_{2} , \lambda \right)} \right) \end{align} $$
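As a small check (my addition, assuming NumPy), the following sketch verifies numerically that the expression above, $ x \left( 1 - \frac{\lambda}{\max \left( {\left\| x \right\|}_{2}, \lambda \right)} \right) $, coincides with the other answer's form $ \max \left( {\left\| x \right\|}_{2} - \lambda, 0 \right) \frac{x}{{\left\| x \right\|}_{2}} $ on random inputs:

```python
# Verify that the Moreau-decomposition form and the direct form of prox_{lambda ||.||_2} agree.
import numpy as np

rng = np.random.default_rng(1)
lam = 0.5  # arbitrary positive threshold for illustration

for _ in range(1000):
    # random point, scaled so that it sometimes lands inside the threshold ||x||_2 <= lambda
    x = rng.normal(size=4) * rng.uniform(0.0, 2.0)
    nx = np.linalg.norm(x)
    via_moreau = x * (1.0 - lam / max(nx, lam))                       # this answer
    via_direct = max(nx - lam, 0.0) * x / nx if nx > 0 else 0.0 * x   # the other answer
    assert np.allclose(via_moreau, via_direct)

print("both expressions agree")
```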
