
I have the following optimization problem:

$$\begin{array}{ll} \text{minimize}_{X,Y} & \lVert X\rVert_* + \lambda \lVert Y\rVert_1 \\ \text{subject to} & X + Y = C \end{array}$$

Here $C \in \mathbb{R}^{m \times n}$, $\lVert Y\rVert_1$ denotes the sum of the absolute values of the entries of $Y$, $\lambda > 0$, and $\lVert X\rVert_*$ denotes the nuclear norm of $X$ (the sum of its singular values).

I am looking for its dual problem. I know the usual procedure for deriving duals, but I cannot make progress with this particular problem.

Any hints or suggestions, please.

mkuse

1 Answer


It's actually easier if we work with function notation for a while and not norms. So let's define $f_1(X)=\|X\|_*$ and $f_2(Y)=\|Y\|_1$ and write the problem as
$$\begin{array}{ll} \text{minimize}_{X,Y} & f_1(X) + \lambda f_2(Y) \\ \text{subject to} & X + Y = C \end{array}$$

The Lagrangian is
$$\begin{aligned} L(X,Y,Z) &= f_1(X) + \lambda f_2(Y) - \langle Z, X + Y - C \rangle \\ &= f_1(X) - \langle Z, X \rangle + \lambda f_2(Y) - \langle Z, Y \rangle + \langle Z, C \rangle \end{aligned}$$

The dual function is
$$\begin{aligned} g(Z) = \inf_{X,Y} L(X,Y,Z) &= \inf_X \left( f_1(X) - \langle Z, X \rangle \right) + \inf_Y \left( \lambda f_2(Y) - \langle Z, Y \rangle \right) + \langle Z, C \rangle \\ &= \langle Z, C \rangle - \sup_X \left( \langle Z, X \rangle - f_1(X) \right) - \sup_Y \left( \langle Z, Y \rangle - \lambda f_2(Y) \right) \\ &= \langle Z, C \rangle - f_1^*(Z) - \lambda f_2^*(\lambda^{-1} Z) \end{aligned}$$
where $f_1^*$ and $f_2^*$ are the convex conjugates of $f_1$ and $f_2$, respectively. I hope you'll take my word that they are
$$f_1^*(Z) = \begin{cases} 0 & \|Z\| \leq 1 \\ +\infty & \text{otherwise} \end{cases} \qquad f_2^*(Z) = \begin{cases} 0 & \|Z\|_\infty \leq 1 \\ +\infty & \text{otherwise} \end{cases}$$
because this answer is already long enough :-) Note the involvement of the dual norms here: $\|Z\|$ is the maximum singular value of $Z$ (the spectral norm), and $\|Z\|_\infty$ is the maximum of the absolute values of the entries of $Z$. Refer to pages 93 and 221-222 of Boyd and Vandenberghe for details (courtesy: comment by mkuse).
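If you'd like a numerical sanity check of why those conjugates vanish exactly on the dual-norm balls, here is a minimal sketch (assuming NumPy; the dimensions and tolerance are arbitrary choices of mine) that tests the underlying dual-norm inequalities $\langle Z,X\rangle \le \|Z\|\,\|X\|_*$ and $\langle Z,Y\rangle \le \|Z\|_\infty\,\|Y\|_1$ on random matrices:

```python
# Sanity check (not a proof): the conjugates above are indicator functions
# because the dual-norm inequalities below hold for all matrices, so the
# supremum defining each conjugate is 0 on the dual-norm ball and +inf outside.
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 4
for _ in range(1000):
    Z = rng.standard_normal((m, n))
    X = rng.standard_normal((m, n))
    inner = np.sum(Z * X)                    # <Z, X> = trace(Z^T X)
    spec = np.linalg.norm(Z, ord=2)          # largest singular value of Z
    nuc = np.linalg.norm(X, ord='nuc')       # sum of singular values of X
    assert inner <= spec * nuc + 1e-9        # <Z, X> <= ||Z|| ||X||_*
    max_abs = np.max(np.abs(Z))              # elementwise infinity norm
    ell1 = np.sum(np.abs(X))                 # elementwise l1 norm
    assert inner <= max_abs * ell1 + 1e-9    # <Z, X> <= ||Z||_inf ||X||_1
print("dual-norm inequalities hold on all samples")
```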

Putting this together, we have the dual problem
$$\begin{array}{ll} \text{maximize}_Z & \langle C, Z \rangle - f_1^*(Z) - \lambda f_2^*(\lambda^{-1} Z) \end{array}$$
This is technically the correct dual, but since $f_1^*$ and $f_2^*$ are indicator functions, we would typically convert them to constraints:
$$\begin{array}{ll} \text{maximize}_Z & \langle C, Z \rangle \\ \text{subject to} & \|Z\|\leq 1 \\ & \| Z \|_\infty \leq \lambda \end{array}$$
And that's the dual you're most likely going to want to work with.
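For what it's worth, here is a small end-to-end check (a sketch assuming CVXPY is installed; the data and $\lambda$ are arbitrary) that solves the primal and this cleaned-up dual on random data and confirms the optimal values coincide, as strong duality predicts:

```python
# Solve primal and dual on random data; the optimal values should agree
# up to solver tolerance, which numerically confirms the derivation above.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(1)
m, n, lam = 6, 5, 0.5
C = rng.standard_normal((m, n))

# Primal: minimize ||X||_* + lam * ||Y||_1  subject to  X + Y = C
X, Y = cp.Variable((m, n)), cp.Variable((m, n))
primal = cp.Problem(cp.Minimize(cp.normNuc(X) + lam * cp.sum(cp.abs(Y))),
                    [X + Y == C])
primal.solve()

# Dual: maximize <C, Z>  subject to  sigma_max(Z) <= 1, max|Z_ij| <= lam
Z = cp.Variable((m, n))
dual = cp.Problem(cp.Maximize(cp.sum(cp.multiply(C, Z))),
                  [cp.sigma_max(Z) <= 1, cp.max(cp.abs(Z)) <= lam])
dual.solve()

print(primal.value, dual.value)   # should match up to solver tolerance
```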

If you take the dual of this problem, you'll eventually recover a problem that is equivalent to the primal, but it won't be exactly the same. You'll need a bit of transformation to get back to the exact original form. After all, we just did a bit of transformation ourselves to clean up the dual.
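One standard way to see the round trip, sketched: the dual objective is the support function of the intersection of the two norm balls $B_1=\{Z:\|Z\|\le 1\}$ and $B_2=\{Z:\|Z\|_\infty\le\lambda\}$, whose individual support functions are $\sigma_{B_1}(X)=\|X\|_*$ and $\sigma_{B_2}(Y)=\lambda\|Y\|_1$. Under a mild constraint qualification (satisfied here, since both balls contain the origin in their interiors), the support function of an intersection is the infimal convolution of the individual support functions: $$\sup_{Z\in B_1\cap B_2} \langle C,Z\rangle = \min_{X+Y=C} \ \sigma_{B_1}(X)+\sigma_{B_2}(Y) = \min_{X+Y=C} \ \|X\|_* + \lambda\|Y\|_1,$$ which is exactly the original primal.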

Michael Grant
  • When you write the Lagrangian at the very beginning, why do you subtract the term involving the dual variable? Is it just a manipulation step, something like setting $Z$ to $-Z$? – mkuse Sep 29 '14 at 04:05
  • Just to complete the answer, it is a known fact that the convex conjugate of any norm is $0$ if the dual norm satisfies $\|Z\| \le 1$, and $+\infty$ otherwise. Refer to Boyd's Convex Optimization book, pages 221-222, for the explanation. – mkuse Sep 29 '14 at 04:22
  • The proof is in solved Example 3.26 on page 93 of the book. – mkuse Sep 29 '14 at 04:45
  • Technically, for an equality constraint, it doesn't matter---though it will change the dual objective to $-\langle C,Z\rangle$. I prefer to select the sign of an equality constraint term to avoid unnecessary negatives like that. But for inequality constraints, the sign does matter. You need to be subtracting a nonnegative term. So for example, if we had $X+Y\preceq C$, the term must be $-\langle Z,C-X-Y\rangle$. – Michael Grant Sep 29 '14 at 21:26