Equivalence of computing trace norm of matrix

Question

Let $X\in \mathbb{R}^{m\times n}$. How can I show that the trace norm $\|X\|_\text{tr}$, defined as the sum of the singular values of $X$, equals the optimal value of the following optimization problem?

$$\begin{array}{ll} \mathop{\text{maximize}}\limits_{Y\in\mathbb{R}^{m\times n}} & \text{tr}(X^TY)\\ \text{subject to} & Y^TY \preceq I_n\end{array}$$

See section 4.2.3.2 of MOSEK Modeling Cookbook https://docs.mosek.com/modeling-cookbook/sdo.html . — Mark L. Stone, Apr 01 '18 at 19:42
Thanks! This is what I want. But under equation (19), is $\sup_{\|Z\|_2\leq 1} \mathrm{tr}(X^T Z) = \sup_{\|X\|_2\leq 1} \mathrm{tr}(\Sigma^T U^T ZV)$ a typo? I think it should be $\sup_{\|Z\|_2\leq 1} \mathrm{tr}(\Sigma^T U^T ZV)$. — david, Apr 02 '18 at 01:37
I believe you are correct. The next line then follows because $U^T$ and $V$ both have 2-norm = 1, so "unitary invariance" applies. — Mark L. Stone, Apr 02 '18 at 01:58
Michael Grant wrote about it here. Note that $\|Q\|_2 \leq 1$ is equivalent to $Q^\top Q \preceq I$. — Rodrigo de Azevedo, Apr 02 '18 at 09:40

score 1 · Accepted Answer · answered Apr 02 '18 at 02:55

I am not sure whether this is OK on Stack Exchange, but I am answering my own question to spell out the derivation of the trace norm equivalence. It mostly follows section 4.2.3.2 of the MOSEK Modeling Cookbook.

We first prove that $\|A\|_{2} \le 1$ is equivalent to $A^{T} A\preceq I$. \begin{align*} & \|A\|_{2} \le 1\\ \Leftrightarrow \quad & \max_{\|x\|_{2} =1} \|Ax\|_{2} \le 1\\ \Leftrightarrow \quad & \max_{\|x\|_{2} =1} x^{T} A^{T} Ax\le 1\\ \Leftrightarrow \quad & \max_{\|x\|_{2} =1} x^{T}\left( A^{T} A-I\right) x\le 0\\ \Leftrightarrow \quad & A^{T} A\preceq I \end{align*} The third line squares the norm, and the fourth uses $x^{T}x=1$.

Then our problem $$\begin{array}{ll} \mathop{\text{maximize}}\limits_{Y\in\mathbb{R}^{m\times n}} & \text{tr}(X^TY)\\ \text{subject to} & Y^TY \preceq I_n\end{array}$$ becomes $$\sup _{\|Y\|_{2} \leq 1}\mathrm{tr} (X^{T} Y)$$

Next, rename the optimization variable to $Z$ and take the SVD $X=U\Sigma V^T$, with $U\in\mathbb{R}^{m\times m}$ and $V\in\mathbb{R}^{n\times n}$ orthogonal and $\Sigma\in\mathbb{R}^{m\times n}$: \begin{equation*} \begin{aligned} \sup _{\|Z\|_{2} \leq 1}\mathrm{tr} (X^{T} Z) & =\sup _{\|Z\| _{2} \leq 1}\mathrm{tr} (V\Sigma ^{T} U^{T} Z)\\ & =\sup _{\|Z\| _{2} \leq 1}\mathrm{tr} (\Sigma ^{T} U^{T} ZV)\\ & =\sup _{\| U^{T} ZV\| _{2} \leq 1}\mathrm{tr} (\Sigma ^{T} U^{T} ZV)\\ & =\sup _{\| Y\| _{2} \leq 1}\mathrm{tr} (\Sigma ^{T} Y) \end{aligned} \end{equation*}

The identity $\mathrm{tr} (V\Sigma ^{T} U^{T} Z) = \mathrm{tr} (\Sigma ^{T} U^{T} ZV)$ follows from the invariance of the trace under cyclic permutations. The equivalence of the constraints, $\|Z\|_2 \le 1 \Leftrightarrow \|U^T Z V \|_2 \le 1$, follows from the unitary invariance of $\|\cdot\|_2$, since $U$ and $V$ are orthogonal. The last line substitutes $Y=U^{T}ZV$, which ranges over the whole unit ball as $Z$ does.

Finally, with $p=\min(m,n)$, we have $\mathrm{tr}(\Sigma^T Y)=\sum_{i=1}^p \sigma_i Y_{ii}$, which depends only on the diagonal of $Y$. Since $|Y_{ii}| = |e_i^T Y e_i| \le \|Y\|_2 \le 1$, the objective is at most $\sum_{i=1}^p \sigma_i$, and this bound is attained by the feasible matrix with $Y_{ii}=1$ and zeros elsewhere. So we can restrict attention to $Y=\mathbf{diag}(y_1, \dots, y_p)$ with $|y_i|\le 1$: $$\sup_{\|Z\|_2\leq 1} \mathrm{tr}(X^T Z) = \sup_{|y_i| \leq 1} \sum_{i=1}^p \sigma_i y_i = \sum_{i=1}^p \sigma_i$$
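As a numerical sanity check of this result (not part of the proof), here is a minimal NumPy sketch. It assumes the candidate maximizer $Y^\star = UV^T$ built from the thin SVD of $X$, which corresponds to the choice $y_i = 1$ in the derivation. The helper names `trace_norm_via_maximizer` and `random_feasible` are made up for illustration.

```python
import numpy as np

def trace_norm_via_maximizer(X):
    """Return (sum of singular values, tr(X^T Y*), Y*), where Y* = U V^T
    is built from the thin SVD X = U diag(s) V^T."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Y_star = U @ Vt  # all singular values of Y* equal 1, so ||Y*||_2 = 1
    return s.sum(), np.trace(X.T @ Y_star), Y_star

def random_feasible(m, n, rng):
    """Draw a random m x n matrix and scale it into the spectral-norm unit ball."""
    Y = rng.standard_normal((m, n))
    return Y / max(1.0, np.linalg.norm(Y, 2))

rng = np.random.default_rng(0)
m, n = 5, 3
X = rng.standard_normal((m, n))

nuc, attained, Y_star = trace_norm_via_maximizer(X)

# Largest objective value found among random feasible points.
best_random = max(np.trace(X.T @ random_feasible(m, n, rng)) for _ in range(1000))

print("sum of singular values :", nuc)
print("tr(X^T Y*)             :", attained)
print("spectral norm of Y*    :", np.linalg.norm(Y_star, 2))
print("best random feasible Y :", best_random)
```

If the derivation is right, the first two printed values should agree up to rounding, and no random feasible $Y$ should exceed them.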
