
$\mathbf{X}$ is an $n\times m$ matrix and $$ f(\mathbf{X}) = \min_{\mathbf{X} = \mathbf{U}\mathbf{V}^T} \max_i \|\mathbf{U}_i\| \max_j \|\mathbf{V}_j\|, $$ where the maxima are taken over the rows of $\mathbf{U}$ and $\mathbf{V}$, and $\|\cdot\|$ is the $\ell_2$ norm.

If $\mathbf{P\Sigma Q'}$ is the SVD of $\mathbf{X}$, then we can take $\mathbf{U} = \mathbf{P\Sigma}^{1/2}$ and $\mathbf{V} = \mathbf{Q\Sigma}^{1/2}$, so that $\mathbf{X} = \mathbf{U}\mathbf{V}^T$.
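As a quick sanity check that this SVD-based factorization really satisfies $\mathbf{X} = \mathbf{U}\mathbf{V}^T$, here is a small NumPy sketch (matrix sizes and the random seed are arbitrary choices, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 3
X = rng.standard_normal((n, m))

# Thin SVD: X = P @ diag(s) @ Q.T
P, s, Qt = np.linalg.svd(X, full_matrices=False)
Q = Qt.T

# U = P Sigma^{1/2}, V = Q Sigma^{1/2}, so that X = U V^T
U = P * np.sqrt(s)   # scales column k of P by sqrt(s[k])
V = Q * np.sqrt(s)

print(np.allclose(U @ V.T, X))  # True
```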

How can I show that $\|\mathbf{X}\|_{tr} \leq \sqrt{nm}\, f(\mathbf{X})$, where $\|\mathbf{X}\|_{tr}$ is the trace norm of $\mathbf{X}$ (the sum of its singular values)?

What I have done so far:

The trace norm can be written as $\|\mathbf{X}\|_{tr} = \text{Tr}(\mathbf{\Sigma C})$, where $\mathbf{C}$ is the $m \times n$ matrix of ones. Using the Cauchy–Schwarz inequality,

$$ \text{Tr}^2(\mathbf{\Sigma C}) \leq \text{Tr}(\mathbf{\Sigma'\Sigma})\,\text{Tr}(\mathbf{C'C}) = nm\,\text{Tr}(\mathbf{\Sigma'\Sigma}). $$

Now I am stuck at the term $\text{Tr}(\mathbf{\Sigma'\Sigma})$, which is the sum of the eigenvalues of $\mathbf{X'X}$. How can I show that $\text{Tr}(\mathbf{\Sigma'\Sigma}) \leq f(\mathbf{X})^2$?
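The Cauchy–Schwarz step above can be checked numerically. A minimal NumPy sketch (sizes and seed are arbitrary): since $\mathbf{\Sigma}$ is diagonal, $\text{Tr}(\mathbf{\Sigma C})$ is just the sum of the singular values and $\text{Tr}(\mathbf{\Sigma'\Sigma})$ is the sum of their squares.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 4
X = rng.standard_normal((n, m))
s = np.linalg.svd(X, compute_uv=False)  # singular values of X

lhs = s.sum() ** 2            # Tr(Sigma C)^2 = (trace norm)^2
rhs = n * m * (s ** 2).sum()  # nm * Tr(Sigma' Sigma)
print(lhs <= rhs)  # True
```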

mathworker21
  • 34,399
Shew
  • 1,532
  • Your formulation of the problem and notations are so horrible that nobody will understand what you are asking. – Hans Apr 16 '18 at 01:03
  • @Hans I edited the question. Does it make sense now? – Shew Apr 16 '18 at 06:32
  • No. $f(X)$ does not make sense, because $UV′$ is but one of infinitely many decompositions of X, so your definition of $f(X)$ is not a function of $X$. You'd better not write it this way. I have answered your question below. – Hans Apr 17 '18 at 01:27
  • @Hans Does the notation make sense now? – Shew Apr 17 '18 at 06:14
  • Now it does, but only in the question formulation proper. Do you now know why your linked question is put on hold? Your current description is still erroneous. e.g. $f(\mathbf{X}) = \langle\sigma, \mathbf{P_i}\rangle \langle\sigma,\mathbf{Q_j}\rangle$ does not make sense. How can it depend on $(i, j)$? Your thoughts appear chaotic and your presentation drives people mad. :-D May I ask whether you have had any formal training in mathematics? – Hans Apr 17 '18 at 07:20
  • I used $P_i$ to denote the row vector corresponding to the max. I took some math courses during my bachelor's, that's all. – Shew Apr 17 '18 at 07:22
  • Then your statement $\mathbf{P_i,Q_j}$ are the row vectors of $\mathbf{P}$ and $\mathbf{Q}$ that maximize $f(\mathbf{X})$ is wrong, because there is nothing to maximize or minimize as $f(X)$ is already the result of extremization. Also where do you get the assertion "It turns out that $f(\mathbf{X}) = \langle\sigma, \mathbf{P_i}\rangle \langle\sigma,\mathbf{Q_j}\rangle$"? With regards to your answer to your math training, it shows. It helps to think through every phrase, every sentence you type to see if it makes sense. Yours really drive people nuts, you know. Not trying to be mean. :-P – Hans Apr 17 '18 at 08:09
  • @Hans Yes, it was a wrong assertion. I removed it. Yes, my math basics are not very strong; I am trying to improve them by taking some courses. BTW, can you tell me what is wrong in the linked question? – Shew Apr 17 '18 at 08:52

1 Answer


Applying the Cauchy–Schwarz inequality, we have for any real matrices $C$ and $D$ \begin{align} \text{tr}(CD)&=\sum_{ij}C_{ij}D_{ji} \\ &\le\Big(\sum_{ij}C_{ij}^2\Big)^\frac12\Big(\sum_{ij}D_{ij}^2\Big)^\frac12 \\ &=\big(\text{tr}(C^TC)\big)^\frac12\big(\text{tr}(D^TD)\big)^\frac12=\|C\|\|D\|, \end{align} where $\|\cdot\|$ denotes the Frobenius norm.
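This trace inequality is easy to verify numerically; a minimal NumPy sketch (shapes and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
C = rng.standard_normal((4, 6))
D = rng.standard_normal((6, 4))

lhs = np.trace(C @ D)
rhs = np.linalg.norm(C) * np.linalg.norm(D)  # np.linalg.norm on a matrix defaults to Frobenius
print(lhs <= rhs)  # True
```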

Now perform the singular value decomposition $UV^T=ASB^T$, where $S$ is the diagonal matrix of the singular values of $UV^T$ and $A$ and $B$ are the associated orthogonal matrices. Applying the above proposition, we have \begin{align} \|UV^T\|_\text{tr} &= \text{tr}(S) = \text{tr}(A^TUV^TB) \\ &\le \|A^TU\|\|V^TB\|=\|U\|\|V\| \\ &\le \sqrt{n\,\max_i(U_iU_i^T)}\sqrt{m\,\max_j(V_jV_j^T)}=\sqrt{nm}\max_i\|U_i\|\max_j\|V_j\|. \end{align}
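Since this chain of inequalities holds for every factorization $X=UV^T$, it can be sanity-checked with an arbitrary low-rank factorization; a NumPy sketch (sizes, rank, and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, r = 5, 4, 3
U = rng.standard_normal((n, r))
V = rng.standard_normal((m, r))
X = U @ V.T

trace_norm = np.linalg.svd(X, compute_uv=False).sum()

frob = np.linalg.norm(U) * np.linalg.norm(V)          # ||U||_F ||V||_F
row_max_U = np.linalg.norm(U, axis=1).max()           # max_i ||U_i||
row_max_V = np.linalg.norm(V, axis=1).max()           # max_j ||V_j||

print(trace_norm <= frob)                             # first inequality: True
print(frob <= np.sqrt(n * m) * row_max_U * row_max_V) # second inequality: True
```

Taking the minimum of the right-hand side over all factorizations then gives $\|X\|_\text{tr} \le \sqrt{nm}\,f(X)$.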

Hans
  • 9,804