  1. Is $\mbox{Tr}\left(X X^T X X^T\right)$ a convex function of arbitrary real matrix $X$?

  2. More generally, is $\mbox{Tr}\left(\left(X X^{\dagger}\right)^m\right)$ a convex function of arbitrary complex matrix $X$ for any integer $m \ge 1$?

Any advice or suggestions would be greatly appreciated.


The proof hint:

Let us apply the SVD to the matrix $X$: $X = U D V^{\dagger}$. Every matrix has an SVD with non-negative singular values on the main diagonal of $D$. Next:

$\left(X X^{\dagger}\right)^m = U D V^{\dagger} V D U^{\dagger} \, U D V^{\dagger} V D U^{\dagger} \cdots U D V^{\dagger} V D U^{\dagger} = U D^{2 m} U^{\dagger}.$

Note, all unitary matrices are cancelled in between, because $U^{\dagger}U = V^{\dagger}V = I$.

$\mbox{Tr} \left(\left(X X^{\dagger}\right)^m\right) = \mbox{Tr} \left(U D^{2 m} U^{\dagger}\right) = \mbox{Tr} \left(D^{2 m} U^{\dagger} U\right) = \mbox{Tr} \left(D^{2 m}\right) = \sum_i \sigma_i^{2 m}$, where we used the cyclic property of the trace, and $\{\sigma_i\}$ are the singular values of the matrix $X$.
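As a quick sanity check, the identity $\mbox{Tr}\left(\left(X X^{\dagger}\right)^m\right) = \sum_i \sigma_i^{2m}$ can be verified numerically on a random complex matrix; a minimal NumPy sketch:

```python
import numpy as np

# Sanity check: Tr((X X^†)^m) equals the sum of sigma_i^(2m)
# for a random complex matrix X.
rng = np.random.default_rng(0)
n, m = 4, 3
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

lhs = np.trace(np.linalg.matrix_power(X @ X.conj().T, m)).real
sigma = np.linalg.svd(X, compute_uv=False)   # singular values of X
rhs = float(np.sum(sigma ** (2 * m)))

assert np.isclose(lhs, rhs)
```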

Let $\{x_i\}$, $\{y_i\}$ and $\{z_i\}$ be the singular values of arbitrary matrices $X$, $Y$ and their convex combination $Z = \alpha X + (1 - \alpha) Y$, respectively. It was shown below by @PSL that for the Frobenius norm ($m = 1$) the following holds true:

$\alpha \sum_i x_i^2 + (1 - \alpha) \sum_i y_i^2 \ge \sum_i z_i^2$.

Considering that the function $\phi: x \mapsto x^m$, $x \in \mathbb{R}^+$, is convex, would it be possible to show that the inequality also holds for $m > 1$:

$\alpha \sum_i x_i^{2 m} + (1 - \alpha) \sum_i y_i^{2 m} \ge \sum_i z_i^{2 m}$ ?

Update: by numerical simulation I found that $\sum_i x_i^{2} \ge \sum_i z_i^{2}$ does not necessarily entail $\sum_i x_i^{4} \ge \sum_i z_i^{4}$; the implication failed on roughly 9% of random configurations. It seems this line of thought does not work. However, the extended brute-force simulation below still succeeds for $m$ from 2 to 5.
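The failure of this implication can also be exhibited deterministically. The following pair of diagonal matrices is a hand-picked illustration (my own construction, not taken from the simulation):

```python
import numpy as np

# Hand-picked illustrative pair: X, Y diagonal, Z = 0.5*X + 0.5*Y.
X = np.diag([1.0, 1.0])
Y = np.diag([1.6, -1.0])
Z = 0.5 * X + 0.5 * Y                     # = diag(1.3, 0)

x = np.linalg.svd(X, compute_uv=False)    # singular values (1, 1)
z = np.linalg.svd(Z, compute_uv=False)    # singular values (1.3, 0)

# sum x_i^2 >= sum z_i^2 holds ...
assert np.sum(x**2) >= np.sum(z**2)       # 2.0 >= 1.69
# ... but sum x_i^4 >= sum z_i^4 fails:
assert np.sum(x**4) < np.sum(z**4)        # 2.0 < 2.8561
```

Note that the convexity inequality itself, $\alpha \sum_i x_i^4 + (1-\alpha) \sum_i y_i^4 \ge \sum_i z_i^4$, still holds for this pair ($0.5 \cdot 2 + 0.5 \cdot 7.5536 \approx 4.78 \ge 2.86$), so the example only rules out this particular line of attack, not the conjecture.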


Brute force approach to answer the questions. Here I literally check convexity on random matrices. The Python code speaks for itself:

```python
import numpy as np

tol = 10.0 * np.finfo(float).eps
count_ok, count_fail = 0, 0

for m in range(2, 5 + 1):
    print("m:", m)
    for dim in range(2, 10 + 1):
        print(f"matrix size: {dim}x{dim}")
        for test in range(100000):
            X = 2 * np.random.rand(dim, dim) - 1
            Y = 2 * np.random.rand(dim, dim) - 1
            XXt = X @ X.T
            YYt = Y @ Y.T
            for t in np.linspace(0.01, 0.99, 20):
                Z = X * t + Y * (1 - t)
                ZZt = Z @ Z.T
                ok = (np.trace(np.linalg.matrix_power(ZZt, m))
                      <= np.trace(np.linalg.matrix_power(XXt, m)) * t
                         + np.trace(np.linalg.matrix_power(YYt, m)) * (1 - t)
                         + tol)
                if ok:
                    count_ok += 1
                else:
                    count_fail += 1

print(f"succeeded: {count_ok:,} times")
print(f"failed: {count_fail:,} times")
```

Output (tail):

```
succeeded: 72,000,000 times
failed: 0 times
```

  • What did you try to solve the problem? – Arctic Char Nov 14 '21 at 21:33
  • Sorry, I did not get the question. – Albert65 Nov 14 '21 at 21:40
  • The question of @Arctic Char is crystal clear: have you done some previous Web searching, have you made some computational attempts, for example computations entrywise in dimension 2 or 3? etc. – Jean Marie Nov 14 '21 at 22:02
  • What I tried beforehand was quite naive. For now, I realised which direction to move on, thanks to @Bananach. – Albert65 Nov 14 '21 at 22:35
  • Brute force simulation with random matrices of size 2x2 to 10x10 favours the "yes" answer to the first question (1,800,000 trials have been made in total). – Albert65 Nov 14 '21 at 23:57
  • @Albert65 Could you say more about this simulation in an answer that I will surely upvote? This kind of approach is alas still too rare among "pure mathematicians" and it would help the OP. – Jean Marie Nov 15 '21 at 08:46
  • Connected interesting properties here – Jean Marie Nov 15 '21 at 08:51
  • @JeanMarie my brute force "solution" is by no means a proof, rather a hint that the answer might be "yes". – Albert65 Nov 15 '21 at 16:11
  • The SVD idea is nice, but I find it a bit problematic to use the singular values as the variables, when you're interested in showing the convexity in the variable $X$. I might be missing something obvious, but would still be a bit careful there. – PSL Nov 15 '21 at 18:16

3 Answers


Your problem reduces to showing that $(XX^T)^m$ is convex in the matrix sense. Recall that a matrix-valued function $f$ is convex if $\alpha f(X) + (1-\alpha) f(Y) \succeq f(\alpha X + (1-\alpha)Y)$ for all $X$, $Y$ and all $\alpha \in [0,1]$.

This is because $A \succeq B$ implies $\textrm{Tr}(A) \geq \textrm{Tr}(B)$.

For $m = 1$ this is not too difficult:

Claim 1. $AA^T\succeq 0$.

Proof. $v^TAA^Tv = \|Av\|^2_2 \geq 0$ for every $v$.

Claim 2. $f(X) = XX^T$ is convex in the matrix sense.

Proof. Compute $$ \alpha f(X) + (1-\alpha) f(Y) - f(\alpha X + (1-\alpha)Y) = \alpha(1-\alpha)XX^T + \alpha(1-\alpha)YY^T - \alpha(1-\alpha)(XY^T + YX^T). $$ We want to show that the above is $\succeq 0$, which happens if and only if $XX^T + YY^T - (XY^T + YX^T) \succeq 0$, as $\alpha \in (0,1)$. But this is just $(X - Y)(X - Y)^T$, which is positive semidefinite by Claim 1. Hence, $\alpha f(X) + (1-\alpha) f(Y) \succeq f(\alpha X + (1-\alpha)Y)$.
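The algebra in Claim 2 can be spot-checked numerically; a minimal sketch with NumPy, using random real matrices and an arbitrary $\alpha$:

```python
import numpy as np

# Numerical spot-check of Claim 2: the convexity gap equals
# alpha*(1-alpha)*(X-Y)(X-Y)^T, which is positive semidefinite.
rng = np.random.default_rng(1)
n, a = 5, 0.3
X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))

Zc = a * X + (1 - a) * Y
gap = a * (X @ X.T) + (1 - a) * (Y @ Y.T) - Zc @ Zc.T
expected = a * (1 - a) * (X - Y) @ (X - Y).T

assert np.allclose(gap, expected)
assert np.linalg.eigvalsh(gap).min() >= -1e-10   # PSD up to roundoff
```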

PSL

By https://en.m.wikipedia.org/wiki/Trace_inequality, the function $A\mapsto \operatorname{tr} f(A)$ is convex if $f\colon \mathbb{R}\to\mathbb{R}$ is.
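A quick numerical illustration of this fact on random symmetric matrices, taking the convex function $f(x) = e^x$ (an illustrative choice; any convex $f$ would do):

```python
import numpy as np

# Spot-check (illustrative): A -> Tr f(A) is convex on symmetric
# matrices when f is convex; here f = exp.
def tr_f(A, f):
    w = np.linalg.eigvalsh(A)    # eigenvalues of the symmetric matrix A
    return float(np.sum(f(w)))

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n)); A = A + A.T
B = rng.standard_normal((n, n)); B = B + B.T

for t in np.linspace(0.1, 0.9, 9):
    C = t * A + (1 - t) * B
    assert tr_f(C, np.exp) <= t * tr_f(A, np.exp) + (1 - t) * tr_f(B, np.exp) + 1e-9
```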

Bananach

Yes, $X\mapsto\operatorname{tr}((XX^\ast)^m)$ is convex for any integer $m\ge1$.

The function is a composition of the inner function $h:M_n(\mathbb C)\ni X\mapsto A=XX^\ast\in\mathbb S_+$ (where $\mathbb S_+$ denotes the set of all positive semidefinite matrices) and the outer function $g:\mathbb S_+\ni A\mapsto\operatorname{tr}(A^m)\in\mathbb R$.

The matrix-valued inner function $h$ is convex because $$ \theta h(X)+(1-\theta)h(Y)-h\left(\theta X+(1-\theta)Y\right) =\theta(1-\theta)(X-Y)(X-Y)^\ast $$ is positive semidefinite.

The outer function $g$ is convex because it is of the form $g(A)=\operatorname{tr}(f(A))$ (see footnote below), where $f(x)=x^m$ is a continuous convex function on the positive reals. Clearly $g(A)=\operatorname{tr}(A^m)$ is also (weakly) increasing on $\mathbb S_+$ (although $A\mapsto A^m$ is not increasing on $\mathbb S_+$ in general).

Therefore $g\circ h$ is the composition of a convex inner function with a convex increasing outer function. Hence it is convex.

Footnote.

As pointed out by the Wikipedia article linked in Bananach's answer, a proof that $f$ continuous and convex implies $A\mapsto\operatorname{tr} f(A)$ is convex can be found, for instance, in the 2009 paper Trace inequalities and quantum entropy: an introductory course by Eric Carlen. Although the domain and codomain of the $f$ in that paper differ from ours (reals vs. positive reals, and Hermitian vs. PSD), the same proof applies.
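The monotonicity claim above — $A \succeq B \succeq 0$ implies $\operatorname{tr}(A^m) \ge \operatorname{tr}(B^m)$ — follows from Weyl's inequality (adding a PSD matrix cannot decrease any eigenvalue) and can be illustrated with a small NumPy sketch:

```python
import numpy as np

# Illustrative check: A ⪰ B ⪰ 0 implies tr(A^m) >= tr(B^m).
rng = np.random.default_rng(3)
n, m = 4, 3
R = rng.standard_normal((n, n))
B = R @ R.T                      # B is PSD by construction
S = rng.standard_normal((n, n))
A = B + S @ S.T                  # A = B + PSD, so A ⪰ B ⪰ 0

assert np.trace(np.linalg.matrix_power(A, m)) >= np.trace(np.linalg.matrix_power(B, m))
```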

user1551
  • Thank you very much @user1551. A very concise and clear explanation. I tried a similar way but was unable to show that $g(A)$ is increasing. Thanks for the reference. For anyone who is interested in why $g(A)$ should be increasing, see, for example, this answer, where the inequalities should be replaced by matrix definiteness relations. – Albert65 Nov 18 '21 at 22:41