
This question is the same as the question in this post. The OP of that post changed what they were asking and reduced it to a special case, so I’m asking the question in full generality here.


Given symmetric $A \in \mathbb{R}^{m\times{m}}$, solve the optimization problem in $X \in \mathbb{R}^{m\times{n}}$

$$\begin{array}{rl} \max&\mathrm{Tr}(X^TAX)\\ \text{s.t.}&X^TX=I \end{array}$$

Now, if $X$ is square, then the objective function satisfies $$ \mathrm{Tr}(X^TAX)=\mathrm{Tr}(AXX^T)=\mathrm{Tr}(A) $$ for any orthogonal $X$. So we are interested in the case when $X \in \mathbb{R}^{m\times{n}}$ is tall (more rows than columns).

Attempt: Let $A=VDV^T$ denote the eigendecomposition of $A$. Then the objective function satisfies $$ \mathrm{Tr}(X^TAX)=\mathrm{Tr}(X^TVDV^TX)=\mathrm{Tr}(DV^TXX^TV)=\langle{D,V^TXX^TV}\rangle. $$ If $D$ has non-negative entries (i.e. $A$ is positive semidefinite), I believe (but I’m not certain) that this expression is maximized when $V^TXX^TV=I$. However, this can never happen: $V^TX$ is tall, so $V^TXX^TV=(V^TX)(V^TX)^T$ has rank at most $n<m$ and can’t equal the $m\times m$ identity. My guess is that, in the positive semidefinite case, you might pick eigenvectors of the $n$ largest eigenvalues of $A$.
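
For a quick numerical sanity check of the reformulation above and of this guess, here is a short NumPy sketch (the sizes $m=6$, $n=3$ and the random seed are arbitrary choices, not part of the problem):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 3

# Random symmetric test matrix (not necessarily positive semidefinite)
A = rng.standard_normal((m, m))
A = (A + A.T) / 2

# Eigendecomposition A = V D V^T, eigenvalues sorted in decreasing order
d, V = np.linalg.eigh(A)
d, V = d[::-1], V[:, ::-1]
D = np.diag(d)

# Check the identity Tr(X^T A X) = <D, V^T X X^T V> on a random feasible X
Q, _ = np.linalg.qr(rng.standard_normal((m, n)))   # orthonormal columns
assert np.isclose(np.trace(Q.T @ A @ Q), np.sum(D * (V.T @ Q @ Q.T @ V)))

# The guess: columns of X = eigenvectors of the n largest eigenvalues
X = V[:, :n]
print(np.trace(X.T @ A @ X), d[:n].sum())          # these agree

# No randomly drawn feasible X appears to do better
for _ in range(1000):
    Q, _ = np.linalg.qr(rng.standard_normal((m, n)))
    assert np.trace(Q.T @ A @ Q) <= d[:n].sum() + 1e-9
```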

Does this problem have a nice solution in general?

David M.
  • Something that confuses me a bit: is there a difference between the question in your post and the question in the post that you have linked? – Ben Grossmann Apr 12 '19 at 04:26
  • @Omnomnomnom No, it’s the same question. In the linked post, the OP sort of changed what they were asking and reduced it to a different problem, so I asked the full question here. They can be changed/merged as needed. In the meantime, I will edit my attribution to make this clearer. – David M. Apr 12 '19 at 11:47

1 Answer


Yes, there is indeed a nice solution in general. We find that the maximum value of $$ \begin{array}{rl} \max&\mathrm{Tr}(X^TAX)\\ \text{s.t.}&X \in \Bbb R^{m \times n}, \ X^TX=I \end{array} $$ is $\sum_{i=1}^n \lambda_i$, where $\lambda_1 \geq \cdots \geq \lambda_m$ denote the eigenvalues of $A$ in decreasing order (which coincides with your suspicion about the positive semidefinite case).

One can prove that $\mathrm{Tr}(X^TAX) \leq \sum_{i=1}^n \lambda_i$ using the Schur-Horn theorem, for instance. To see that this upper bound is attained, take the columns of $X$ to be orthonormal eigenvectors corresponding to the $n$ largest eigenvalues.
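
To spell out the Schur-Horn step: complete $X$ to an orthogonal matrix $Q = [X \mid X_\perp]$. Then $\mathrm{Tr}(X^TAX)$ is the sum of the first $n$ diagonal entries of $Q^TAQ$, which has the same eigenvalues as $A$, and the Schur-Horn theorem says that the diagonal of $Q^TAQ$ is majorized by those eigenvalues, so any $n$ diagonal entries sum to at most $\lambda_1+\cdots+\lambda_n$. A small numerical illustration (a sketch using NumPy; the sizes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 7, 3

A = rng.standard_normal((m, m))
A = (A + A.T) / 2
eigvals = np.linalg.eigvalsh(A)[::-1]        # eigenvalues of A, decreasing

# A random feasible X, completed to an orthogonal Q = [X | X_perp]
Q, _ = np.linalg.qr(rng.standard_normal((m, m)))
X = Q[:, :n]

# Tr(X^T A X) = sum of the first n diagonal entries of Q^T A Q,
# and the diagonal of Q^T A Q is majorized by the eigenvalues of A
B = Q.T @ A @ Q
assert np.isclose(np.trace(X.T @ A @ X), np.diag(B)[:n].sum())
assert np.diag(B)[:n].sum() <= eigvals[:n].sum() + 1e-9

# The bound is attained by eigenvectors of the n largest eigenvalues
w, V = np.linalg.eigh(A)
X_star = V[:, ::-1][:, :n]
assert np.isclose(np.trace(X_star.T @ A @ X_star), eigvals[:n].sum())
```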

Ben Grossmann
  • Let me know if you'd like a detailed proof. I suspect, however, that you'll be able to fill in the details given my hint. If I remember correctly, this fact is presented as an exercise in Bhatia's Matrix Analysis. An important related notion is that of the Ky-Fan norm, which can be characterized as $$ \|A\|_{(n)} = \max\left\{\operatorname{Tr}(X^T \sqrt{A^TA}\, X) : X \in \Bbb R^{m \times n},\ X^TX = I\right\} = \sum_{i=1}^n \sigma_i(A) $$ (see the short numerical sketch after these comments). – Ben Grossmann Apr 12 '19 at 04:17
  • +1 Thanks for the Ky-Fan norm note. I will work through the details and then return to this post! If you would like to move your answer to the linked post in my question, I can delete this post (I don’t want to intentionally create duplicates). – David M. Apr 12 '19 at 11:55
  • I wouldn’t worry about duplicates, your post is fine. – Ben Grossmann Apr 12 '19 at 12:00
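
A quick numerical illustration of the Ky-Fan characterization mentioned in the comment above (a sketch using NumPy; the matrix sizes and seed below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 5, 2

A = rng.standard_normal((m, m))              # a general (non-symmetric) square matrix
U, s, Wt = np.linalg.svd(A)                  # A = U diag(s) W^T, singular values decreasing
sqrtAtA = Wt.T @ np.diag(s) @ Wt             # sqrt(A^T A) = W diag(s) W^T

# The maximum is attained by right singular vectors of the n largest singular values
X = Wt.T[:, :n]
print(np.trace(X.T @ sqrtAtA @ X), s[:n].sum())    # both equal the Ky-Fan value

# and no random feasible X exceeds it
for _ in range(500):
    Q, _ = np.linalg.qr(rng.standard_normal((m, n)))
    assert np.trace(Q.T @ sqrtAtA @ Q) <= s[:n].sum() + 1e-9
```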