Minimize $ \mbox{tr} ( X^T A X ) + \lambda \mbox{tr} ( X^T B ) $ subject to $ X^T X = I $ - Linear Matrix Function with Norm Equality Constraint

Question

We have the following optimization problem in tall matrix $X \in\mathbb R^{n \times k}$

$$\begin{array}{ll} \text{minimize} & \mbox{tr}(X^T A X) + \lambda \,\mbox{tr}(X^T B)\\ \text{subject to} & X^T X = I_k\end{array}$$

where $A \in \mathbb R^{n \times n}$ is symmetric and positive semidefinite, $B \in \mathbb R^{n \times k}$ and $n>k$.

What is the solution? Is there a closed-form solution?

Are the matrices $X$ real or complex? Are they of a particular size? — Ben Grossmann, Jun 08 '18 at 16:53
Notably, if the matrices $X$ are square, then the problem becomes trivial. — Ben Grossmann, Jun 08 '18 at 16:56
@Omnomnomnom X is not square. I have modified the question with specific dimension for each variable. — E.J., Jun 08 '18 at 16:57
Maybe related: https://math.stackexchange.com/q/2509243/398989 — David M., Jun 08 '18 at 19:31
Do you have a closed form solution at least when $X$ is a vector ($k=1$) and $A$ is psd like in your other question? — A.Γ., Jun 08 '18 at 19:54
@A.Γ. It has closed form solution for the relaxed problem. Actually, David M. points to the same problem which seems hard to solve. If we do not have the linear term in the obj, it is just the Ky Fan theorem. — E.J., Jun 08 '18 at 20:30
@E.J. The problem, David M. pointed at, has $A$ psd, while you have a general (indefinite) quadratic term. It makes it even harder I guess. — A.Γ., Jun 08 '18 at 21:01
@RodrigodeAzevedo Yes!!! That is important and I forgot to mention it. — E.J., Jun 09 '18 at 17:15
@Omnomnomnom, Is there a name to matrices which obey $ {X}^{T} X = I $? Is there a projection to such a set? Could it be rewritten as scalar constraint? — Royi, Jun 10 '18 at 20:02
@Royi They're usually called "isometries". I'm not sure what that projection would look like, but perhaps there's some trick via polar decomposition. — Ben Grossmann, Jun 10 '18 at 22:01
@RodrigodeAzevedo, how would you vectorize the problem if the equality constraint were relaxed ($ {X}^{T} X = I $ would become $ {X}^{T} X \preceq I $)? — Royi, Jun 16 '18 at 08:37
@Omnomnomnom, On Wikipedia they are called Semi Orthogonal Matrix. — Royi, Jun 16 '18 at 13:09

score 0 · Answer 1 · 2018-06-18T09:43:59.440

EDIT 1. We assume that $\lambda=1$ (replace $B$ with $\lambda B$) and that $A$ is merely symmetric (positive semidefiniteness plays no role here). $M_{n,k}$ denotes the set of real $n\times k$ matrices.

Let $f:X\in M_{n,k}\rightarrow tr(X^TAX)+ tr(X^TB)$. Since $Z=\{X;X^TX=I_k\}$ is compact, the minimum of $f$ on $Z$ is attained at a point $X$ where the derivative $Df_X(H)$ vanishes for every $H$ in the tangent space $T_XZ$ of $Z$.

$\textbf{Proposition}.$ The matrices $X$ that realize the minimum of $f$ on $Z$ are among the solutions of the following system (S), whose solution set is generically finite:

$X^TX=I_k,X^TB=B^TX,(I_n-XX^T)(2AX+B)=0$.

$\textbf{Proof}$.

$Df_X:H\in T_XZ\rightarrow tr(2H^TAX+ H^TB)$.

Now $H\in T_XZ$ iff $H^TX+X^TH=0$, that is, iff $H^TX=K$ for some skew-symmetric matrix $K$. Since $rank(X)=k$ and $X^TX=I_k$, the pseudoinverse is $X^+=X^T$, hence $H^T=KX^T+U(I_n-XX^T)$ where $U\in M_{k,n}$ is arbitrary.

Finally, the condition is: for every skew-symmetric $K$ and for every $U\in M_{k,n}$:

$tr((KX^T+U(I-XX^T))(2AX+ B))=0$. It suffices to consider the following 2 cases.

Case 1. $U=0$. Then $tr(KX^T(2AX+B))=0$ for every skew-symmetric $K$, so $X^T(2AX+ B)$ is symmetric; since $X^TAX$ is already symmetric, this says that $X^TB$ is symmetric (generically, we obtain $k(k-1)/2$ relations).

Case 2. $K=0$. Then $(I-XX^T)(2AX+ B)=0$. $\square$
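As a numerical sanity check of the proposition, here is a minimal NumPy sketch. It is not the code used in this answer: the helper name `residuals_S` and the test data are illustrative assumptions. It relies on the observation that (S) is equivalent to $2AX+B=XS$ for some symmetric $S$ (take $S=X^T(2AX+B)$), so a critical point can be manufactured by choosing $X$ and $S$ first and setting $B=XS-2AX$.

```python
import numpy as np

def residuals_S(X, A, B):
    """Norms of the three residuals of system (S) at X (hypothetical helper)."""
    n, k = X.shape
    G = 2 * A @ X + B                                    # Euclidean gradient of f
    r_orth = np.linalg.norm(X.T @ X - np.eye(k))         # X^T X = I_k
    r_sym = np.linalg.norm(X.T @ B - B.T @ X)            # X^T B symmetric
    r_proj = np.linalg.norm((np.eye(n) - X @ X.T) @ G)   # (I - X X^T)(2AX + B) = 0
    return r_orth, r_sym, r_proj

rng = np.random.default_rng(0)
n, k = 5, 3
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                                  # symmetric, not necessarily PSD
X, _ = np.linalg.qr(rng.standard_normal((n, k)))   # orthonormal columns

# Manufacture B so that X solves (S): 2AX + B = X S with S symmetric.
S = rng.standard_normal((k, k))
S = (S + S.T) / 2
B = X @ S - 2 * A @ X
print(residuals_S(X, A, B))

# Derivative of f along a random tangent vector H, with H^T = K X^T + U (I - X X^T).
W = rng.standard_normal((k, k))
K = W - W.T                                        # skew-symmetric
U = rng.standard_normal((k, n))
Ht = K @ X.T + U @ (np.eye(n) - X @ X.T)
deriv = np.trace(Ht @ (2 * A @ X + B))
print(deriv)
```

All three residuals and the directional derivative should vanish up to rounding error, whereas for a generic $B$ they do not.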

In general, no closed-form solution is to be expected: one has to solve the polynomial system (S).

EDIT 2. ** In order to gain 70% in computing time, we can diagonalize $A$.

Indeed, $A=PDP^T$ where $P\in O(n)$ and $D$ is diagonal. We put $Y=P^TX$; note that $Y^TY=X^TPP^TX=I_k$. On the other hand, $f(X)=g(Y)=tr(Y^TDY+Y^TP^TB)$.

Thus we may assume that $A$ is diagonal (replace $B$ with $P^TB$).
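The reduction to a diagonal $A$ can be checked numerically. The following is a small NumPy sketch with random test data (an assumption for illustration); `np.linalg.eigh` returns the eigenvalues `d` and an orthogonal `P` with $A=P\,\mathrm{diag}(d)\,P^T$.

```python
import numpy as np

def f(X, A, B):
    """Objective tr(X^T A X) + tr(X^T B), with lambda absorbed into B."""
    return np.trace(X.T @ A @ X) + np.trace(X.T @ B)

rng = np.random.default_rng(1)
n, k = 6, 2
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                                  # symmetric test matrix
B = rng.standard_normal((n, k))
X, _ = np.linalg.qr(rng.standard_normal((n, k)))   # a feasible point, X^T X = I_k

d, P = np.linalg.eigh(A)     # A = P diag(d) P^T with P orthogonal
D = np.diag(d)
Y = P.T @ X                  # change of variable
B_new = P.T @ B              # B is replaced by P^T B

print(f(X, A, B), f(Y, D, B_new))                  # the two values should agree
print(np.linalg.norm(Y.T @ Y - np.eye(k)))         # Y is still feasible
```

A minimizer $Y$ of the diagonal problem is mapped back to the original variable by $X=PY$.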

** An example. When $k=3,n=5$, the number of complex solutions $X$ of the system (S) is (in the generic case) $248$; thus (S) admits, on average, $\approx 16$ real solutions, each of which remains to be tested.

EDIT 3. Answer to @Royi. I reduce the problem to solving the algebraic matrix system (S). I tested the method using Gröbner basis theory (under Maple); this works up to $n=6$. For larger $n$, you (that is your business) must use numerical methods in order to obtain the solutions, at least locally. There is, however, a difficulty:

We fix a large $n$. If $k=1$, then $(S)$ admits (generically) $2n$ complex solutions, that is, on average, $\sqrt{2n}$ real solutions. Then, with good software, we can find and test all these solutions.
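To illustrate the case $k=1$, here is a hedged NumPy sketch; it is not the Maple/Gröbner code used above. After diagonalizing, take $A=\mathrm{diag}(d)$ and let $b$ be the single column of $B$. The symmetry condition in (S) is vacuous for $k=1$, and the remaining equations read $2d_ix_i+b_i=\mu x_i$ with $\mu=x^T(2Ax+b)$, so $x_i=b_i/(\mu-2d_i)$. Imposing $\|x\|=1$ and clearing denominators gives a polynomial of degree $2n$ in $\mu$, consistent with the count of $2n$ complex solutions. The function name `k1_critical_points` and the data are illustrative; the sketch assumes distinct $d_i$ and all $b_i\neq 0$.

```python
import numpy as np

def k1_critical_points(d, b, tol=1e-8):
    """Solve (S) for k = 1 and A = diag(d); assumes distinct d_i and all b_i != 0."""
    d = np.asarray(d, dtype=float)
    b = np.asarray(b, dtype=float)
    shifts = 2.0 * d
    # p(mu) = prod_j (mu - 2 d_j)^2 - sum_i b_i^2 prod_{j != i} (mu - 2 d_j)^2
    poly = np.poly(np.repeat(shifts, 2))
    for i in range(len(d)):
        others = np.delete(shifts, i)
        poly = np.polysub(poly, b[i] ** 2 * np.poly(np.repeat(others, 2)))
    mus = np.roots(poly)                          # 2n complex multipliers
    real_mus = mus[np.abs(mus.imag) < tol].real   # keep the real ones
    xs = [b / (mu - shifts) for mu in real_mus]   # x_i = b_i / (mu - 2 d_i)
    return mus, xs

d = np.array([0.0, 1.0, 2.0, 3.0])
b = np.ones(4)
mus, xs = k1_critical_points(d, b)

def f1(x):
    """Objective x^T diag(d) x + b^T x for the data above."""
    return x @ (d * x) + b @ x

best = min(xs, key=f1)                            # test all real critical points
print(len(mus), len(xs), f1(best))
```

Expanding the polynomial and calling `np.roots` is ill-conditioned for large $n$, so this only illustrates the counting argument; a serious implementation would solve the secular equation $\sum_i b_i^2/(\mu-2d_i)^2=1$ by a bracketed root finder.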

Unfortunately, when $k$ increases, the number of solutions of $(S)$ increases dramatically until $k=n-1$; indeed, when $k=n-1$, the number of complex solutions is $2^{2n-2}$, that is, on average, $2^{n-1}$ real solutions, which entails exponential complexity. Therefore, as a first step, we must locate a (not too large) region where the function $f$ reaches its minimum.
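One standard way to obtain a solution of (S) locally is Riemannian gradient descent on $Z$. The sketch below is a swapped-in technique, not the method used in this answer; the QR retraction, the Armijo constant $10^{-4}$ and all function names are assumptions. The projected gradient $G-X\,\mathrm{sym}(X^TG)$, with $G=2AX+B$, vanishes exactly when $X$ satisfies (S).

```python
import numpy as np

def f(X, A, B):
    return np.trace(X.T @ A @ X) + np.trace(X.T @ B)

def riemannian_grad(X, A, B):
    """Project the Euclidean gradient 2AX + B onto the tangent space at X."""
    G = 2 * A @ X + B
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2

def qr_retract(Y):
    """Map Y back to orthonormal columns; signs are fixed so that diag(R) > 0."""
    Q, R = np.linalg.qr(Y)
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1
    return Q * signs

def descend(A, B, X0, steps=500, t0=1.0):
    """Gradient descent with Armijo backtracking; returns a local candidate only."""
    X = X0
    for _ in range(steps):
        g = riemannian_grad(X, A, B)
        g2 = np.sum(g * g)
        if g2 < 1e-20:
            break
        t, fx = t0, f(X, A, B)
        while t > 1e-12:
            X_new = qr_retract(X - t * g)
            if f(X_new, A, B) <= fx - 1e-4 * t * g2:   # sufficient decrease
                X = X_new
                break
            t *= 0.5
        else:
            break                                      # no acceptable step found
    return X

rng = np.random.default_rng(2)
n, k = 8, 3
M = rng.standard_normal((n, n))
A = (M + M.T) / 2
B = rng.standard_normal((n, k))
X0 = qr_retract(rng.standard_normal((n, k)))
X_loc = descend(A, B, X0)
print(f(X0, A, B), f(X_loc, A, B), np.linalg.norm(riemannian_grad(X_loc, A, B)))
```

Such an iteration can stop at any solution of (S), not necessarily the global minimizer, so in practice one would restart from several initial points and keep the best value.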

Could you share your code? Also, what's $ {M}_{n, k} $? — Royi, Jun 17 '18 at 20:17
