4

We have the following optimization problem in tall matrix $X \in\mathbb R^{n \times k}$

$$\begin{array}{ll} \text{minimize} & \mbox{tr}(X^T A X) + \lambda \,\mbox{tr}(X^T B)\\ \text{subject to} & X^T X = I_k\end{array}$$

where $A \in \mathbb R^{n \times n}$ is symmetric and positive semidefinite, $B \in \mathbb R^{n \times k}$ and $n>k$.

What is the solution? Is there a closed-form solution?

Royi
  • 8,711
E.J.
  • 939

1 Answers1

0

EDIT 1. We assume that $\lambda=1$ (change $B$ with $\lambda B$) and $A$ is only symmetric (the non-negativity has nothing to do here). $M_{n,k}$ denotes the real $n\times k$ matrices.

Let $f:X\in M_{n,k}\rightarrow tr(X^TAX)+ tr(X^TB)$. Since $Z=\{X;X^TX=I_k\}$ is compact, then the minimum of $f$ is reached in a point $X$ s.t. its derivative $Df_X(H)$ is $0$ for every $H$ in the tangent space $T_XZ$ of $Z$.

$\textbf{Proposition}.$ The $X$ that realize the minimum of $f$ are among the finite (generically) set of the solutions of the system (S):

$X^TX=I_k,X^TB=B^TX,(I_n-XX^T)(2AX+B)=0$.

$\textbf{Proof}$.

$Df_X:H\in T_XZ\rightarrow tr(2H^TAX+ H^TB)$.

Now $H\in T_XZ$ iff $H^TX=K$, a skew-symmetric matrix. Since $rank(X)=k$, $X^+=X^T$ and $H^T=KX^T+U(I_n-XX^T)$ where $U\in M_{k,n}$ is arbitrary.

Finally, the condition is: for every skew-symmetric $K$ and for every $U\in M_{k,n}$:

$tr((KX^T+U(I-XX^T))(2AX+ B))=0$. It suffices to consider the following 2 cases.

Case 1. $U=0$. then $X^T(2AX+ B)$ is symmetric, that is $X^TB$ is symmetric (generically, we obtain $k(k-1)/2$ relations).

Case 2. $K=0$. Then $(I-XX^T)(2AX+ B)=0$. $\square$

Of course, there is no closed-form solution.

EDIT 2. ** In order to gain 70% in computing time, we can diagonalize $A$.

Inded $A=PDP^T$ where $P\in O(n)$ and $D$ is diagonal. We put $Y=P^TX$; note that $Y^TY=X^TPP^TX=I_k$. On the other hand, $f(X)=g(Y)=tr(Y^TDY+Y^TP^TB)$.

Thus we may assume that $A$ is diagonal (change $B$ with $P^TB$).

** An example. When $k=3,n=5$, the number of complex solutions $X$ of the system (S) is (in the generic case) $248$, that is (S) admits, on average, $\approx 16$ real solutions that remain to be tested.

EDIT 3. Answer to @Royi . I reduce the problem to solving the matricial algebraic system (S). I tested the method using the Grobner basis theory (under Maple); that works until $n=6$. For larger $n$, you (it's your business) must use numerical methods in order to obtain -at least locally- the solutions. About that, there is a problem:

We fix a large $n$. If $k=1$, then $(S)$ admits (generically) $2n$ complex solutions that is, in average, $\sqrt{2n}$ real solutions. Then, with a good software, we can find and test all these solutions.

Unfortunately, when $k$ increases , the number of solutions of $(S)$ increases dramatically until $k=n-1$; indeed, when $k=n-1$, the number of complex solutions is $2^{2n-2}$, that is, in average, $2^{n-1}$ real solutions, what is associated to an exponential complexity. Therefore, in a first step, we must localize a region (not too large) where the function $f$ reaches its minimum.