
Given a symmetric positive definite matrix $S \in \mathbb R^{d\times d}$ and $\lambda > 0$, I would like to find

$$X^\star := \underset{{X\in\mathbb R^{d\times d}}}{\operatorname{argmin}} \operatorname{tr}\left(X^{-T}SX^{-1}\right) + \lambda \|X\|_1,$$

where

$$\|X\|_1 := \sum_{i=1}^d\sum_{j=1}^d\left\vert X_{ij}\right\vert.$$

Has anyone seen this kind of objective function before? In particular, it has proven quite tricky to handle, as the objective appears to be only locally convex.
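For what it's worth, the difficulty is already visible for $d=1$, where the objective reduces to $f(x) = s/x^2 + \lambda|x|$: this is convex on each of the half-lines $x>0$ and $x<0$, but not on $\mathbb R\setminus\{0\}$ as a whole. A minimal sketch in Python (assuming NumPy; the values of `s` and `lam` are arbitrary):

```python
import numpy as np

s, lam = 1.0, 0.5                        # arbitrary illustrative values
f = lambda x: s / x**2 + lam * np.abs(x)

print(f(-1.5), f(1.5))   # modest values on both sides of the origin...
print(f(0.01))           # ...but f blows up in between (10000.005),
                         # so f cannot be convex across x = 0
```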

  • There are ways to solve such problems using relaxation methods and semidefinite programming. The Cone Complementary Algorithm is one of them. – KBS May 08 '22 at 22:43
  • Thanks for your reply, KBS. Not familiar with these methods and haven't found a reference including the regularization. Do you happen to have any further details? – foreignvol May 09 '22 at 20:01
  • Have you considered introducing the matrix variable $Y$ and the equality constraint $XY=I$? – Rodrigo de Azevedo May 09 '22 at 20:08
  • Introducing such a matrix is an interesting point. I've considered introducing it and then using coordinate descent, but wouldn't the constraint completely determine $Y$? – foreignvol May 09 '22 at 21:09
  • @ForeignVolatility To answer this question, ask yourself whether the inverse of an invertible matrix is unique. – KBS May 09 '22 at 22:51
  • I'm a bit confused. We know that the inverse of an invertible matrix is unique. So how could we proceed from here? – foreignvol May 12 '22 at 23:44
  • @ForeignVolatility I have my suspicions, but... in what context does this optimisation problem arise? – Jose Avilez May 18 '22 at 14:15
  • Is there any restriction on the matrix $\mathbf{X}$, for example does it have to be symmetric? – The Pheromone Kid May 19 '22 at 12:05

2 Answers


Your problem is not convex, so you will need a relaxation or an iterative scheme to solve it (locally). Another issue is that $X$ is not symmetric, so many existing schemes will not apply. The first step is therefore to remove the troublesome terms from the cost. To do so, define

$$M=X^{-T}SX^{-1},$$

whose defining equality can be relaxed, without introducing conservatism, into the inequality $M\succeq X^{-T}SX^{-1}$, since we are minimizing over $M$. This inequality is, in turn, equivalent to

$$\begin{bmatrix}M & X^{-T}\\X^{-1} & S^{-1}\end{bmatrix}\succeq0.$$
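Here the equivalence is the standard Schur complement argument with respect to the lower-right block: since $S\succ0$, we have $S^{-1}\succ0$, and the block matrix above is positive semidefinite if and only if

$$M - X^{-T}\left(S^{-1}\right)^{-1}X^{-1} = M - X^{-T}SX^{-1}\succeq0.$$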

Now let $Y=X^{-1}$ to yield the optimization problem

$$\begin{cases}\underset{(M,X,Y)}{\operatorname{min}} &\operatorname{tr}\left(M\right) + \lambda \|X\|_1\\ \text{s.t.}&XY=I,\\& \begin{bmatrix}M & Y^{T}\\Y & S^{-1}\end{bmatrix}\succeq0. \end{cases}$$

Now the difficulty is how to deal with the nonlinear constraint $XY=I$. One way is to consider an iterative algorithm where we update the values of $X$ and $Y$ according to perturbations $\delta X$ and $\delta Y$ as follows:

  1. Pick $X_0,Y_0$ such that $X_0Y_0=I$ and let $i=0$.
  2. Then solve the optimization problem $$\begin{array}{rl} (\delta X_i,\delta Y_i)=\underset{(M,\delta X,\delta Y)}{\operatorname{argmin}} & \operatorname{tr}\left(M\right) + \lambda \|X_i+\delta X\|_1\\ \text{s.t.} & X_i\,\delta Y+\delta X\, Y_i=0,\\ & \begin{bmatrix}M & (Y_i+\delta Y)^T\\ Y_i+\delta Y & S^{-1}\end{bmatrix}\succeq0 \end{array}$$
  3. Let $X_{i+1}=X_i+\delta X_i$ and $Y_{i+1}=Y_i+\delta Y_i$.
  4. Evaluate $X_{i+1}Y_{i+1}$ and correct the values if necessary (e.g., reset $Y_{i+1}=X_{i+1}^{-1}$).
  5. Let $i=i+1$ and go back to step 2.

A stopping criterion can be implemented by stopping when the cost no longer decreases. Additional constraints may also be added on the norms of $\delta X$ and $\delta Y$ to limit the step size and avoid deviating too much from the manifold $XY=I$; see the sketch below.
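For concreteness, here is a minimal sketch of this loop in Python, assuming the cvxpy and NumPy packages are available; the problem data `S`, the weight `lam`, the trust-region radius `rho`, and the tolerance are illustrative assumptions, not part of the scheme itself.

```python
import numpy as np
import cvxpy as cp

d = 4
rng = np.random.default_rng(0)
A = rng.standard_normal((d, d))
S = A @ A.T + d * np.eye(d)          # symmetric positive definite (illustrative)
S_inv = np.linalg.inv(S)
lam = 0.1                            # regularization weight (assumption)
rho = 0.5                            # trust-region radius on the steps (assumption)

def cost(X):
    """Original objective tr(X^{-T} S X^{-1}) + lam * ||X||_1."""
    Y = np.linalg.inv(X)
    return np.trace(Y.T @ S @ Y) + lam * np.abs(X).sum()

X, Y = np.eye(d), np.eye(d)          # step 1: X_0 Y_0 = I
prev = cost(X)
for _ in range(50):
    # Step 2: linearized SDP subproblem in (M, dX, dY).
    M = cp.Variable((d, d), symmetric=True)
    dX = cp.Variable((d, d))
    dY = cp.Variable((d, d))
    Ynew = Y + dY
    constraints = [
        X @ dY + dX @ Y == 0,                        # linearization of XY = I
        cp.bmat([[M, Ynew.T], [Ynew, S_inv]]) >> 0,  # Schur-complement LMI
        cp.norm(dX, "fro") <= rho,                   # keep the steps small
        cp.norm(dY, "fro") <= rho,
    ]
    objective = cp.Minimize(cp.trace(M) + lam * cp.sum(cp.abs(X + dX)))
    cp.Problem(objective, constraints).solve()
    # Steps 3-4: update, then correct Y so that XY = I holds exactly.
    X = X + dX.value
    Y = np.linalg.inv(X)
    # Stopping criterion: the cost no longer decreases.
    if cost(X) >= prev - 1e-8:
        break
    prev = cost(X)
```

Here the correction in step 4 is implemented by simply re-inverting $X$, which is one straightforward way to return to the manifold $XY=I$ after each step.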

The 1-norm term can be removed from the cost using lifting variables as done in José C Ferreira's answer.

Convergence can only be ensured locally, so you may need to consider restarting the algorithm from different initial points.

KBS

Given a symmetric positive definite matrix $S \in \mathbb R^{d\times d}$ and $\lambda > 0$, you want to find

$$X^\star := \underset{{X\in\mathbb R^{d\times d}}}{\operatorname{argmin}} \operatorname{tr}\left(X^{-T}SX^{-1}\right) + \lambda \|X\|_1, \quad\text{where}\quad \|X\|_1 := \sum_{i=1}^d\sum_{j=1}^d\left\vert X_{ij}\right\vert.$$

Substituting $Y=X^{-1}$ and introducing lifting variables $t_{ij}$, you can rewrite this problem as $$\begin{cases}\underset{(X,Y,t)}{\operatorname{min}} &\operatorname{tr}\left(YY^{T}S\right) + \lambda \sum_{i=1}^d\sum_{j=1}^d t_{ij}\\ \text{subject to}&XY=I,\\&t_{ij}-X_{ij}\geq 0,\\&t_{ij}+X_{ij}\geq 0.\end{cases}$$
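As a quick sanity check (a minimal NumPy sketch with arbitrary data), at any feasible point with $Y=X^{-1}$ and the tightest lifting $t_{ij}=|X_{ij}|$, the lifted objective coincides with the original one, by cyclicity of the trace:

```python
import numpy as np

d = 4
rng = np.random.default_rng(1)
A = rng.standard_normal((d, d))
S = A @ A.T + d * np.eye(d)      # symmetric positive definite (illustrative)
lam = 0.1

X = rng.standard_normal((d, d)) + d * np.eye(d)  # invertible for this data
Y = np.linalg.inv(X)             # enforces the constraint XY = I
t = np.abs(X)                    # tightest t with t - X >= 0 and t + X >= 0

original = np.trace(Y.T @ S @ Y) + lam * np.abs(X).sum()
lifted = np.trace(Y @ Y.T @ S) + lam * t.sum()
print(np.isclose(original, lifted))   # True: tr(Y^T S Y) = tr(Y Y^T S)
```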

You can find related discussions and suggestions for numerical methods in "Counterexample for a convex problem" and "operator norm minimization problem", for instance.

Perhaps the discussion in "Lagrange Multipliers and Constrained Optimization" helps you as well.

You can use numerical functions in the R language, and find more by searching for "$\min_Y \operatorname{tr}(Y^TYS)$" on SearchOnMath, for instance.