
Given a symmetric positive definite matrix $S \in \mathbb R^{d\times d}$ and $\lambda > 0$, I would like to find

$$X^\star := \underset{{X\in\mathbb R^{d\times d}}}{\operatorname{argmin}} \operatorname{tr}\left(X^{-T}SX^{-1}\right) + \lambda \|X\|_1,$$

where

$$\|X\|_1 := \sum_{i=1}^d\sum_{j=1}^d\left\vert X_{ij}\right\vert.$$

Has anyone seen this kind of objective function before? In particular, it has proven quite tricky to handle, as the objective appears to be only locally convex.
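For what it's worth, the difficulty is already visible for $d=1$, where the objective reduces to $f(x) = s/x^2 + \lambda|x|$: this is convex on each of the half-lines $x>0$ and $x<0$, but not on $\mathbb R\setminus\{0\}$ as a whole. A minimal sketch in Python (assuming NumPy; the values of `s` and `lam` are arbitrary):

```python
import numpy as np

s, lam = 1.0, 0.5                        # arbitrary illustrative values
f = lambda x: s / x**2 + lam * np.abs(x)

print(f(-1.5), f(1.5))   # modest values on both sides of the origin...
print(f(0.01))           # ...but f blows up in between (10000.005),
                         # so f cannot be convex across x = 0
```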

  • There are ways to solve such problems using relaxation methods and semidefinite programming. The Cone Complementary Algorithm is one of them. – KBS May 08 '22 at 22:43
  • Thanks for your reply, KBS. Not familiar with these methods and haven't found a reference including the regularization. Do you happen to have any further details? – foreignvol May 09 '22 at 20:01
  • Have you considered introducing the matrix variable $Y$ and the equality constraint $XY=I$? – Rodrigo de Azevedo May 09 '22 at 20:08
  • Introducing such a matrix is an interesting point. I've considered introducing it and then using coordinate descent, but wouldn't the constraint completely determine $Y$? – foreignvol May 09 '22 at 21:09
  • @ForeignVolatility To answer this question, ask yourself whether the inverse of an invertible matrix is unique. – KBS May 09 '22 at 22:51
  • I'm a bit confused. We know that the inverse of an invertible matrix is unique. So how could we proceed from here? – foreignvol May 12 '22 at 23:44
  • @ForeignVolatility I have my suspicions, but... in what context does this optimisation problem arise? – Jose Avilez May 18 '22 at 14:15
  • Is there any restriction on the matrix $\mathbf{X}$, for example does it have to be symmetric? – The Pheromone Kid May 19 '22 at 12:05

2 Answers


Your problem is not convex, so you will need a relaxation or an iterative scheme to solve it (locally). Another issue is that $X$ is not symmetric, so many existing schemes will not apply. The first step is therefore to remove the troublesome terms from the cost. To do so, define

$$M=X^{-T}SX^{-1},$$

whose defining equality can be relaxed, without introducing conservatism, into the inequality $M\succeq X^{-T}SX^{-1}$, since we are minimizing over $M$. This inequality is, in turn, equivalent to

$$\begin{bmatrix}M & X^{-T}\\X^{-1} & S^{-1}\end{bmatrix}\succeq0.$$
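Here the equivalence is the standard Schur complement argument with respect to the lower-right block: since $S\succ0$, we have $S^{-1}\succ0$, and the block matrix above is positive semidefinite if and only if

$$M - X^{-T}\left(S^{-1}\right)^{-1}X^{-1} = M - X^{-T}SX^{-1}\succeq0.$$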

Now let $Y=X^{-1}$ to yield the optimization problem

$$\begin{cases}\underset{(M,X,Y)}{\operatorname{min}} &\operatorname{tr}\left(M\right) + \lambda \|X\|_1\\ \text{s.t.}&XY=I,\\& \begin{bmatrix}M & Y^{T}\\Y & S^{-1}\end{bmatrix}\succeq0. \end{cases}$$

Now the difficulty is how to deal with the nonlinear constraint $XY=I$. One way is to consider an iterative algorithm where we update the values of $X$ and $Y$ according to perturbations $\delta X$ and $\delta Y$ as follows:

  1. Pick $X_0,Y_0$ such that $X_0Y_0=I$ and let $i=0$.
  2. Then solve the optimization problem $$\begin{array}{rl} (\delta X_i,\delta Y_i)=\underset{(M,\delta X,\delta Y)}{\operatorname{argmin}} & \operatorname{tr}\left(M\right) + \lambda \|X_i+\delta X\|_1\\ \text{s.t.} & X_i\,\delta Y+\delta X\, Y_i=0,\\ & \begin{bmatrix}M & (Y_i+\delta Y)^T\\ Y_i+\delta Y & S^{-1}\end{bmatrix}\succeq0 \end{array}$$
  3. Let $X_{i+1}=X_i+\delta X_i$ and $Y_{i+1}=Y_i+\delta Y_i$.
  4. Evaluate $X_{i+1}Y_{i+1}$ and correct the values if necessary (e.g., reset $Y_{i+1}=X_{i+1}^{-1}$).
  5. Let $i=i+1$ and go back to step 2.

A stopping criterion can be implemented by stopping when the cost no longer decreases. Additional constraints may also be added on the norms of $\delta X$ and $\delta Y$ to limit the step size and avoid deviating too much from the manifold $XY=I$; see the sketch below.
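For concreteness, here is a minimal sketch of this loop in Python, assuming the cvxpy and NumPy packages are available; the problem data `S`, the weight `lam`, the trust-region radius `rho`, and the tolerance are illustrative assumptions, not part of the scheme itself.

```python
import numpy as np
import cvxpy as cp

d = 4
rng = np.random.default_rng(0)
A = rng.standard_normal((d, d))
S = A @ A.T + d * np.eye(d)          # symmetric positive definite (illustrative)
S_inv = np.linalg.inv(S)
lam = 0.1                            # regularization weight (assumption)
rho = 0.5                            # trust-region radius on the steps (assumption)

def cost(X):
    """Original objective tr(X^{-T} S X^{-1}) + lam * ||X||_1."""
    Y = np.linalg.inv(X)
    return np.trace(Y.T @ S @ Y) + lam * np.abs(X).sum()

X, Y = np.eye(d), np.eye(d)          # step 1: X_0 Y_0 = I
prev = cost(X)
for _ in range(50):
    # Step 2: linearized SDP subproblem in (M, dX, dY).
    M = cp.Variable((d, d), symmetric=True)
    dX = cp.Variable((d, d))
    dY = cp.Variable((d, d))
    Ynew = Y + dY
    constraints = [
        X @ dY + dX @ Y == 0,                        # linearization of XY = I
        cp.bmat([[M, Ynew.T], [Ynew, S_inv]]) >> 0,  # Schur-complement LMI
        cp.norm(dX, "fro") <= rho,                   # keep the steps small
        cp.norm(dY, "fro") <= rho,
    ]
    objective = cp.Minimize(cp.trace(M) + lam * cp.sum(cp.abs(X + dX)))
    cp.Problem(objective, constraints).solve()
    # Steps 3-4: update, then correct Y so that XY = I holds exactly.
    X = X + dX.value
    Y = np.linalg.inv(X)
    # Stopping criterion: the cost no longer decreases.
    if cost(X) >= prev - 1e-8:
        break
    prev = cost(X)
```

Here the correction in step 4 is implemented by simply re-inverting $X$, which is one straightforward way to return to the manifold $XY=I$ after each step.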

The 1-norm term can be removed from the cost using lifting variables as done in José C Ferreira's answer.

Convergence can only be ensured locally, so you may need to consider restarting the algorithm from different initial points.

KBS

Given a symmetric positive definite matrix $S \in \mathbb R^{d\times d}$ and $\lambda > 0$, you want to find

$$X^\star := \underset{{X\in\mathbb R^{d\times d}}}{\operatorname{argmin}} \operatorname{tr}\left(X^{-T}SX^{-1}\right) + \lambda \|X\|_1, \quad\text{where}\quad \|X\|_1 := \sum_{i=1}^d\sum_{j=1}^d\left\vert X_{ij}\right\vert.$$

Substituting $Y=X^{-1}$ and introducing lifting variables $t_{ij}$, you can rewrite this problem as $$\begin{cases}\underset{(X,Y,t)}{\operatorname{min}} &\operatorname{tr}\left(YY^{T}S\right) + \lambda \sum_{i=1}^d\sum_{j=1}^d t_{ij}\\ \text{subject to}&XY=I,\\&t_{ij}-X_{ij}\geq 0,\\&t_{ij}+X_{ij}\geq 0.\end{cases}$$
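As a quick sanity check (a minimal NumPy sketch with arbitrary data), at any feasible point with $Y=X^{-1}$ and the tightest lifting $t_{ij}=|X_{ij}|$, the lifted objective coincides with the original one, by cyclicity of the trace:

```python
import numpy as np

d = 4
rng = np.random.default_rng(1)
A = rng.standard_normal((d, d))
S = A @ A.T + d * np.eye(d)      # symmetric positive definite (illustrative)
lam = 0.1

X = rng.standard_normal((d, d)) + d * np.eye(d)  # invertible for this data
Y = np.linalg.inv(X)             # enforces the constraint XY = I
t = np.abs(X)                    # tightest t with t - X >= 0 and t + X >= 0

original = np.trace(Y.T @ S @ Y) + lam * np.abs(X).sum()
lifted = np.trace(Y @ Y.T @ S) + lam * t.sum()
print(np.isclose(original, lifted))   # True: tr(Y^T S Y) = tr(Y Y^T S)
```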

You can find related discussions and suggestions for numerical methods in "Counterexample for a convex problem" and "operator norm minimization problem", for instance.

Perhaps the discussion in "Lagrange Multipliers and Constrained Optimization" helps you as well.

You can use numerical functions in the R language, and find more by searching for "$\min_Y \operatorname{tr}(Y^TYS)$" on SearchOnMath, for instance.