2

I would like to know the convexity of the function below.

Denote the space of $ n \times n $ real symmetric matrices by $\mathcal{S}^n$, and define $f:\mathcal{S}^n \rightarrow \mathbb{R}$ by $$ f(X) = \left\| EF \exp(X) FE \right\|, $$ where $E,F$ are real symmetric positive definite matrices, $\| \cdot \|$ is the Frobenius norm, and $\exp$ is the matrix exponential. Is $f$ convex?

It is trivial in the scalar case ($n=1$), but I don't have any clue for the general case. The zeroth-, first-, and second-order conditions are hard to check. However, I ran 100k trials in MATLAB with random inputs $X$ and parameters $E,F$, and every Hessian of $f$ w.r.t. $X$ turned out positive definite.

Edit

  1. In my initial question, I viewed $A=EF$ as an arbitrary invertible matrix. Thanks to the counterexamples by @user1551, I realized that this way of thinking was wrong.
gsoldier
  • 175

2 Answers

2

No. Here is a random counterexample: take $E=I$, $$ X=\pmatrix{0.044005&0\\ 0&0.867487}, \ Y=\pmatrix{0.7831&-0.3549\\ -0.3549&1.0717}, \ F=\pmatrix{5.4257&6.8409\\ 6.8409&11.5743}. $$ Numerically we have $f(X)+f(Y)-2f(\tfrac{X+Y}{2})=-0.2355<0$. Therefore $f$ is not midpoint convex. In turn, it is not convex. Note that $X,Y,E$ and $F$ above are all positive definite. If you perturb $E=I$ in the above by a sufficiently small symmetric matrix, you may obtain a counterexample in which $X,Y,E,F$ are mutually non-commuting positive definite matrices and $EF$ is non-symmetric and non-triangular.
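The midpoint gap above can be re-checked with a short script. This is only a sketch in plain Python, so that it runs without any libraries: the helpers `matmul` and `expm` (a truncated Taylor series with scaling and squaring) are illustrative stand-ins written for this note, not the code used to find the counterexample.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def expm(A, squarings=6, terms=16):
    """Matrix exponential: Taylor series of A / 2**squarings, then repeated squaring."""
    n = len(A)
    scaled = [[a / 2 ** squarings for a in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[t / k for t in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tt)] for rr, tt in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def f(X, E, F):
    """Frobenius norm of E F exp(X) F E."""
    M = matmul(matmul(matmul(E, F), expm(X)), matmul(F, E))
    return math.sqrt(sum(m * m for row in M for m in row))

# Data copied from the counterexample (entries are rounded as printed there).
E = [[1.0, 0.0], [0.0, 1.0]]
F = [[5.4257, 6.8409], [6.8409, 11.5743]]
X = [[0.044005, 0.0], [0.0, 0.867487]]
Y = [[0.7831, -0.3549], [-0.3549, 1.0717]]
mid = [[(x + y) / 2 for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

gap = f(X, E, F) + f(Y, E, F) - 2 * f(mid, E, F)
print("f(X) + f(Y) - 2 f((X+Y)/2) =", gap)
```

A negative printed value confirms the failure of midpoint convexity for these inputs. Because the matrices are rounded, the last digits may differ slightly from the $-0.2355$ quoted above.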

Intuitively, since the matrix exponential is not operator convex (except in the scalar case), there exist symmetric matrices $X$ and $Y$ such that $S=\exp(X)+\exp(Y)-2\exp(\tfrac{X+Y}{2})$ is indefinite. It follows that if $v$ is an eigenvector corresponding to the most negative eigenvalue of $S$ and $F$ is a symmetric matrix that scales vectors in the direction of $v$ by a sufficiently large factor, then $f(X)+f(Y)-2f(\tfrac{X+Y}{2})$ will be negative when $E=I$. So, to generate a random counterexample, you should first generate a pair $\{X,Y\}$ such that $S$ has a negative eigenvalue, and then search for a matrix $F$ such that $f(X)+f(Y)-2f(\tfrac{X+Y}{2})<0$. Moreover, a sample size of 1000 is too small. In my experience, for numerical experiments concerning matrix function inequalities, reasonable sample sizes are usually between 10,000 and 100,000.
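The first step of this recipe can be illustrated on the pair $X,Y$ from the counterexample by forming the midpoint gap of the exponential itself, $\exp(X)+\exp(Y)-2\exp(\tfrac{X+Y}{2})$, and inspecting its eigenvalues. Again this is only a sketch in plain Python: `matmul` and `expm` are illustrative helpers written for this note, and the closed-form eigenvalue formula is specific to symmetric $2\times 2$ matrices.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def expm(A, squarings=6, terms=16):
    """Matrix exponential: Taylor series of A / 2**squarings, then repeated squaring."""
    n = len(A)
    scaled = [[a / 2 ** squarings for a in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[t / k for t in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tt)] for rr, tt in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

X = [[0.044005, 0.0], [0.0, 0.867487]]
Y = [[0.7831, -0.3549], [-0.3549, 1.0717]]
mid = [[(x + y) / 2 for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

eX, eY, eM = expm(X), expm(Y), expm(mid)
# Midpoint gap of the matrix exponential: exp(X) + exp(Y) - 2 exp((X+Y)/2).
S = [[eX[i][j] + eY[i][j] - 2 * eM[i][j] for j in range(2)] for i in range(2)]

# Eigenvalues of a symmetric 2x2 matrix in closed form.
mean = (S[0][0] + S[1][1]) / 2
radius = math.hypot((S[0][0] - S[1][1]) / 2, S[0][1])
lam_min, lam_max = mean - radius, mean + radius
print("eigenvalues of the gap matrix:", lam_min, lam_max)
```

One negative and one positive eigenvalue mean the gap matrix is indefinite for this pair, which is the property the search for $F$ relies on.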

user1551
  • 139,064
  • Thank you so much. I note that when $A$ is symmetric, it can always be diagonalized, so we can construct $A$ in your way. What if $A$ is invertible but not symmetric? Do we still have counterexamples? I have re-run 10k trials in MATLAB with randomized non-symmetric $A \in GL(n)$, and all the Hessian matrices of $f$ are positive definite. – gsoldier Apr 15 '23 at 09:42
  • 1
    @gsoldier One may obtain a counterexample with a non-symmetric $A$ by replacing any positive definite $A$ in a counterexample by the Cholesky factor of $A^TA$. I am not sure how you generated your samples, but your difficulty in obtaining a counterexample suggests that there may be something wrong with your code. – user1551 Apr 15 '23 at 11:30
  • Thanks for the further clarification. It's totally my fault. My initial function is $f(X)=\left\|EF \exp (X) FE\right\|$, where $E,F$ are two real symmetric positive definite matrices. My code follows this formulation; I ran it 100k times and all Hessians ended up positive definite. I made a simplification in my initial question by viewing $A=EF \in GL(n)$. Thanks to your examples, I realized that $A=EF$ should not be viewed merely as an invertible matrix, as it is non-symmetric and not lower triangular. In this case, is $f$ convex, or can we still have counterexamples? Thank you so much!!! – gsoldier Apr 15 '23 at 13:30
  • 1
    @gsoldier The answer is still negative. You may simply take $E=A$ and $F=I$ in my counterexample. You may also further perturb $E$ and $F$ by a small amount to make them non-symmetric and non-triangular. – user1551 Apr 15 '23 at 15:42
0

It may be good to recall that the differential of $X\mapsto e^X$ at $X$ is the linear map $H\mapsto \int_0^1e^{tX}He^{(1-t)X}dt.$
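
As a hedged sketch of how this differential enters the problem (the shorthand $A$, $M$ and $L_X$ is introduced only for this note): write $A=EF$, so that $A^T=FE$ and $f(X)=\|M(X)\|$ with $M(X)=Ae^XA^T$ symmetric positive definite. Setting $L_X(H)=\int_0^1e^{tX}He^{(1-t)X}dt$, the chain rule applied to the Frobenius norm gives
$$ Df(X)[H]=\frac{\langle M(X),\,A\,L_X(H)\,A^T\rangle}{\|M(X)\|}=\frac{\operatorname{tr}\!\left(A^TM(X)A\,L_X(H)\right)}{f(X)}, $$
which is well defined because $f(X)>0$. In the scalar case $A=a$, $L_x(h)=e^xh$ and the formula reduces to $a^2e^xh$, the derivative of $a^2e^x$, as expected.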

  • Do you mean to check the convexity of $f$ by the first-order condition ($f(y) \geqslant f(x)+\nabla f(x)^T(y-x)$ for all $x,y$)? I don't know how to prove this inequality, as the differential of $f$ is pretty complex. – gsoldier Apr 15 '23 at 09:49