18

Hi, I would like to know whether the trace of the inverse of a symmetric positive definite matrix, $\mathrm{trace}(S^{-1})$, is convex.

I know that the trace of a symmetric positive definite matrix $S\in M_{m,m}$ is convex: we can find $B\in M_{n,m}$ such that $S=B^T\times B$, and then write the trace as a sum of scalar quadratic forms, i.e. $\mathrm{trace}(S)=\mathrm{trace}(B^T\times B)=\sum_{j=1}^m b_j^T\times b_j$, where $b_j$ is the $j^{th}$ column of $B$.

For instance, $\mathrm{trace}\left(\left[\begin{array}{cc} 1 & 2 \\ 3 & 4 \end{array}\right] \times \left[\begin{array}{cc} 1 & 3 \\ 2 & 4 \end{array}\right]\right)= \left[\begin{array}{cc} 1 & 2 \end{array}\right]\times \left[\begin{array}{c} 1 \\ 2 \end{array}\right]+ \left[\begin{array}{cc} 3 & 4 \end{array}\right]\times \left[\begin{array}{c} 3 \\ 4 \end{array}\right]=30$.

So I wonder whether $\mathrm{trace}(S^{-1})$ is convex too.

user2987
  • 761
  • So you are asking if the function $S\longmapsto \mbox{trace}\;S^{-1}$ is convex on symmetric positive definite matrices? – Julien Feb 08 '13 at 00:48
  • yes I would like to know if that function is convex – user2987 Feb 08 '13 at 00:51
  • 2
    The fact that the trace of the matrix itself is convex is obvious, because the trace is linear. That stuff about $B^T B$ is irrelevant (and wrong, since you want to look at a function of $S$, not of $B$). – Robert Israel Feb 08 '13 at 01:12
  • Yes, I want to look at the function itself, but remember that a function can be convex on just a specific interval. In this case I guess that this interval is the set of symmetric positive definite matrices. – user2987 Feb 09 '13 at 22:24

2 Answers

32

Yes, it is. Consider $S(t) = A + t B$ where $A$ is symmetric positive definite and $B$ is symmetric. It is enough to show that $$\left.\dfrac{d^2}{d t^2} \text{Tr}(S(t)^{-1})\right|_{t=0} \ge 0.$$ Now, for $|t|$ small enough, $$ S(t)^{-1} = (A (I + t A^{-1} B))^{-1} = A^{-1} - t A^{-1} B A^{-1} + t^2 A^{-1} B A^{-1} B A^{-1} + \ldots$$ so $$ \left. \dfrac{d^2}{d t^2} \text{Tr}(S(t)^{-1}) \right|_{t=0} = 2 \text{Tr}(A^{-1} B A^{-1} B A^{-1}).$$ But $A^{-1} B A^{-1} B A^{-1} = C A^{-1} C^T$ where $C = A^{-1} B$ (using the symmetry of $A$ and $B$) and $A^{-1}$ is positive definite, so $C A^{-1} C^T$ is positive semidefinite, and therefore $\text{Tr}(CA^{-1} C^T) \ge 0$.
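As a numerical sanity check (not a proof), the closed form $2\,\text{Tr}(A^{-1} B A^{-1} B A^{-1})$ can be compared against a central finite difference of $t \mapsto \text{Tr}((A+tB)^{-1})$ at $t=0$. This is only a sketch: the matrix size, the random seed, the step `h`, and the helper name `f` are arbitrary choices made for illustration and are not part of the answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 4                           # arbitrary matrix size

# A: random symmetric positive definite matrix; B: random symmetric direction.
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
B = rng.standard_normal((n, n))
B = (B + B.T) / 2

def f(S):
    """Trace of the inverse of S (hypothetical helper name)."""
    return np.trace(np.linalg.inv(S))

A_inv = np.linalg.inv(A)

# Closed-form second derivative at t = 0: 2 Tr(A^{-1} B A^{-1} B A^{-1}).
second_exact = 2 * np.trace(A_inv @ B @ A_inv @ B @ A_inv)

# Central finite-difference estimate of the same second derivative.
h = 1e-4
second_fd = (f(A + h * B) - 2 * f(A) + f(A - h * B)) / h**2

print(second_exact, second_fd)
```

The two printed numbers should agree to several digits, and the closed-form value should be nonnegative for any symmetric `B`.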

Robert Israel
  • 448,999
  • So can we find a scalar quadratic form for $\mathrm{trace}(S^{-1})$, as I did for $\mathrm{trace}(S)$? – user2987 Feb 11 '13 at 21:17
  • What scalar quadratic form? – Robert Israel Feb 11 '13 at 21:21
  • Like the one that I wrote for $\mathrm{trace}(S)=\mathrm{trace}(B^T\times B)=\sum_{j=1}^m b_j^T\times b_j$, where $b_j$ is the $j^{th}$ column of $B$. Here $b_j^T\times b_j$ is a scalar quadratic form with $Q=I_{m\times m}$. – user2987 Feb 12 '13 at 00:55
  • This is because $S$ is a symmetric positive definite matrix, so there must exist $B$ s.t. $S=B^TB$. – user2987 Feb 12 '13 at 00:56
  • I just found out that what I wrote makes no sense, since we cannot find a quadratic form for $f\circ g:B\longrightarrow \mathrm{trace}((B^TB)^{-1})$, which is not convex, where $f: S\longrightarrow \mathrm{trace}(S^{-1})$ and $g: B\longrightarrow B^TB$.

    Thanks!

    – user2987 Feb 12 '13 at 01:25
  • 1
    There's something I don't get in the proof: why are you checking positivity of $\nabla^2 Tr(S(t)^{-1})$ for $t = 0$ only? Shouldn't it hold $\forall t$ for the result to hold? –  Nov 18 '13 at 17:52
  • 7
    A twice differentiable function $f$ on an open subset $U$ of a vector space $V$ is convex iff $$ \left.\dfrac{d^2}{dt^2} f(u + tv)\right|_{t=0} \ge 0$$ for all $u \in U$, $v \in V$. The point is that other values of $t$ (for which $u+tv \in U$) correspond to other choices for $u$ at $t=0$. – Robert Israel Nov 19 '13 at 04:31
  • Can every symmetric matrix be written as $S = A+tB$? – good2know Nov 21 '16 at 09:56
  • Trivially, yes. – Robert Israel Nov 21 '16 at 16:58
  • For others who are confused, the inverse of (A + tB) is expanded using a special case of the Woodbury matrix identity for inverse of a sum. The "..." in this answer refers to the remaining terms of the recursive expansion. When t = 0, these terms are 0. Perhaps this is obvious to some people, but I wasn't aware the identity even existed until I did some digging. – Nerdizzle Oct 10 '21 at 23:34
  • Why is the second derivative evaluated only at 0 sufficient for convexity? I still don't understand. I don't buy the "other choices for u" argument. – Pavel Komarov Jan 24 '24 at 02:34
  • @PavelKomarov Why don't you buy it? – Robert Israel Jan 24 '24 at 02:46
  • Give a source with a proof. – Pavel Komarov Jan 24 '24 at 02:48
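
The "other choices for $u$" argument from the comments above can be spelled out as a short derivation. This is a sketch under the assumptions that $f$ is twice differentiable and that $U$ is open and convex; the symbols $g$, $t_0$, $u_0$ and $s$ are introduced here for illustration only. Fix $u \in U$ and $v \in V$, put $g(t) = f(u+tv)$, and take any $t_0$ with $u_0 = u + t_0 v \in U$. Then

$$ g''(t_0) = \left.\dfrac{d^2}{ds^2} f\big(u + (t_0+s)v\big)\right|_{s=0} = \left.\dfrac{d^2}{ds^2} f(u_0 + s v)\right|_{s=0} \ge 0, $$

because the hypothesis at $t=0$ is assumed for every base point in $U$, in particular for $u_0$. So $g'' \ge 0$ on the whole interval where $g$ is defined, each restriction of $f$ to a line is convex, and a function that is convex along every line in a convex domain is convex.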
4

We can prove that $f(X) = Tr\{X^{-1}\}$ is a convex function by considering an arbitrary line, $X = Z + tV$, where $V \in S^n$ and $Z \in S^n_{++}$. Let $g(t) = f(Z+tV)$ for all $t \in \mathbb R$ such that $Z+tV \in S^n_{++}$. If $g$ is convex for all $Z \in S^n_{++}$ and all $V \in S^n$, then $f$ is convex. Consider \begin{align} g(t) &= Tr\{X^{-1}\} = Tr\{(Z+tV)^{-1}\}\\ & = Tr\{Z^{-1/2}(I+tZ^{-1/2}VZ^{-1/2})^{-1}Z^{-1/2}\} \\ & = Tr\{Z^{-1}(I+tZ^{-1/2}VZ^{-1/2})^{-1}\} \qquad \qquad \because \; Tr\{BA\} = Tr\{AB\} \end{align} Since $(Z^{-1/2}VZ^{-1/2})^T = Z^{-1/2}VZ^{-1/2}$ as $Z,V \in S^n$, we can write $Z^{-1/2}VZ^{-1/2} = Q\Sigma Q^T$, where $Q$ is an orthogonal matrix and $\Sigma$ is a diagonal matrix with the eigenvalues $\lambda_i$ of $Z^{-1/2}VZ^{-1/2}$ as its diagonal entries. Hence,

\begin{align} g(t) & = Tr\{Z^{-1}(I+tQ\Sigma Q^T)^{-1}\} \\ & = Tr\{Z^{-1}(Q(I+t\Sigma) Q^{T})^{-1}\} \\ & = Tr\{Z^{-1}Q(I+t\Sigma)^{-1}Q^{T}\} \quad \because \; (ABC)^{-1} = C^{-1}B^{-1}A^{-1} \ \text{and} \ Q^{-1} = Q^T \\ & = Tr\{Q^{T}Z^{-1}Q(I+t\Sigma)^{-1}\} \quad \because \; Tr\{BA\} = Tr\{AB\} \\ & = \sum_{i=1}^n (Q^TZ^{-1}Q)_{ii} \frac{1}{1+t\lambda_i}\\ \implies g''(t) &= \sum_{i=1}^n (Q^TZ^{-1}Q)_{ii} \frac{2\lambda_i^2}{(1+t\lambda_i)^3} \end{align} On the domain of $g$ we have $1+t\lambda_i>0$, because $I+t\Sigma = Q^TZ^{-1/2}(Z+tV)Z^{-1/2}Q$ is positive definite. Moreover, $Q^TZ^{-1}Q$ is symmetric positive definite ($\because Z^{-1} \in S^n_{++}$), so its diagonal entries are positive. Hence $g''(t)\ge 0$ on the whole domain (with equality only when every $\lambda_i = 0$, i.e. $V = 0$), so $g$ is convex. This shows that $f$ is convex.
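
The eigenvalue formula for $g(t)$ can be checked numerically. The following is a minimal sketch, assuming NumPy and randomly generated $Z$ and $V$; the seed, the matrix size, the evaluation point `t`, and the helper names `g_direct`, `g_formula` and `g_second` are illustrative assumptions, not part of the answer.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
n = 4                           # arbitrary matrix size

M = rng.standard_normal((n, n))
Z = M @ M.T + n * np.eye(n)     # symmetric positive definite
V = rng.standard_normal((n, n))
V = (V + V.T) / 2               # symmetric, typically indefinite

# Z^{-1/2} built from the eigendecomposition of Z.
w, U = np.linalg.eigh(Z)
Z_inv_sqrt = U @ np.diag(w ** -0.5) @ U.T

# Eigenvalues lam and orthogonal Q of Z^{-1/2} V Z^{-1/2}.
lam, Q = np.linalg.eigh(Z_inv_sqrt @ V @ Z_inv_sqrt)

# Weights (Q^T Z^{-1} Q)_ii, which should all be positive.
d = np.diag(Q.T @ np.linalg.inv(Z) @ Q)

def g_direct(t):
    """Tr((Z + tV)^{-1}) computed directly."""
    return np.trace(np.linalg.inv(Z + t * V))

def g_formula(t):
    """Sum of d_i / (1 + t * lam_i), the closed form from the derivation."""
    return np.sum(d / (1 + t * lam))

def g_second(t):
    """Second derivative: sum of d_i * 2 lam_i^2 / (1 + t lam_i)^3."""
    return np.sum(d * 2 * lam**2 / (1 + t * lam) ** 3)

# Pick t inside the domain, i.e. with 1 + t * lam_i > 0 for every i.
t = 0.5 / np.max(np.abs(lam))
print(g_direct(t), g_formula(t), g_second(t))
```

The first two printed values should coincide up to rounding, and the third should be nonnegative even though some $\lambda_i$ are typically negative.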

  • You went a bit too quickly when assuming $\lambda_i> 0$. Indeed, the $\lambda_i$ are the eigenvalues of the matrix $A=Z^{-1/2}VZ^{-1/2}$, which is symmetric because $Z^{-1/2}$ and $V$ are, but which is not necessarily positive definite, because the condition $Z+tV\succ 0$ does not require $V\succ 0$. However, one can easily show that $I+tA$ is positive definite, which implies $1+t\lambda_i>0$. We can then conclude by noticing that the function $g_i:t\mapsto \frac{1}{1+t\lambda_i}$ is convex on the interval $\{t \in \mathbb R : 1+t\lambda_i>0\}$. – DodoDuQuercy Dec 03 '21 at 15:26