7

It's well known in statistical mechanics that the following is a convex function of the vector $\theta$: $$ A(\theta) = \log \left( \sum_{i=1}^\infty e^{\theta \cdot f(i)} \right) $$ where $f(i)$ is a vector function of $i$. In the context of statistical mechanics, $A$ is known as the log partition function.

However, all the proofs of the convexity property that I know of rely on the interpretation of this function in terms of probability theory. One defines a probability distribution given by $p_i = e^{\theta\cdot f(i)-A(\theta)}$ and shows (for example) that the second partial derivatives of $A$ form a covariance matrix, which implies that its Hessian must be positive semidefinite.

For the sake of enhancing my understanding, I would like a more direct proof, one that proceeds directly from the mathematical form of $A$ as defined above, without considering a probability distribution. Is there a straightforward way to see that $A$ as defined above is a convex function of $\theta$, independently of its interpretation as a partition function in statistical mechanics?

N. Virgo
  • 7,182

1 Answer

2

Before you add another downvote, note that showing that the function $h_p(x) = \log \sum_{k=1}^p e^{x_k}$ is convex for finite $p$ is straightforward using the Hessian and the Cauchy-Schwarz inequality; see the addendum below (or see https://math.stackexchange.com/a/1190438/27978, https://math.stackexchange.com/a/2089953/27978 for a few examples and https://math.stackexchange.com/a/2418721/27978 for a different proof). This answer addresses the $p \to \infty$ and inner product aspects.

Since you have an inner product, I am assuming that $\theta$ lies in some Hilbert space $\mathbb{H}$.

If the functions $g_k:\mathbb{H} \to \mathbb{R}$, are convex, and the function $h :\mathbb{R}^p \to \mathbb{R}$ is convex and non decreasing in each parameter, then it is straightforward to show that the function $\theta \mapsto h(g_1(\theta),...,g_p(\theta))$ is convex.
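As a sketch of why the composition rule holds, write $g = (g_1,\dots,g_p)$ and take any $a, b \in \mathbb{H}$ and $\lambda \in [0,1]$ (these names are introduced here only for illustration). Convexity of each $g_k$ gives $g_k(\lambda a + (1-\lambda) b) \le \lambda g_k(a) + (1-\lambda) g_k(b)$ componentwise, so monotonicity of $h$ gives the first inequality below, and convexity of $h$ gives the second:

$$ h(g(\lambda a + (1-\lambda) b)) \le h(\lambda g(a) + (1-\lambda) g(b)) \le \lambda\, h(g(a)) + (1-\lambda)\, h(g(b)). $$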

The 'log-sum-exponential' function $h_p(x) = \log \sum_{k=1}^p e^{x_k}$ is convex, since its Hessian is positive semi definite (see the addendum below). It is also non decreasing in each parameter $x_k$, since $\partial h_p / \partial x_k = e^{x_k} / \sum_{j=1}^p e^{x_j} > 0$.
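A numerical spot check is not a proof, but it can make the Hessian claim concrete. The sketch below assumes NumPy and uses the closed form $\mathrm{diag}(w) - w w^T$ with $w_k = e^{x_k}/\sum_j e^{x_j}$, which follows from the quotient rule; the names `logsumexp_hessian` and `w` are introduced here for illustration only.

```python
import numpy as np

def logsumexp_hessian(x):
    """Hessian of h_p(x) = log(sum_k exp(x_k)), obtained from the quotient rule."""
    e = np.exp(x - np.max(x))   # shift for numerical stability; the ratios are unchanged
    w = e / e.sum()             # w_k = e^{x_k} / s(x)
    return np.diag(w) - np.outer(w, w)

rng = np.random.default_rng(0)
min_eigs = []
for _ in range(100):
    x = rng.normal(size=5) * 3.0
    H = logsumexp_hessian(x)
    min_eigs.append(np.linalg.eigvalsh(H).min())

# Smallest eigenvalue seen over all samples; it should be zero up to rounding.
smallest = min(min_eigs)

# The all-ones direction is always in the null space, so the Hessian is
# positive semi definite but never positive definite.
H0 = logsumexp_hessian(np.array([0.3, -1.2, 2.0]))
null_residual = np.abs(H0 @ np.ones(3)).max()

print(smallest >= -1e-12, null_residual < 1e-12)
```

The null direction reflects the identity $h_p(x + c\mathbf{1}) = h_p(x) + c$: shifting every coordinate by the same constant changes $h_p$ only linearly.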

If we let $g_k(\theta) = \langle \theta, f(k) \rangle$, then each $g_k$ is linear and hence convex, so the function $\alpha_p(\theta) = h_p(g_1(\theta),...,g_p(\theta))$ is convex.

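If the functions $\alpha_p$ are convex, and $\alpha(\theta) = \lim_{p \to \infty} \alpha_p(\theta)$ pointwise, then the function $\alpha$ is convex: the inequality $\alpha_p(\lambda a + (1-\lambda) b) \le \lambda \alpha_p(a) + (1-\lambda) \alpha_p(b)$ holds for every $p$ and is preserved in the limit.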
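The truncation argument can be illustrated numerically. This is only a sketch under stated assumptions: the feature map `f` below is hypothetical, chosen so that the series converges at the test points, and `a`, `b` are arbitrary points in $\mathbb{R}^2$. It checks midpoint convexity of several truncations $\alpha_p$ and that $\alpha_p$ does not decrease as $p$ grows.

```python
import numpy as np

def f(i):
    # Hypothetical feature map, chosen only so the series converges
    # when both components of theta are positive.
    return np.array([-float(i), -float(i) ** 2])

def alpha_p(theta, p):
    """Truncated log partition function: log sum_{i=1}^p exp(<theta, f(i)>)."""
    exponents = np.array([theta @ f(i) for i in range(1, p + 1)])
    m = exponents.max()                      # factor out the largest term for stability
    return m + np.log(np.exp(exponents - m).sum())

a = np.array([0.5, 0.1])
b = np.array([2.0, 0.7])
mid = 0.5 * (a + b)
truncations = (1, 5, 50, 500)

# Midpoint convexity gap for each truncation; it should be >= 0 up to rounding.
gaps = [0.5 * (alpha_p(a, p) + alpha_p(b, p)) - alpha_p(mid, p) for p in truncations]

# Adding positive terms can only increase the sum, so alpha_p is nondecreasing in p.
values = [alpha_p(a, p) for p in truncations]

print(gaps)
print(values)
```

For $p = 1$ the truncation is linear in $\theta$, so its gap is zero up to rounding; the larger truncations give a strictly positive gap for this particular choice of `f`.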

Combining the above shows that the function $A$ is convex.

Addendum: The function $A$ is not necessarily strictly convex. For instance, if the sum reduces to the single term $i = 1$ with $f(1) = \phi$ for some fixed $\phi$, then $A$ has the form $A(\theta) = \log(e^{\langle \theta, \phi \rangle} ) = \langle \theta, \phi \rangle$, which is affine and hence not strictly convex.

Addendum for those downvoters and those who have lots of time to nitpick:

To prove that $h_p$ is convex, let $s(x) = \sum_{k=1}^p e^{x_k}$ and note that $h_p''(x) = {1 \over s^2(x)} (s''(x)s(x)-s'(x)s'(x)^T)$. It is sufficient to show that $s(x)\, v^T s''(x) v \ge (v^T s'(x))^2$ for all $x$ and all directions $v$, or equivalently, that $\sum_k e^{x_k} \sum_k v_k^2 e^{x_k} \ge (\sum_k v_k e^{x_k})^2$.

Cauchy-Schwarz gives $(\sum_k v_k e^{x_k})^2 = (\sum_k v_k e^{x_k \over 2} e^{x_k \over 2} )^2 \le \sum_k v_k^2 e^{x_k} \sum_k e^{x_k}$.
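The Cauchy-Schwarz step can also be spot-checked numerically. This is a minimal sketch assuming NumPy, which evaluates the quadratic form at random points `x` and in random test directions `v`; it illustrates the inequality and does not replace the proof.

```python
import numpy as np

rng = np.random.default_rng(1)
worst = np.inf
for _ in range(1000):
    x = rng.normal(size=6) * 2.0   # point at which the Hessian is evaluated
    v = rng.normal(size=6)         # direction in which the quadratic form is tested
    e = np.exp(x)
    lhs = e.sum() * (v**2 * e).sum()   # s(x) * v^T s''(x) v
    rhs = (v * e).sum() ** 2           # (v^T s'(x))^2
    # Relative slack of the inequality; Cauchy-Schwarz says it is never negative.
    worst = min(worst, (lhs - rhs) / lhs)

print(worst >= -1e-12)
```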

copper.hat
  • 172,524
  • "It is straightforward to show that the Hessian is positive semi definite" -- How? – confused00 Nov 04 '16 at 22:12
  • @confused00: Suppose $f$ is convex, then define $f_{x,h}(t) = f(x+th)$. Then since $f$ is convex, we have $f''_{x,h}(0) \ge 0$. Since $f''_{x,h}(0) = \langle h, {\partial^2 f(x) \over \partial x^2} h \rangle$, we have the desired result. – copper.hat Nov 04 '16 at 22:45
  • Why the downvote? – copper.hat Nov 04 '16 at 23:01
  • Wasn't me. Thanks for your answer! I'll give you an upvote to counteract the downvote because I found your answer useful. – confused00 Nov 05 '16 at 11:21
  • @confused00: Thanks :-). The points don't matter, but I am curious what bothered the downvoter. – copper.hat Nov 05 '16 at 16:35
  • @copper.hat The composition rule says: "$h$ is convex and nondecreasing, $g_i$ is convex" $\Rightarrow$ "$f(x)= h(g_1(x),g_2(x),\cdots,g_n(x))$ is convex".

    But here, the function $h(\bullet)$ is the $\log$ function, which is not convex! So, how can we apply the composition rule here?

    – Anver Hisham Mar 23 '18 at 19:09
  • @AnverHisham: No, $h_p$ is the log of sum function which is convex. – copper.hat Mar 23 '18 at 19:57
  • Why another downvote? Something wrong after 5 years? – copper.hat Feb 21 '20 at 17:45
  • I really am curious, why the new downvote? Please elaborate. – copper.hat Feb 21 '20 at 19:27
  • Log of sum is not convex. Log of sum of exponential is convex though, but then you need to prove it using some other techniques. – Pew Oct 07 '21 at 23:39
  • Also you need to show that the Hessian is positive semi definite to show the log of sum of exponential is convex. But if you assume f is convex, then of course its Hessian is positive semi definite, but then it doesn't produce anything new. – Pew Oct 08 '21 at 00:12
  • @Pew I think you (and the downvoters) are missing the point of the OP's question. As I mentioned above, it is very straightforward to show that the Hessian of $h_p$ is positive semi definite, that is not the point. The two aspects addressed above are the $p \to \infty$ and the inner product. – copper.hat Oct 08 '21 at 00:28
  • No, I think people downvoted your answer because you keep on saying "it is straightforward". How is it straightforward? Even in the answers you just added, they all need some tricks, for example reformulating the Hessian in a clever way to apply the Cauchy-Schwarz inequality, and that is far from being straightforward. – Pew Oct 08 '21 at 00:58
  • @Pew In the context of statistical mechanics, I can assure you it is straightforward. – copper.hat Oct 08 '21 at 01:17
  • @copper.hat Then I am curious to know a straightforward proof from statistical mechanics – Pew Oct 08 '21 at 01:24
  • @Pew I am tired of nitpicking on this site. – copper.hat Oct 08 '21 at 01:24