
Let $\vec{a}=(a_1,\dotsc,a_n)\in\mathbb{R}^n$ be a vector such that $$ 0 \le a_i \le \frac{1}{2} \text{ for each } 1\le i\le n\enspace. $$ Consider the function $f ~:~ \mathbb{R}^+ \to\mathbb{R}$: $$ f(x) = \frac{1}{x}\ln\sum_{i=1}^n e^{a_i x^2} \enspace. $$ (I only care about its restriction to the positive reals, so all mentions of strict convexity in the following are intended to be on the positive reals.)

We can see this function as the product of two strictly convex functions $g(x)$ and $h(x)$, with:

  • $g(x) = \frac{1}{x}$
  • $h(x) = \ln\displaystyle\sum_{i=1}^n e^{a_i x^2}$

The function $h(x)$ is strictly convex because we can see it as the composition of two strictly convex functions $z(y)$ and $w(x)$ (i.e., $h(x)= z(w(x))$), with $z(y)$ being non-decreasing:

  • $z(y) = \ln\displaystyle\sum_{i=1}^n e^{a_i y}$, which is strictly convex because it is the slice of the $n$-dimensional log-sum-exp function along the direction $(\vec{0} + \vec{a}y)$, which is strictly convex as shown in this answer.
  • $w(x) = x^2$

(I'm assuming the strictness of the convexity is preserved when composing two strictly convex functions with the "outer" one being non-decreasing, but I actually do not know if it is true).

I understand that in general the product of two strictly convex functions may not be convex, but I suspect (based on some simulations trying various choices of the vector $\vec{a}$) that $f(x)$ is convex on the positive reals.
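A check of this kind can be sketched in Python as follows. This is a minimal sketch assuming NumPy; the helper names `f` and `min_second_difference`, the grid, and the random choices of $\vec{a}$ are illustrative and not the simulations actually run. It evaluates $f$ on an equally spaced grid and looks at second differences, which are non-negative for a convex function (up to rounding error).

```python
import numpy as np

def f(x, a):
    """f(x) = (1/x) * ln(sum_i exp(a_i * x^2)) for x > 0, computed stably."""
    x = np.asarray(x, dtype=float)
    e = np.outer(x**2, a)             # exponents a_i * x^2, shape (len(x), n)
    m = e.max(axis=1)                 # subtract the row maximum before exponentiating
    lse = m + np.log(np.exp(e - m[:, None]).sum(axis=1))
    return lse / x

def min_second_difference(a, lo=0.05, hi=10.0, num=2001):
    """Smallest second difference of f on an equally spaced grid in [lo, hi]."""
    xs = np.linspace(lo, hi, num)
    ys = f(xs, a)
    d2 = ys[:-2] - 2.0 * ys[1:-1] + ys[2:]
    return d2.min()

rng = np.random.default_rng(0)        # fixed seed, arbitrary choice
worst = np.inf
for _ in range(200):
    n = int(rng.integers(1, 8))       # dimension between 1 and 7 (illustrative)
    a = rng.uniform(0.0, 0.5, size=n) # entries in [0, 1/2] as in the question
    worst = min(worst, min_second_difference(a))

# A clearly negative value would be a counterexample to convexity;
# values around -1e-13 are rounding noise (e.g. n = 1 gives f(x) = a_1 x, which is affine).
print(worst)
```

A grid check like this can only fail to find a counterexample; it does not prove convexity.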

Could anyone please give some ideas on how to prove that the function is convex (or tell me that it isn't convex)?

The second derivative of $f(x)$ doesn't immediately seem non-negative on the positive reals, but someone may have more intuition and more technical ability than me.

In case the function $f$ is not convex, this question may suggest that $f$ would have at most two local minima because it is the product of two strictly convex functions (I'm saying "may" because 1) the question mentions strongly convex functions, but it is not clear whether the Author meant strictly convex functions; and 2) there is no answer and the comments are a bit vague).

Can anyone please confirm that indeed $f$ would have at most two local minima (and if possible provide a reference)?

Thank you.

Matteo
  • That log-sum-exp is convex follows from the convex conjugate of a convex function. Have you tried taking the conjugate of $f$ for $n=2$? The function $z$ is not convex if all $a_i$ are the same btw. – LinAlg Mar 08 '18 at 03:05
  • @LinAlg: I haven't tried, but I will, thank you for the suggestion. My understanding was that if all the $a_i$ are the same, then the function $z$ is affine, so it is still convex but not strictly (see the linked answer). Am I wrong? – Matteo Mar 08 '18 at 13:03
  • @LinAlg : actually, I have no clue how to proceed, in the sense that I don't know how studying the convex conjugate would help me. What property of it is used to prove the convexity of log-sum-exp? – Matteo Mar 08 '18 at 14:12
  • indeed, not strictly convex. The property you use is that a conjugate is always convex. – LinAlg Mar 08 '18 at 14:20
  • @LinAlg, ok, but that's true independently of whether the original function is convex or not. I'm missing a step here. Do you have a reference for the proof that log-sum-exp is convex because its convex conjugate is? – Matteo Mar 08 '18 at 14:22
  • The correct argument is that a convex conjugate is convex. So you would have to derive the double conjugate and see if it is the same as the function you started with. – LinAlg Mar 08 '18 at 14:40
  • @LinAlg, ok, I got it now. Thank you. By any chance, do you have any idea about the second part of my question (two local minima?). Thanks – Matteo Mar 08 '18 at 14:50
  • The comments in that question give a clear counterexample that has infinitely many optima. – LinAlg Mar 08 '18 at 15:24

1 Answer


As you already know, the function $h(x) = \log\left(\sum_{i=1}^n e^{a_i x} \right)$ is convex on $\mathbb{R}$. Another well-known theorem concerns the perspective of a convex function: it implies that the following function is jointly convex in $(x,t)$ on the set $\mathbb{R} \times \mathbb{R}_{++}$, and hence also on $\mathbb{R}_{++}^2$: $$ q(x, t) = t~h(x/t) = t \log\left( \sum_{i=1}^n e^{a_i x/t} \right), $$ where $\mathbb{R}_{++}$ denotes the positive reals. Now, for $x, t > 0$, let us compute: $$ \begin{aligned} \frac{\partial q}{\partial t}(x, t) &= \log\left( \sum_{i=1}^n e^{a_i x/t} \right) + t \frac{\sum_{i=1}^n \left(e^{a_i x/t} \cdot (-a_i x/t^2) \right) }{\sum_{i=1}^n e^{a_i x/t}} \\ &= \log\left( \sum_{i=1}^n e^{a_i x/t} \right) - \frac{x}{t} \underbrace{\frac{\sum_{i=1}^n a_i e^{a_i x/t}}{\sum_{i=1}^n e^{a_i x/t}}}_{\text{Weighted avg. of $a_i$}} \\ &\geq \max_{i=1,\dots,n} \{a_i x/t \} - \frac{x}{t} \max_{i=1,\dots,n} \{ a_i \} \\ &= \frac{x}{t} \max_{i=1,\dots,n} \{a_i\} - \frac{x}{t} \max_{i=1,\dots,n} \{ a_i \} &\leftarrow \frac{x}{t} > 0 \\ &= 0. \end{aligned} $$ The inequality follows from two facts:

  • LogSumExp is greater than or equal to the maximum.
  • A weighted average is less than or equal to the maximum. Thus, the negative average is greater than or equal to the negative maximum.

Thus, $q$ is nondecreasing in $t$ on $\mathbb{R}_{++}^2$. Since $\phi(y) = 1/y$ is convex and positive on $y>0$, and $q$ is jointly convex and nondecreasing in its second argument, the following composition is jointly convex in $(x, y)$ on $\mathbb{R}_{++}^2$: $$ g(x, y) = q(x, \phi(y)) = \frac{1}{y} \log\left( \sum_{i=1}^n e^{a_i xy} \right). $$ Finally, restricting $g$ to the line $y = x$ (a composition with a linear map, which preserves convexity) leads to the convexity on $x > 0$ of: $$ f(x) = g(x, x) = \frac{1}{x} \log\left( \sum_{i=1}^n e^{a_i x^2} \right). $$ Note that the proof does not use anywhere any fact about the product of convex functions.
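The monotonicity step can be sanity-checked numerically. The following is a minimal sketch assuming NumPy; the helper names `q` and `dq_dt` and the sampling ranges are illustrative choices. It evaluates the closed-form expression for $\partial q/\partial t$ derived above at random points with $x, t > 0$ and records the smallest value found.

```python
import numpy as np

def q(x, t, a):
    """Perspective q(x, t) = t * log(sum_i exp(a_i * x / t)) for t > 0."""
    e = a * (x / t)
    m = e.max()                       # stabilize the log-sum-exp
    return t * (m + np.log(np.exp(e - m).sum()))

def dq_dt(x, t, a):
    """Closed form from the derivation: log-sum-exp minus (x/t) times
    the softmax-weighted average of the a_i."""
    e = a * (x / t)
    m = e.max()
    w = np.exp(e - m)
    lse = m + np.log(w.sum())
    w = w / w.sum()                   # softmax weights
    return lse - (x / t) * np.dot(w, a)

rng = np.random.default_rng(1)        # fixed seed, arbitrary choice
smallest = np.inf
for _ in range(1000):
    n = int(rng.integers(1, 8))
    a = rng.uniform(0.0, 0.5, size=n)
    x, t = rng.uniform(0.01, 20.0, size=2)
    smallest = min(smallest, dq_dt(x, t, a))

# Expected to be non-negative up to rounding error.
print(smallest)
```

As a side remark, writing $w_i$ for the softmax weights, $\log w_i = a_i x/t - \log\sum_j e^{a_j x/t}$, so the expression equals $-\sum_i w_i \log w_i$, the entropy of the weights, which gives another way to see that it is non-negative.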

  • Thank you @Alex, I'll check and let you know. By the way, this back and forth with you has been incredibly educational for me, so thank you for teaching me new things. – Matteo Mar 11 '18 at 15:20
  • everything seems correct to me. I made a small edit ("non-decreasing" instead of "increasing" for $q$). Thanks again, and kudos to your knowledge and skills. – Matteo Mar 11 '18 at 23:25