
I want to solve the following problem but I can't put all the tools together properly:

*[The problem statement was an image. From the discussion below: given measurable functions $0 \le f_j \le K$ on a probability space, show that one can choose $g_n \in \text{conv}(f_j : j \ge n)$ converging in $L^1$.]*


My attempt:

We consider the sets $$A_n \equiv \{E\left(1-\exp(-g) \right) : g \in \text{conv}(f_j : j \ge n) \}$$

and define $s_n \equiv \sup(A_n)$. Since $g \ge 0$ for all $g \in \text{conv}(f_j : j \ge n)$, we have $1-\exp(-g) \in [0,1]$, so $\{s_n\}_{n \in \mathbb{N}}$ is a bounded, decreasing sequence (decreasing because $A_{n+1} \subseteq A_n$) and hence has a limit $s = \inf_n s_n$.

By definition of the supremum $s_n$, there exist $g_n \in \text{conv}(f_j : j \ge n)$ such that $$s_n - \frac{1}{n} \leq E(1- \exp(-g_n)) \leq s_n $$

The idea is to show that $g_n$ converges in $L^1$, which would follow if it were Cauchy in $L^1$, or even if it converged in probability (since $|g_n| \leq K$, we have a uniformly integrable family, so convergence in probability implies $L^1$ convergence). I do not know how to do this. Any ideas?

Some of my ideas (maybe useless): we clearly have $E(1-\exp(-g_n)) \rightarrow s$, and if we could show that $1-\exp(-g_n)$ is Cauchy in measure (or $L^1$), then $1-\exp(-g_n)$ would converge to a limit $X$ in probability, which would imply that $g_n$ converges to $-\log(1-X)$ in probability by the continuous mapping theorem, and then we would be done.
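
To make the construction concrete, here is a minimal numerical sketch, assuming a finite sample space and randomly generated $f_j$ with values in $[0, K]$; the functions `fs`, the solver, and all parameters below are illustrative assumptions, not part of the problem. It approximates $s_n = \sup A_n$ by exponentiated-gradient ascent on the simplex of convex weights (the objective is concave in the weights):

```python
import numpy as np

rng = np.random.default_rng(0)
d, J, K = 50, 30, 2.0              # sample-space size, number of f_j, bound K
fs = K * rng.random((J, d))        # row j is a hypothetical f_j with values in [0, K]

def sup_An(n, steps=2000, eta=0.5):
    """Approximate s_n = sup E[1 - exp(-g)] over g in conv(f_j : j >= n),
    via exponentiated-gradient ascent on the simplex of convex weights."""
    tail = fs[n:]
    w = np.ones(len(tail)) / len(tail)
    for _ in range(steps):
        g = w @ tail                              # current convex combination
        grad = (tail * np.exp(-g)).mean(axis=1)   # d E[1 - e^{-g}] / d w_j
        w *= np.exp(eta * grad)
        w /= w.sum()                              # stay on the simplex
    return (1 - np.exp(-(w @ tail))).mean()

print([round(sup_An(n), 4) for n in range(0, 25, 5)])
# s_n should be (weakly) decreasing and bounded by 1, up to solver error
```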

  • Why are you considering $1-\exp(-g)$? What material/results directly preceded this problem? – Michael Aug 18 '20 at 16:52
  • There was a hint given on the page following the question. The only thing written in it was to consider the supremum of the $A_n$ sets. Nothing else precedes or follows the question – qp212223 Aug 18 '20 at 16:59
  • Is this a book? A book usually has some material that relates to the problems, e.g. Hahn-Banach variants or some type of convergence result. – Michael Aug 18 '20 at 17:01
  • No. It's from a practice exam so I don't have any idea why this hint was given. I don't know how to apply the Hahn-Banach theorem (or variants of it) here because I'm not familiar with it at all. I was thinking more about the problem and I'm fairly sure it would suffice to show that we could choose the sequence $g_n$ to be monotone (since $1- \exp(-x)$ is also a monotone function). Is this possible? – qp212223 Aug 18 '20 at 17:10
  • At a high level, the uniform integrability should give you compactness in the weak topology of $L^1$, which intuitively should guarantee that something converges, and by this result you can pass to convex combinations to get strong convergence. There are some gaps to be filled in because, for example, compactness in the weak topology is not the same as sequential compactness. – Nate Eldredge Aug 18 '20 at 17:52
  • There should be a more elementary proof in this specific case, but I don't see it offhand. – Nate Eldredge Aug 18 '20 at 17:52
  • Thanks for the link. My math background is honestly too weak to understand what's going on with Hahn-Banach so I'll have to read up to understand how to fill in those gaps (I'm from a statistics background and my knowledge of arbitrary functional analysis things is weak). It is perhaps better for me to focus on the more elementary proof for now – qp212223 Aug 18 '20 at 18:11
  • It would be easier if the $f_i$ were random variables, and if they were pairwise uncorrelated: you could use the Markov/Chebyshev inequality to show $\frac{1}{n}\sum_{i=n}^{2n-1} f_i$ closely approaches $m_n := \frac{1}{n}\sum_{i=n}^{2n-1}E[f_i]$, and use Bolzano-Weierstrass for existence of a subsequence $n_j$ for which $m_{n_j}\rightarrow m \in [0,K]$. This would prove convergence to a constant (a numerical sketch of this idea appears after these comments). – Michael Aug 18 '20 at 18:11
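
A quick numerical illustration of that last comment's idea, assuming i.i.d. (hence pairwise uncorrelated) $f_i$ uniform on $[0, K]$; this toy model and its parameters are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)
K = 2.0
f = K * rng.random(200_000)        # one draw of hypothetical i.i.d. f_i in [0, K]

for n in [10, 100, 10_000]:
    block_avg = f[n:2 * n].mean()  # (1/n) * sum_{i=n}^{2n-1} f_i
    m_n = K / 2                    # = (1/n) * sum E[f_i] for this toy model
    print(n, abs(block_avg - m_n)) # deviation shrinks roughly like 1/sqrt(n)
```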

1 Answer


The crux of this question is that if there are multiple "almost" minimizers of a strongly convex function, they cannot be too far from each other. (Think of $x^2$: everything that almost minimizes $x^2$ must be close to the origin.) Note that I will be working with minimizing $e^{-g}$ as opposed to maximizing $1 - e^{-g}$; I don't fully get why they gave that special form, since $e^{-g}$ would also have been bounded for the functions they gave.

More rigorously, let's note the following:

Let $f$ be a strongly convex function with parameter $\mu$. Then, we have that:

$$f(\alpha x + (1 - \alpha)y) \leq \alpha f(x) + (1 - \alpha) f(y) - \frac{\alpha(1-\alpha)\mu}{2} |x - y|^2 $$

which is an equivalent definition of strong convexity.
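
As a sanity check, here is a small randomized test of this inequality in the specific case used below, $f(x) = e^{-x}$ on $[0, K]$ with $\mu = e^{-K}$; the constants and sample counts are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
K = 2.0
mu = np.exp(-K)                    # claimed strong-convexity parameter of e^{-x} on [0, K]
f = lambda t: np.exp(-t)

# Randomized check of the strong-convexity inequality for f(x) = e^{-x} on [0, K]
x, y = K * rng.random(100_000), K * rng.random(100_000)
a = rng.random(100_000)
lhs = f(a * x + (1 - a) * y)
rhs = a * f(x) + (1 - a) * f(y) - 0.5 * a * (1 - a) * mu * (x - y) ** 2
print(bool((lhs <= rhs + 1e-12).all()))   # expect True
```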

Note that $e^{-x}$ is strongly convex with parameter $\mu = e^{-K}$ on the interval $[0, K]$, since $\frac{d^2}{dx^2}e^{-x} = e^{-x} \geq e^{-K}$ there. So we can further deduce the following:

Let $A_n = \{E[e^{-g}] : g \in \text{conv}(f_j : j \ge n) \}$ and let $s_n = \inf A_n$; again choose $g_n \in \text{conv}(f_j : j \ge n)$ to be within $\frac{1}{n}$ of the infimum, and let $n < m$ WLOG. Note that for us $s_n$ is an increasing sequence; denote its limit by $s$. Then we must have

$$E [e^{-(\alpha g_n + (1 - \alpha) g_m)}] \leq \alpha \left(s_n + \frac{1}{n}\right) + (1 - \alpha) \left(s_m + \frac{1}{m}\right) - C E[|g_n - g_m|^2]$$

for any fixed $\alpha \in (0,1)$, where $C = \frac{\alpha(1-\alpha)e^{-K}}{2}$ comes from taking expectations in the pointwise strong-convexity inequality.

Now, note that $\alpha g_n + (1 - \alpha) g_m$ is also in $\text{conv}(f_j : j \ge n)$ (a convex combination of convex combinations of the tail), so the LHS is at least $s_n$ by definition. So now we have

$$s_n \leq \alpha s_n + (1-\alpha) s_m + \frac{1}{n} - C E[|g_n - g_m|^2]$$

(using $\frac{\alpha}{n} + \frac{1-\alpha}{m} \leq \frac{1}{n}$, since $m > n$).

Let $\epsilon > 0$ be arbitrary and let $n, m$ be chosen large enough that $s_n$ and $s_m$ are both at least $s - \epsilon$ (note that they are at most $s$). Then $s - \epsilon \leq \alpha s + (1-\alpha) s + \frac{1}{n} - C E[|g_n - g_m|^2]$, which rearranges to

$$E[|g_n - g_m|^2] \leq \frac{1}{C} (\epsilon + \frac{1}{n}) $$

which gives us that $g_n$ is Cauchy in $L^2$, and hence also in $L^1$ (on a probability space $\|\cdot\|_1 \leq \|\cdot\|_2$), as desired.
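
To see this argument in action numerically: a minimal sketch, again assuming a finite sample space and random $f_j$ with values in $[0, K]$ (the solver and every parameter are illustrative assumptions). It computes near-minimizers $g_n$ of $E[e^{-g}]$ over $\text{conv}(f_j : j \ge n)$ by mirror descent on the weight simplex, then reports the pairwise gaps $E[|g_n - g_m|^2]$:

```python
import numpy as np

rng = np.random.default_rng(3)
d, J, K = 50, 40, 2.0
fs = K * rng.random((J, d))        # hypothetical f_j with values in [0, K]

def near_minimizer(n, steps=3000, eta=0.5):
    """Near-minimizer g_n of E[exp(-g)] over conv(f_j : j >= n),
    via mirror descent on the simplex of convex weights."""
    tail = fs[n:]
    w = np.ones(len(tail)) / len(tail)
    for _ in range(steps):
        g = w @ tail
        grad = -(tail * np.exp(-g)).mean(axis=1)  # d E[e^{-g}] / d w_j
        w *= np.exp(-eta * grad)
        w /= w.sum()
    return w @ tail

gs = {n: near_minimizer(n) for n in (0, 10, 20, 30)}
for n, m in [(0, 10), (10, 20), (20, 30)]:
    print(n, m, np.mean((gs[n] - gs[m]) ** 2))
# E[|g_n - g_m|^2] should be small once s_n and s_m are close, per the bound above
```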

  • Interesting use of strong convexity. Why not just use $x^2$ instead of $e^{-x}$? – Michael Aug 21 '20 at 03:03
  • I think for this part any strongly convex function works (since the functions are bounded), but I wanted to stick to the same notation because OP said this was the hint given to them. Presumably this way they get some uniform boundedness that could perhaps generalize some argument given here (maybe that is part b or c of this question). (It is also possible that this argument is wrong somewhere :) so in case I need to use some exponential in an edit, I can do only a minor edit.) – E-A Aug 21 '20 at 03:10
  • Minor typo: I think your $\epsilon/2$ at the end should just be $\epsilon$, since you have $$s - \epsilon \leq \alpha s + (1-\alpha)s + 1/n - CE[|g_n-g_m|^2] \implies E[|g_n-g_m|^2]\leq (1/C)(\epsilon + 1/n)$$ – Michael Aug 21 '20 at 18:57
  • Thanks for the catch! – E-A Aug 21 '20 at 19:03
  • Part (b) of the question is the case when $K = \infty$, but instead of $L^1$ convergence we need to prove convergence in probability, so that is indeed the reason why they suggested something that's bounded. I'd guess the idea would be to show $E(|g_n - g_m|\wedge 1) \rightarrow 0$, since this is equivalent to being Cauchy in probability, but I wouldn't know how to do that either based on this strategy, since $e^{-x}$ is no longer strongly convex on $[0, \infty)$. Thanks for this proof! – qp212223 Aug 21 '20 at 22:50
  • No worries; thanks for the bounty! If you post the other question, I will definitely give it a thought, though even this was not easy, and like you said, because of the loss of strong convexity, I don't know how one could push an argument like this further. Also, can you share where you are getting these questions? I am on course staff for a probability course for EEs, so I can certainly be inspired by similar questions while preparing discussion questions. – E-A Aug 21 '20 at 23:00