Subgaussian tail bounds for maximum of subgaussian random variables: concentration around the true expectation?

Question

As in this question, it is not too hard to show that if $X_1,\dots,X_n$ are $\sigma^2$-subgaussian (not even necessarily independent), then $\mathbb{E}[\max_{1\leq i\leq n} X_i] \leq \sqrt{2\sigma^2\log n}$ and $$ \mathbb{P}\!\left\{\max_{1\leq i\leq n} X_i \geq \sqrt{2\sigma^2\log n}+ u \right\} \leq e^{-\frac{u^2}{2\sigma^2}} \tag{$\dagger$} $$ for all $u>0$. Note that $(\dagger)$ involves our upper bound on the expected value, not the expected value itself. The two basically coincide in the case of actual Gaussians, but not in general: for subgaussian r.v.'s, the expected value $\mathbb{E}[\max_{1\leq i\leq n} X_i]$ could be much smaller than $\sqrt{2\sigma^2\log n}$.
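As a quick numerical sanity check of $(\dagger)$, here is a minimal Python sketch. It assumes i.i.d. $N(0,\sigma^2)$ variables as the subgaussian example; the function names and the parameter values are illustrative choices, not part of the argument above.

```python
import math
import random


def max_of_gaussians(n, sigma, rng):
    """Return the maximum of n i.i.d. N(0, sigma^2) draws."""
    return max(rng.gauss(0.0, sigma) for _ in range(n))


def check_dagger(n=200, sigma=1.0, u=1.0, trials=2000, seed=0):
    """Compare empirical quantities with the two bounds stated above.

    Returns (empirical mean of the max, sqrt(2 sigma^2 log n),
             empirical tail at sqrt(2 sigma^2 log n) + u, exp(-u^2 / (2 sigma^2))).
    """
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    sqrt_term = math.sqrt(2 * sigma**2 * math.log(n))
    maxima = [max_of_gaussians(n, sigma, rng) for _ in range(trials)]
    mean_max = sum(maxima) / trials
    # Fraction of trials in which the maximum exceeds the threshold in (dagger).
    tail = sum(m >= sqrt_term + u for m in maxima) / trials
    bound = math.exp(-u**2 / (2 * sigma**2))
    return mean_max, sqrt_term, tail, bound


if __name__ == "__main__":
    mean_max, sqrt_term, tail, bound = check_dagger()
    print(f"empirical E[max] = {mean_max:.3f}, sqrt(2 sigma^2 log n) = {sqrt_term:.3f}")
    print(f"empirical tail   = {tail:.4f}, bound exp(-u^2/(2 sigma^2)) = {bound:.4f}")
```

In this i.i.d. Gaussian setting one would expect the empirical mean to fall below $\sqrt{2\sigma^2\log n}$ and the empirical tail to fall below the right-hand side of $(\dagger)$.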

Question:

Is it possible to replace $\sqrt{2\sigma^2\log n}$ by $\mathbb{E}[\max_{1\leq i\leq n} X_i]$ in $(\dagger)$, either with or without an independence assumption on $(X_i)_i$? (The proof I had in the linked post does not allow that.)
If not, what is a counterexample? (For independent r.v.'s? For arbitrary ones?)

Clement C. · Answer 1 · 2021-02-06T02:03:55.653

Not a full answer, but something (inspired by Section 2.2 of this paper) that shows that the independence assumption is crucial.

Indeed, without independence, let $Y_1,\dots,Y_n$ be i.i.d. $N(0,\sigma^2)$, and let $\xi$ be a Bernoulli$(p)$ random variable independent of the $Y_i$'s, where, say, $p:= 1/6$. Set $$ (X_1,\dots,X_n) = (\xi Y_1,\dots, \xi Y_n) $$ so that the $X_i$'s are still $\sigma^2$-subgaussian, and, since $\max_{1\leq i\leq n} \xi Y_i = \xi \max_{1\leq i\leq n} Y_i$ for $\xi\in\{0,1\}$, we have $$ \mathbb{E}[\max_{1\leq i\leq n} X_i] = p\mathbb{E}[\max_{1\leq i\leq n} Y_i] = (1+o(1)) p \sqrt{2\sigma^2\log n}. \tag{1} $$

However, for $u = p \sqrt{2\sigma^2\log n}$ (and ignoring the $o(1)$ for convenience, so that $u = \mathbb{E}[\max_{1\leq i\leq n} X_i]$), we have $$\begin{align} \mathbb{P}\{\max_{1\leq i\leq n} X_i &\geq \mathbb{E}[\max_{1\leq i\leq n} X_i] + u\} = \mathbb{P}\{\max_{1\leq i\leq n} X_i \geq 2\mathbb{E}[\max_{1\leq i\leq n} X_i]\} \\ &\geq \mathbb{P}\{\max_{1\leq i\leq n} X_i \geq \frac{1}{2}\mathbb{E}[\max_{1\leq i\leq n} Y_i]\} \approx p \tag{2} \end{align}$$ where the inequality holds since $4p < 1$ (so that $2\mathbb{E}[\max_{1\leq i\leq n} X_i] = 2p\,\mathbb{E}[\max_{1\leq i\leq n} Y_i] \leq \frac{1}{2}\mathbb{E}[\max_{1\leq i\leq n} Y_i]$), and the $\approx$ holds since, conditioned on $\xi =1$, we have $\max_{1\leq i\leq n} X_i = \max_{1\leq i\leq n} Y_i$, which concentrates very well around its expectation. But, for any constant $c>0$, $$ e^{-c\frac{u^2}{\sigma^2}} = \frac{1}{n^{2cp^2}} \ll p \tag{3} $$ as $n\to\infty$, so (2) and (3) together rule out an upper tail bound involving the true expectation.
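The construction can be explored numerically with a minimal Monte Carlo sketch in Python. The argument above is asymptotic: with $u = p\sqrt{2\sigma^2\log n}$ the Gaussian-type bound only drops below $p$ for very large $n$. So, as an assumption made purely for illustration, the sketch uses a larger deviation $u = \texttt{frac}\cdot\sqrt{2\sigma^2\log n}$ and the constant $c = 1/2$ from $(\dagger)$. The function names, `frac`, and the other parameter values are hypothetical choices.

```python
import math
import random


def simulate_max_x(n, sigma, p, trials, rng):
    """Sample max_i X_i with X_i = xi * Y_i, xi ~ Bernoulli(p), Y_i i.i.d. N(0, sigma^2)."""
    samples = []
    for _ in range(trials):
        if rng.random() < p:
            # xi = 1: the maximum is max_i Y_i.
            samples.append(max(rng.gauss(0.0, sigma) for _ in range(n)))
        else:
            # xi = 0: every X_i equals 0, so the maximum is 0.
            samples.append(0.0)
    return samples


def tail_vs_bound(n=1000, sigma=1.0, p=1 / 6, frac=0.6, trials=4000, seed=0):
    """Return (empirical E[max X], u, empirical P{max X >= E[max X] + u}, exp(-u^2/(2 sigma^2)))."""
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    samples = simulate_max_x(n, sigma, p, trials, rng)
    mean_max = sum(samples) / trials
    # Illustrative deviation; larger than the u = p * sqrt(2 sigma^2 log n) used in the text.
    u = frac * math.sqrt(2 * sigma**2 * math.log(n))
    tail = sum(s >= mean_max + u for s in samples) / trials
    bound = math.exp(-u**2 / (2 * sigma**2))
    return mean_max, u, tail, bound


if __name__ == "__main__":
    mean_max, u, tail, bound = tail_vs_bound()
    print(f"empirical E[max X] = {mean_max:.3f}, u = {u:.3f}")
    print(f"empirical tail = {tail:.4f} vs exp(-u^2/(2 sigma^2)) = {bound:.4f}")
```

With these illustrative parameters one would expect the empirical tail to stay close to $p$, since it is driven almost entirely by the event $\xi = 1$, while the candidate bound is noticeably smaller.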
