
I am referring to the last paragraph in the proof of Proposition 4.7 on page 12 in Concentration inequalities for order statistics.

Consider $(\varepsilon_{i})_{i=1,\dots,n}$ i.i.d. random signs with $\mathbb P(\varepsilon_{i} = 1)=\mathbb P(\varepsilon_{i}=-1)=\frac{1}{2}$, and define $N=\sum\limits_{i=1}^{n}\frac{1+\varepsilon_{i}}{2}$. Define the random harmonic number $H_{N}=\sum\limits_{j=1}^{N}\frac{1}{j}$.

The paper states that applying the Efron-Stein inequality to $Z=H_{N}$ with $Z_{i}=\sum\limits_{j=1}^{N-\frac{1+\varepsilon_{i}}{2}}\frac{1}{j}$ gives:

$$\text{Var}(Z)\leq \mathbb E\left[0\land \frac{1}{N}\right]\tag{*}$$ where $0\land1/N$ is an abuse of notation to state that it is equal to $0$ when $N=0$ and otherwise $1/N$ for $N>0$.

I think this is a mistake: it should rather be $\leq\mathbb E [T]$ with $T= 0\chi_{\varepsilon_{i}=-1}+\frac{1}{N}\chi_{\varepsilon_{i}=1}$.

It is further estimated that by Hoeffding's inequality $$\mathbb E\left[0\land \frac{1}{N}\right]\leq \exp(-n/8)+4/n\leq 8/n\tag{**}$$

I do not understand the inequality $(*)$, nor do I understand the two inequalities in $(**)$. Can anyone help explain them?

SABOY

1 Answer


Since $N=\sum\limits_{i=1}^n1_{\varepsilon_i=1}$, we have $Z=Z_i+1_{\varepsilon_i=1}/N$ for all $i$, where the last term is read as $0$ when $N=0$. On the event $\{N=0\}$ every $\varepsilon_i=-1$, so $Z=Z_i=0$ and this event contributes nothing to the sums below; only $\{N>0\}$ matters. From the Efron-Stein inequality, $$\Bbb V[Z]\le\frac12\sum_{i=1}^n\Bbb E[(Z-Z_i)^2]=\frac12\sum_{i=1}^n\Bbb E\left[\frac{1_{\varepsilon_i=1}}{N^2}\right]=\frac12\Bbb E\left[\frac{1_{N>0}}N\right]\le\Bbb E\left[\frac{1_{N>0}}N\right]$$ by linearity of expectation and $\sum_{i=1}^n1_{\varepsilon_i=1}=N$. The right-hand side is exactly what the paper writes as $\Bbb E[0\land1/N]$.
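As a sanity check on $(*)$, here is a minimal Python sketch that computes both sides exactly for small $n$, assuming only that $N\sim\text{Binomial}(n,1/2)$. The function name `variance_and_bound` is made up for this illustration, and the check is numerical evidence rather than a proof.

```python
from fractions import Fraction
from math import comb


def variance_and_bound(n):
    """Return (Var(H_N), E[1_{N>0}/N]) exactly for N ~ Binomial(n, 1/2)."""
    pmf = [Fraction(comb(n, k), 2**n) for k in range(n + 1)]
    # harmonic[k] = H_k, with the convention H_0 = 0
    harmonic = [Fraction(0)]
    for k in range(1, n + 1):
        harmonic.append(harmonic[-1] + Fraction(1, k))
    mean = sum(p * h for p, h in zip(pmf, harmonic))
    variance = sum(p * (h - mean) ** 2 for p, h in zip(pmf, harmonic))
    # The event N = 0 contributes 0, so the sum starts at k = 1.
    bound = sum(pmf[k] * Fraction(1, k) for k in range(1, n + 1))
    return variance, bound


# Columns: n, Var(H_N), half the bound, the bound in (*)
for n in (1, 2, 5, 10, 20):
    variance, bound = variance_and_bound(n)
    print(n, float(variance), float(bound) / 2, float(bound))
```

Exact rational arithmetic is used so that comparisons are not affected by floating-point rounding.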

For the second part, I currently don't see how Hoeffding's inequality is applicable, since it bounds the probability that a sum such as $N$ deviates from its expected value rather than an expectation like $\Bbb E[1/N]$. I think the use of Hoeffding's lemma is more likely, but it is unclear to me how that would work. Note that writing $\Bbb E[1/N]=\int_{\Bbb R^+}\Bbb E[e^{-Nt}]\,dt$ to use the lemma is futile, since the event $\{N=0\}$ makes the upper bound infinite.

So here is another approach. Since the event $\{N=0\}$ contributes nothing, it suffices to bound $\Bbb E[1_{N>0}/N]$. As $N$ is binomial with parameters $n$ and $1/2$, the discrete definition of expectation gives $$\Bbb E\left[\frac{1_{N>0}}N\right]=\sum_{k=1}^n\frac1k\operatorname P(N=k)=\sum_{k=1}^n\frac1k\frac{\binom nk}{2^n}<\frac2n+\frac4{n^2}$$ from this probabilistic argument, and the right-hand side is less than $8/n$ because $4/n^2\le4/n<6/n$ for $n\ge1$.
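The binomial sum above can be evaluated exactly for moderate $n$. The following Python sketch compares it with $2/n+4/n^2$ and $8/n$; the helper name `expected_inverse` is hypothetical, and the check only covers the values of $n$ that are actually run.

```python
from fractions import Fraction
from math import comb


def expected_inverse(n):
    """Exact value of sum_{k=1}^n (1/k) * C(n, k) / 2^n."""
    return sum(Fraction(comb(n, k), k * 2**n) for k in range(1, n + 1))


# Columns: n, the binomial sum, 2/n + 4/n^2, 8/n
for n in (1, 2, 5, 10, 50):
    value = expected_inverse(n)
    middle = Fraction(2, n) + Fraction(4, n * n)
    print(n, float(value), float(middle), float(Fraction(8, n)))
```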

As a final note, the inequality $e^{-n/8}+4/n\le8/n$ is equivalent to $e^{-n/8}\le4/n$, that is, to $8\log n-n\le8\log4$. This is true since $8\log n-n$ attains its maximum at $n=8$, where it equals $8\log8-8\approx8.64<11.09\approx8\log4$.
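The second inequality in $(**)$ can also be checked numerically. This is a small floating-point sketch over $1\le n\le1000$ (the range and the helper names `second_step_holds` and `g` are arbitrary choices for illustration); it also locates the integer maximiser of $8\log n-n$.

```python
import math


def second_step_holds(n):
    """Check exp(-n/8) + 4/n <= 8/n for a given positive integer n."""
    return math.exp(-n / 8) + 4 / n <= 8 / n


def g(n):
    """The function 8*log(n) - n whose maximum decides the inequality."""
    return 8 * math.log(n) - n


ns = range(1, 1001)
print(all(second_step_holds(n) for n in ns))

# Integer maximiser of g, its value there, and the threshold 8*log(4)
best = max(ns, key=g)
print(best, g(best), 8 * math.log(4))
```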