Bound on Difference of Expectations

Question

Let $\Phi$ denote the standard normal distribution function. Let $\Phi(g)$ denote the expectation of $g$ with respect to the standard normal distribution.

Let $X_i$ be iid RV with mean 0 and variance 1. Define $S_n=(X_1+\cdots +X_n) / n^{1/2}$.

Fix $\varepsilon \in (0,1)$. For $z\in \mathbb{R}$ let $h_{z,\varepsilon}$ be the function that equals $1$ for $x \le z$, descends linearly from $1$ to $0$ on $[z,z+\varepsilon]$, and is $0$ for $x\ge z+\varepsilon$.

By drawing some graphs I can see that $$1((-\infty,z]) \le h_{z,\varepsilon} \le 1((-\infty ,z+\varepsilon])$$ but how can I use this to show $$\vert \mathbb{P}(S_n \le z)-\Phi(z)\vert \le \vert \mathbb{E}(h_{z,\varepsilon} (S_n))-\Phi(h_{z,\varepsilon})\vert +\frac{\varepsilon}{\sqrt {2\pi}} ?$$

I can't quite piece this together but I see that $\mathbb{E}\, 1((- \infty,z])(S_n)=\mathbb{P}(S_n \le z)$ and $\Phi(1((-\infty, z+\varepsilon]))-\Phi(1((-\infty, z]))=\frac{\int_z ^{z+\varepsilon}e^{-x^2/2}dx}{\sqrt{2\pi}}\le \frac{\varepsilon}{\sqrt {2\pi }}$.
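The two observations above can be sanity-checked numerically. The following is only a sketch: the value `eps = 0.3`, the grid, and the helper names `step` and `ramp` are illustrative assumptions, not part of the source notes.

```python
import math

def std_normal_cdf(x: float) -> float:
    """Standard normal cdf Phi(x), written via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def step(x: float, z: float) -> float:
    """Indicator of (-inf, z] evaluated at x."""
    return 1.0 if x <= z else 0.0

def ramp(x: float, z: float, eps: float) -> float:
    """h_{z,eps}(x): 1 for x <= z, linear from 1 to 0 on [z, z+eps], 0 afterwards."""
    if x <= z:
        return 1.0
    if x >= z + eps:
        return 0.0
    # Clamp to guard against floating-point round-off near the endpoints.
    return min(1.0, max(0.0, 1.0 - (x - z) / eps))

eps = 0.3                                       # hypothetical choice in (0, 1)
grid = [k / 100.0 for k in range(-500, 501)]    # points in [-5, 5]

# Sandwich: 1_{(-inf,z]} <= h_{z,eps} <= 1_{(-inf,z+eps]} at every grid point x.
sandwich_ok = all(
    step(x, z) <= ramp(x, z, eps) <= step(x, z + eps)
    for z in grid[::50]
    for x in grid
)

# The Gaussian mass of [z, z+eps] should never exceed eps / sqrt(2*pi).
bound = eps / math.sqrt(2.0 * math.pi)
largest_increment = max(std_normal_cdf(z + eps) - std_normal_cdf(z) for z in grid)

print(sandwich_ok, largest_increment <= bound)
```

A grid check is of course not a proof; it only confirms that the picture drawn by hand matches the formulas.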

The source has either moved elsewhere or is no longer available on the internet. Please see the supplementary post for the source as of the end of 2016.

That's my assumption. I don't know what else that could mean, but I am not completely sure. — cap, Oct 06 '16 at 15:37
What specifically do you mean that "the expectation of $g$ with respect to the standard normal distribution"? May be a definition is helpful. — Gordon, Oct 06 '16 at 15:39
The standard normal expectation of $g$ (the integral over the reals with integrand $g(x)*p(x) $ where $p$ is the standard normal density). — cap, Oct 06 '16 at 15:43
Is there any reference for this question? It appears that you may not be able to reach the inequality you want as the distribution of $X_i$ is unknown. — Gordon, Oct 06 '16 at 18:54
Section 6, page 10 of http://isites.harvard.edu/fs/docs/icb.topic1129937.files/ex2121203.pdf — cap, Oct 06 '16 at 23:58

Lee David Chung Lin · Accepted Answer · 2022-06-20T20:13:04.170

This is only a partial answer, but I think it might help to point out a couple of small misunderstandings due to the notations and wordings of the ~~Problem Set 3~~ material. The most relevant part (section 6 on p.10) is shown right below as a screenshot.

The 2012-2013 course link that still worked at the end of 2016 has since become unavailable. Please see the separate community post for the complete set of screenshots of the source pdf.

Target inequality on p.10 of "Problem Set 3" has a typo:$~~$ mixed up $\Phi$

The symbol $\Phi$ plays two roles here: $\Phi(z) = F_Z(z)$ on the L.H.S. is in fact the usual cdf of the standard normal r.v. $Z$, whereas on the R.H.S. the $\Phi$ should NOT have parentheses and should read just $\Phi h_{z, \epsilon}~$. There it is an integral operator denoting the expectation with respect to $Z$. See the 1998 paper by Galen R. Shorack, the very first paragraph of section 2.

The goal here is to talk about how close $W$ is to standard normal $Z$, and we do that by examining $|\mathbb{P}( W < z) - \mathbb{P}( Z < z) | = |F_W(z) - F_Z(z)|$. That is, on the L.H.S. we look at the difference of the expectation of $1_{(-\infty,\, z)}$ on the two random variables $W$ and $Z$ (perhaps with densities $f_W$ and $f_Z$), and on the R.H.S. we look at the difference of the expectation of $h_{z, \epsilon}$ on these two r.v. $W$ and $Z$:

$$\sup_z \left\{ ~\mathbb{E}_W[\, 1_{(-\infty,\, z)} \,] - \mathbb{E}_Z[\, 1_{(-\infty,\, z)} \,]~ \right\} \overset{?}{\leq} \sup_z \left\{ ~\mathbb{E}_W[\, h_{z,\epsilon} \,] - \mathbb{E}_Z[\, h_{z,\epsilon} \,]~ \right\} + \frac{\epsilon}{\sqrt{2\pi}}$$

Math error in said target inequality on p.10:$~~$ missing $\sup$

Compared with the source, the notes are missing the supremum over the dummy variable $z$ ($\sup_z$) outside of the absolute value on both sides. This is crucial. Without the $\sup_z$ the inequality in fact doesn't hold pointwise $\forall z \in \mathbb{R}$.

Consider an easy counterexample for the inequality without $\sup_z~$:

at $z \gg 1$, very far out on the positive side, $\phi(z)$ is basically zero throughout $z$ to $z+\epsilon$, so $\Phi(z) \lessapprox \Phi(z+\epsilon) \approx 1$ is basically not increasing. We can have $W$ distributed heavily around this region $[z,\,z+\epsilon]$ such that $F_W(z)$ is small, e.g. $F_W(z) \approx 0.1~$, and $F_W(z+\epsilon)$ is large like 0.8. The actual values can be arbitrarily tuned.

That is, on the L.H.S. $|F_W(z) - \Phi(z)| \approx |0.1 - 1| = 0.9~$, and on the R.H.S. $|F_W(z+\epsilon) - \Phi(z+\epsilon)| \approx |0.8 - 1| = 0.2~$ where the difference is not bounded by $\epsilon/\sqrt{2\pi}$.
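A quick numerical sanity check of the pointwise failure, as a sketch only. It does not use the $W$ described above; it assumes a hypothetical, even simpler $W$, namely a point mass at $z+\delta$ with small $\delta$, so that $F_W(z)=0$ while $\mathbb{E}_W[h_{z,\epsilon}] = 1-\delta/\epsilon$ is close to 1. The values of `z`, `eps`, `delta` and all helper names are illustrative assumptions.

```python
import math

def std_normal_cdf(x: float) -> float:
    """Standard normal cdf Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def std_normal_pdf(x: float) -> float:
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def ramp(x: float, z: float, eps: float) -> float:
    """h_{z,eps}(x): 1 for x <= z, linear from 1 to 0 on [z, z+eps], 0 afterwards."""
    if x <= z:
        return 1.0
    if x >= z + eps:
        return 0.0
    return min(1.0, max(0.0, 1.0 - (x - z) / eps))

def normal_mean_of_ramp(z: float, eps: float, steps: int = 2000) -> float:
    """Phi h_{z,eps} = Phi(z) + integral over [z, z+eps] of h * phi (midpoint rule)."""
    width = eps / steps
    area = 0.0
    for k in range(steps):
        x = z + (k + 0.5) * width
        area += ramp(x, z, eps) * std_normal_pdf(x)
    return std_normal_cdf(z) + area * width

# Hypothetical parameters: z far out in the right tail, W a point mass just past z.
z, eps, delta = 4.0, 0.5, 0.01
atom = z + delta
cdf_w_at_z = 0.0                    # the atom lies strictly to the right of z
mean_h_w = ramp(atom, z, eps)       # E_W[h_{z,eps}] = 1 - delta/eps

lhs = abs(cdf_w_at_z - std_normal_cdf(z))
rhs = abs(mean_h_w - normal_mean_of_ramp(z, eps)) + eps / math.sqrt(2.0 * math.pi)
print(f"lhs = {lhs:.4f}, rhs = {rhs:.4f}, pointwise bound holds: {lhs <= rhs}")
```

With these numbers the left side is close to 1 while the right side is close to 0.22, so the inequality at this single fixed $z$ fails, in line with the argument above.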

At this point, $W$ is unspecified

It is true that down the road (after accepting this inequality to be true) we will be dealing with $W$ as a scaled sum of $X_i~$, but here for this inequality there's no notion of $S_n$ involved. The inequality should hold for general unconstrained random variable (with perhaps some regularity conditions).

This is clear in the source material of Shorack. Taking $S_n$ into account will only complicate things unnecessarily.

Not an error but a reminder: $~h_{z, \epsilon}$ can be anything

In the source Shorack explicitly says that $~h_{z, \epsilon}$ only needs to be smooth (absolutely continuous) in going from 1 to 0.

This fact also implies that the inequality is not going to be a direct application of $1_{(-\infty,\, z)} \leq h_{z,\epsilon} \leq 1_{(-\infty,\, z+\epsilon)}$ to bound from both sides, since $$ \begin{align} &\hphantom{ {} = {} } \sup_z \left\{ ~\mathbb{E}_W[\, 1_{(-\infty,\, z)} \,] - \mathbb{E}_Z[\, 1_{(-\infty,\, z)} \,]~ \right\} \\ &= \sup_z \left\{ ~\mathbb{E}_W[\, 1_{(-\infty,\, z+\epsilon)} \,] - \mathbb{E}_Z[\, 1_{(-\infty,\, z+\epsilon)} \,]~ \right\} \\ &= \sup_{z'} \left\{ ~\mathbb{E}_W[\, 1_{(-\infty,\, z')} \,] - \mathbb{E}_Z[\, 1_{(-\infty,\, z')} \,]~ \right\} \qquad \text{, $z' \equiv z+ \epsilon$} \end{align}$$ That is, the L.H.S. is the $\sup$ scanning over step functions at all positions, and the R.H.S. is ... well, the same thing.

Consider a smooth yet sharply curving $h_{z,\epsilon}$ that numerically is very close to a step function. When put into the expression of expectation-difference, this $h_{z,\epsilon}$ is effectively a step function that yields the same $\sup$ as its lower bound $1_{(-\infty,\, z)}$ and upper bound $1_{(-\infty,\, z+\epsilon)}$.

In this case, of course, the desired inequality holds easily. The $\epsilon / \sqrt{2\pi}$ additional term is most needed when $h_{z,\epsilon}$ is as "far away" from step function as possible.

So what might that $h_{z,\epsilon}$ look like, going from 1 to 0 through a distance $\epsilon$ but is very "different" from a step function (in terms of the expectation-difference)? As of now this is beyond me, and most likely this has not much to do with the actual derivation of the inequality.

The point I'm trying to make is this: the inequality states that when a step function ends not sharply but smoothly (like $h_{z, \epsilon}$), its expectation-difference can be smaller than that of a sharp step, but it is not going to drop too much, with the reduction being bounded by $\epsilon /\sqrt{2\pi}$.
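To make this concrete, here is a small numerical sketch for one hypothetical choice of $W$: the standardized sum of $n=4$ fair $\pm 1$ coin flips (mean 0, variance 1 summands). The choice of $W$, the value `eps = 0.1`, the grid, and the helper names are all assumptions made for illustration, and the suprema are only approximated on a grid.

```python
import math

def std_normal_cdf(x: float) -> float:
    """Standard normal cdf Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def std_normal_pdf(x: float) -> float:
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def ramp(x: float, z: float, eps: float) -> float:
    """h_{z,eps}(x): 1 for x <= z, linear from 1 to 0 on [z, z+eps], 0 afterwards."""
    if x <= z:
        return 1.0
    if x >= z + eps:
        return 0.0
    return min(1.0, max(0.0, 1.0 - (x - z) / eps))

def normal_mean_of_ramp(z: float, eps: float, steps: int = 200) -> float:
    """E_Z[h_{z,eps}] = Phi(z) + integral over [z, z+eps] of h * phi (midpoint rule)."""
    width = eps / steps
    area = 0.0
    for k in range(steps):
        x = z + (k + 0.5) * width
        area += ramp(x, z, eps) * std_normal_pdf(x)
    return std_normal_cdf(z) + area * width

# Hypothetical W: (X_1 + ... + X_n) / sqrt(n) with X_i = +1 or -1, each w.p. 1/2.
n = 4
atoms = [((2 * k - n) / math.sqrt(n), math.comb(n, k) / 2 ** n) for k in range(n + 1)]

def cdf_w(z: float) -> float:
    """E_W[1_{(-inf, z]}] = P(W <= z)."""
    return sum(p for a, p in atoms if a <= z)

def mean_h_w(z: float, eps: float) -> float:
    """E_W[h_{z,eps}]."""
    return sum(p * ramp(a, z, eps) for a, p in atoms)

eps = 0.1
slack = eps / math.sqrt(2.0 * math.pi)
grid = [k / 100.0 for k in range(-400, 401)]

step_gaps = [cdf_w(z) - std_normal_cdf(z) for z in grid]
ramp_gaps = [mean_h_w(z, eps) - normal_mean_of_ramp(z, eps) for z in grid]

sup_step = max(abs(g) for g in step_gaps)   # grid proxy for sup_z |F_W(z) - Phi(z)|
sup_ramp = max(abs(g) for g in ramp_gaps)   # grid proxy for sup_z |E_W h - E_Z h|

print(f"sup over steps = {sup_step:.4f}")
print(f"sup over ramps = {sup_ramp:.4f}, plus slack = {sup_ramp + slack:.4f}")
```

In this example the ramp version of the supremum comes out smaller than the step version, but by less than $\epsilon/\sqrt{2\pi}$, which is the behaviour described above.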

Summary of This Answer Post

In the source Shorack said the target inequality (see the end of point 1) follows trivially from the nature/construct of $h_{z,\epsilon}$. I personally don't think this is very trivial. The $\sup_z$ is important but it messes things up at the same time: the triangle inequality cannot be carried out directly.

I think your (OP: cap) observation about $\epsilon/\sqrt{2\pi}$ being the upper bound of $\int \phi$ over a length of $\epsilon$ is indeed one of the key elements.

As of now I'm trying to argue that $|f_W - \phi| \leq \phi$ for the relevant region of $z$ when taking $\sup_z$. There could also be an argument for pointwise inequality $\forall z$ up to a stage, then $\sup_z$ considerations take over and move the inequality forward. These thoughts might very well lead nowhere, though.

I just realized that the target inequality (at the end of my point 1) is indeed trivial: $\mathbb{E}_W[\, 1_{(-\infty,\, z)} \,] \le \mathbb{E}_W[\, h_{z,\epsilon} \,]$ to lift the 1st term, and $\mathbb{E}_Z[\, h_{z,\epsilon} \,] \le \mathbb{E}_Z[\, 1_{(-\infty,\, z+\epsilon)} \,] \implies \mathbb{E}_Z[\, h_{z,\epsilon} \,] \le \mathbb{E}_Z[\, 1_{(-\infty,\, z)} \,] +\epsilon/\sqrt{2\pi}$ ($\because$ Normal density $\le 1/\sqrt{2\pi}$) so that $\mathbb{E}_Z[\, 1_{(-\infty,\, z)} \,] \ge \mathbb{E}_Z[\, h_{z,\epsilon} \,] - \epsilon/\sqrt{2\pi}$ to suppress the 2nd term that is being subtracted. — Lee David Chung Lin, Oct 08 '17 at 21:44

Lee David Chung Lin · Answer 2 · 2022-06-20T20:06:00.327

This separate (community) post is to supply the screenshots of the source for the answer post, as well as to elaborate on some potential issues regarding the math content.

The source in question is isites.harvard.edu/fs/docs/icb.topic1129937.files/ex2121203.pdf. There were 16 pages in total as of the end of 2016. The most relevant part is section 6 on p.10, but all 16 pages are displayed below after the text. The page number is at the bottom of each page. Links to individual pages: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, and 16.

The 1998 paper by Galen R. Shorack is a tech report within Univ. Washington titled On Stein's Approach to the Central Limit Theorem. The same content with slightly nicer typesetting can be seen on page 255 in an older version of his textbook published in the year 2000.

For unknown reasons, in later digital versions of the textbook (still presented as the "2000 edition"), which presumably were maintained or at least approved by Shorack himself for course use as of 2012, Stein's method is no longer covered. One cannot help but wonder if the removal of this material has anything to do with potential issues therein.

Having said that, the 1998 tech report is still available to the public (no retraction, nor errata or disclaimer) so whatever mathematical issues it might have could be only minor. Speaking of which, as long as the Univ. Washington stands, currently it isn't necessary to also convert the 1998 tech report into screenshots.

[Screenshot: Berry Esseen via Stein, page 01]