11

As the book Probabilistic Techniques in Analysis by Richard F. Bass shows, techniques drawn from probability are nowadays used to tackle problems in analysis.

That book surveys these methods "at the level of a beginning Ph.D. student", but I would like to see some more basic applications of probability to undergraduate real analysis, so to speak; in other words, applications of probabilistic reasoning to calculus problems.

Dal
  • 8,214
  • Here's another use of probability in analysis:

    http://math.stackexchange.com/questions/215352/why-is-gamma-left-frac12-right-sqrt-pi/215373#215373

    – Michael Hardy Dec 21 '14 at 03:11
  • Here is a related question: http://math.stackexchange.com/questions/1075819/a-treatise-on-probabilistic-arguments-and-laplace-fourier-transforms-to-solve-li – user153330 Dec 21 '14 at 17:51
  • @user153330 Thank you for the link! I hope we both get some good answers! :) – Dal Dec 21 '14 at 17:55
  • @Dal Yay, I really hope so! I will add a bounty tomorrow to see if I can get an answer! ;) – user153330 Dec 21 '14 at 17:59

5 Answers

11

Here is an example.

The question is how to show that $$\binom{n}{k}^{-1}=(n+1)\int_0^1 x^k (1-x)^{n-k} \, dx. $$

To make this post self-contained, I'll reproduce the argument below:

Let's do it somewhat like the way the Rev. Thomas Bayes did it in the 18th century (but I'll phrase it in modern probabilistic terminology).

Suppose $n+1$ independent random variables $X_0,X_1,\ldots,X_n$ are uniformly distributed on the interval $[0,1]$.

Suppose for $i=1,\ldots,n$ (starting with $1$, not with $0$) we have: $$Y_i = \begin{cases} 1 & \text{if }X_i<X_0 \\ 0 & \text{if }X_i>X_0\end{cases}$$

Then $Y_1,\ldots,Y_n$ are conditionally independent given $X_0$, and $\Pr(Y_i=1\mid X_0)= X_0$.

So $\Pr(Y_1+\cdots+Y_n=k\mid X_0) = \dbinom{n}{k} X_0^k (1-X_0)^{n-k},$ and hence $$\Pr(Y_1+\cdots+Y_n=k) = \mathbb{E}\left(\dbinom{n}{k} X_0^k (1-X_0)^{n-k}\right).$$

This is equal to $$ \int_0^1 \binom nk x^k(1-x)^{n-k}\;dx. $$

But the event $Y_1+\cdots+Y_n=k$ (that exactly $k$ of $X_1,\ldots,X_n$ are smaller than $X_0$) is the same as the event that $X_0$ is in the $(k+1)$th position when $X_0,X_1,\ldots,X_n$ are sorted into increasing order.

Since all $n+1$ indices are equally likely to be in that position, this probability is $1/(n+1)$.

Thus $$\int_0^1\binom nk x^k(1-x)^{n-k}\;dx = \frac{1}{n+1}.$$
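If you want to see this argument numerically, here is a minimal Python sketch (assuming NumPy is available; the values of $n$, $k$ and the number of trials are arbitrary choices of mine) that estimates $\Pr(Y_1+\cdots+Y_n=k)$ by simulation and compares it with $1/(n+1)$:

```python
import numpy as np

rng = np.random.default_rng(0)

n, k = 7, 3          # arbitrary choice of parameters
trials = 200_000     # number of Monte Carlo repetitions

# Draw X_0, X_1, ..., X_n uniformly on [0, 1] for every trial.
x = rng.random((trials, n + 1))

# Y_i = 1 if X_i < X_0 for i = 1, ..., n; count the trials where the sum equals k.
y_sum = (x[:, 1:] < x[:, [0]]).sum(axis=1)
estimate = np.mean(y_sum == k)

print(estimate)          # Monte Carlo estimate of P(Y_1 + ... + Y_n = k)
print(1 / (n + 1))       # the exact value 1/(n+1) from the argument above
```

The two printed numbers should agree up to Monte Carlo error of order $10^{-3}$ with these sample sizes.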

7

The Weierstrass approximation theorem says that continuous real-valued functions on the unit interval can be uniformly approximated arbitrarily closely by polynomials. That is, for every continuous function $f:[0,1]\to\mathbb{R}$ and each $\varepsilon>0$, there is a polynomial $p$ such that $|f(x)-p(x)|<\varepsilon$ for all $x\in[0,1]$.

An elementary probabilistic proof goes as follows:

Let $U_1,U_2,\ldots$ be independent uniformly distributed random variables on $[0,1]$. For $n\in\mathbb{N}$, define a function $p_n:[0,1]\to\mathbb{R}$ by $$p_n(x) \triangleq \mathbb{E}\Big[\,f\Big(\frac{1_{U_1<x}+1_{U_2<x}+\cdots+1_{U_n<x}}{n}\Big) \Big]\;, $$ where $1_{U_i<x}$ is the indicator random variable of the event $\{U_i<x\}$. For brevity, let us write $\overline{X}_n(x)\triangleq \frac{1}{n}(1_{U_1<x}+1_{U_2<x}+\cdots+1_{U_n<x})$, so that $p_n(x)=\mathbb{E}[f(\overline{X}_n(x))]$. Since $n\overline{X}_n(x)$ has the Binomial$(n,x)$ distribution, $$p_n(x)=\sum_{k=0}^{n} f\Big(\frac{k}{n}\Big)\binom{n}{k}x^k(1-x)^{n-k}$$ is a polynomial in $x$ of degree at most $n$ (the $n$th Bernstein polynomial of $f$). Intuitively, by the law of large numbers, $\overline{X}_n(x)$ is going to be close to $x$ when $n$ is large. Hence, $f(\overline{X}_n(x))$ will also be close to $f(x)$.

To make this precise, let $\varepsilon>0$. A continuous function on a compact space is uniformly continuous and bounded. Therefore, there is a $\delta>0$ such that $|f(x)-f(y)|<\varepsilon/2$ for each $x,y\in[0,1]$ satisfying $|x-y|<\delta$. Moreover, there is a constant $c<\infty$ such that $|f(x)-f(y)|<c$ for each $x,y\in[0,1]$.

Now, for each $x\in[0,1]$, we have \begin{align} |f(x)-p_n(x)| &= \Big|\mathbb{E}\big[f(x)-f(\overline{X}_n(x))\big]\Big| \\ &\leq \mathbb{E}\big|f(x)-f(\overline{X}_n(x))\big| \\ &< \underbrace{\mathbb{P}\big(|\overline{X}_n(x)-x|<\delta\big)}_{\leq 1} \frac{\varepsilon}{2} + \underbrace{\mathbb{P}\big(|\overline{X}_n(x)-x|\geq\delta\big)}_{ \text{via Chebyshev's} }\,c \;. \end{align} By Chebyshev's inequality, we have $$ \mathbb{P}\big(|\overline{X}_n(x)-x|\geq\delta\big) \leq \frac{\mathrm{Var}[\overline{X}_n(x)]}{\delta^2} = \frac{x(1-x)}{n\delta^2} \leq \frac{1}{4n\delta^2} \;, $$ which is at most $\frac{\varepsilon}{2c}$ for $n\geq\frac{c}{2\varepsilon\delta^2}$. It follows that $|f(x)-p_n(x)|<\varepsilon$ for all $x\in[0,1]$, provided $n\geq \frac{c}{2\varepsilon\delta^2}$, and this concludes the proof.
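To see the approximating polynomials concretely, here is a short Python sketch (assuming NumPy and SciPy are available; the test function $f$ and the values of $n$ are arbitrary choices of mine) that evaluates $p_n$ via the binomial formula above and reports the uniform error on a grid:

```python
import numpy as np
from scipy.stats import binom

def bernstein(f, n, x):
    """Evaluate p_n(x) = E[f(Bin(n, x)/n)] = sum_k f(k/n) C(n,k) x^k (1-x)^(n-k)."""
    k = np.arange(n + 1)
    weights = binom.pmf(k, n, x[:, None])      # shape (len(x), n+1)
    return weights @ f(k / n)

f = lambda t: np.abs(t - 0.5)                  # an arbitrary continuous test function
x = np.linspace(0, 1, 1001)

for n in (10, 100, 1000):
    err = np.max(np.abs(f(x) - bernstein(f, n, x)))
    print(n, err)                              # the uniform error shrinks as n grows
```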

Blackbird
  • 1,985
3

Thomas Bayes showed that $${n \choose k}\int_0^1 x^k (1-x)^{n-k}\mathrm dx = \frac{1}{n+1}$$ for all integers $k,n$ with $0 \leq k \leq n$, by pure thought, without using calculus. His argument, known as the Bayes billiards argument, uses two equivalent probabilistic stories about picking random points on a number line from $0$ to $1$.

That's just one example though, not a general technique. Generalizing, a powerful technique that's not normally emphasized in math courses is probabilistic interpretation. For example, there are many integrals that, after some pattern matching, possibly in tandem with other techniques such as substitution, integration by parts, and differentiation under the integral sign, can be interpreted as the integral of a known probability density function, as a moment of a known distribution, or as a convolution integral. The Normal, Beta, and Gamma distributions are especially important in this context.
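To give a simple illustration of this pattern matching: the integral $\int_0^\infty x^{10} e^{-x}\,\mathrm dx$ is, up to normalization, the integral of the Gamma$(11,1)$ density, so it equals $\Gamma(11)=10!$ with no integration by parts; likewise, $\int_{-\infty}^{\infty} x^2 e^{-x^2/2}\,\mathrm dx = \sqrt{2\pi}\,\mathbb{E}[Z^2] = \sqrt{2\pi}$ for $Z\sim N(0,1)$, since $\mathbb{E}[Z^2]=\mathrm{Var}(Z)=1$.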

You can also find a good explanation of the probabilistic argument for this integral in Qiaochu Yuan's post on intuition for the derivation of the beta function.

Venus
  • 10,966
1

This problem, which has a bounty on it, contains a reference to a solution using probability theory. I think the probabilistic solution is quite elegant. It works, though, by proving an even stronger result than the one asked for, so I suspect another proof must exist, but that proof might be more complex.

Raskolnikov
  • 16,108
0

Since there's already an answer about probabilistic methods being used to compute an integral, here's another example I came across recently, a question from the MIT Integration Bee 2023 final: evaluate $$\int_0^1 \left(\sum_{n=0}^\infty \frac{\left\lfloor 2^n x\right\rfloor}{3^n}\right)^{2}\mathrm dx.$$

The clever idea is to write the sum in the integrand in terms of functions $X_k:[0,1] \to \{0,1\}$ defined by $X_k(x):=$ "the $k$th digit in the binary expansion of $x\in [0,1]$", which have the advantage of being i.i.d. Bernoulli($\frac 12$) variables on the probability space $[0,1]$ with Lebesgue measure. This turns the difficult-looking integral into an expectation.

Because we can calculate the expectation of polynomial expressions of these $X_k$ very easily (using independence to break apart multiplication, linearity to break apart sums, and of course $\mathbb E[X_k] = \frac 12$ for all $k$), the integral is then quite easy to evaluate.
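As a sanity check of the "integral as an expectation" viewpoint, here is a small Python sketch (assuming NumPy; the truncation level of the series and the sample sizes are arbitrary choices of mine) that compares a Monte Carlo estimate of the expectation with a direct Riemann-sum evaluation of the integral:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 40  # truncation of the series; the terms decay roughly like (2/3)^n

def integrand(x):
    """Evaluate (sum_{n=0}^{N} floor(2^n x) / 3^n)^2 for an array of x values."""
    n = np.arange(N + 1)
    s = np.sum(np.floor(np.outer(x, 2.0 ** n)) / 3.0 ** n, axis=1)
    return s ** 2

# Monte Carlo: the integral equals E[integrand(U)] for U uniform on [0, 1],
# since Lebesgue measure on [0, 1] is a probability measure.
u = rng.random(100_000)
print(integrand(u).mean())

# Midpoint Riemann sum on [0, 1] for comparison.
grid = (np.arange(100_000) + 0.5) / 100_000
print(integrand(grid).mean())
```

Both printed values should agree to a couple of decimal places, and either one can be checked against whatever closed form you obtain from the expectation computation described above.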

D.R.
  • 8,691