
This problem is from the exercise section of Grimmett and Stirzaker. I have also listed my proof.

Question.

A fair coin is tossed repeatedly.

(a) Show that a head turns up sooner or later with probability one.

(b) Show similarly that any given finite sequence of heads and tails occurs eventually with probability one.

(c) Explain the connection with Murphy's law.

Proof.

Let $A_{1},A_{2},A_{3},\ldots,A_{n}$ be the events that a head turns up on the $(1,2,3,\ldots,n)$-th toss after a sequence of $n-1$ tails. This can be visualized in the form of a binary tree. The sample space $\Omega=\{H,TH,TTH,TTTH,\ldots\}$ is a countably infinite set.

The probability measure $P$ is countably additive over disjoint sets.

$P(\bigcup\limits_{i=1}^{\infty}{A_{i}})=\sum_{i=1}^{\infty}P(A_{i})$

In this example, all the members of $\mathscr{F}$ are disjoint.

Hence, $P(\bigcup\limits_{i=1}^{\infty}{A_{i}})=P(A_{1})+P(A_{2})+\cdots=(1/2)+(1/2^2)+(1/2^3)+\cdots=1$.

Thus, a head will turn up with probability $1$.

I would like to know if this proof is correct and sound. Are there alternative approaches to it? Also, I was wondering how I can extend it to a sequence of heads and tails of length $m$. What is Murphy's law? How do I define the $\sigma$-field $\mathscr{F}$?

Quasar
  • @stochasticboy321's answer explains the correct and short way to prove the result. If one tries to stick to the approach in the question, one should modify it as follows. First, $\{H,TH,TTH,TTTH,\ldots\}$ should be $\Omega$ rather than the sigma-algebra $\mathcal F$ on it... except that defining $\Omega$ as $\{H,TH,TTH,TTTH,\ldots\}$ would be taking for granted the fact that almost surely a head turns up. An option would be to add an extra outcome to $\Omega$, say the outcome $T$ meaning that no $H$ ever occurs. Then $A_i=\{T^{i-1}H\}$ and $P(A_i)=1/2^i$ for every $i\geqslant1$ ... – Did Nov 22 '15 at 20:38
  • ... hence $\sum\limits_{i=1}^\infty P(A_i)=1$, which implies that $P(\{T\})=0$ and that $\bigcup\limits_{i=1}^\infty A_i$ has full probability, as desired. Or, one sticks to $\Omega=\{H,TH,TTH,TTTH,\ldots\}$ with $P(\{H\})=1/2$, $P(\{TH\})=1/4$, $P(\{TTH\})=1/8$, $P(\{TTTH\})=1/16$, etc., and one notices that this fully defines a probability $P$ since $1/2+1/4+1/8+1/16+\cdots=1$. In the end, @stochasticboy is probably too nice to write that "Your proof is correct" but the alternative they propose is definitely correct and worth fully understanding. – Did Nov 22 '15 at 20:41
  • @Did Thanks for your very elaborate replies. This certainly helps a lot. I still have one question on my mind - how do I define the sigma algebra $\mathscr{F}$ here, or for that matter in any other problem? It is supposed to be the set of events I am interested in and must satisfy the properties of a $\sigma$-field, right? – Quasar Nov 24 '15 at 03:41
  • @stochasticboy321 - help me with the above. – Quasar Nov 24 '15 at 03:41
  • Try $\mathscr F=2^\Omega$. – Did Nov 24 '15 at 06:17
  • Not sure $\mathscr F = 2^{\Omega}$ can work: http://math.stackexchange.com/questions/1504887/why-is-it-that-mathscrf-ne-2-omega – BCLC Nov 26 '15 at 19:35

2 Answers


Your proof is correct, and extending it is probably possible, although I don't immediately see a way. An alternate way, which is easier to extend to the second case, is the following.

The probability that no heads occur in $N$ tosses is $2^{-N}$. As $N \to \infty$, this goes to $0$. Thus, the probability that at least one head occurs in an infinite experiment is $1$.
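As a quick numerical sanity check, here is a minimal Python sketch (the toss horizons and trial count are arbitrary illustrative choices, not part of the proof):

```python
import random

def head_within(n_tosses: int) -> bool:
    """Return True if at least one head appears in n_tosses fair flips."""
    return any(random.random() < 0.5 for _ in range(n_tosses))

# Estimate P(at least one head in N tosses); theory says 1 - 2**(-N) -> 1.
trials = 100_000
for N in (1, 2, 5, 10):
    estimate = sum(head_within(N) for _ in range(trials)) / trials
    print(f"N={N:2d}  empirical={estimate:.4f}  theory={1 - 2**-N:.4f}")
```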

This is easier to extend, as follows:

Pick any sequence of $m$ digits. The probability that it doesn't occur in $N$ tosses is upper bounded by the probability that it doesn't occur as the block $(x_{(j-1)m+1}, \dots, x_{jm})$ for any $1 \le j \le \lfloor N/m \rfloor$. The latter probability is $\left(\frac{2^m - 1}{2^m}\right)^{\lfloor N/m \rfloor} \to 0$ as $N \to \infty$.
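Again purely as an illustration (the pattern, horizon $N$, and trial count below are arbitrary), here is a short sketch comparing the empirical avoidance probability with the disjoint-block bound:

```python
import random

def avoids(pattern: str, n_tosses: int) -> bool:
    """True if `pattern` never occurs as a contiguous run in a simulated toss sequence."""
    seq = "".join(random.choice("HT") for _ in range(n_tosses))
    return pattern not in seq

pattern, N, trials = "HTH", 30, 100_000  # arbitrary illustrative choices
m = len(pattern)
empirical = sum(avoids(pattern, N) for _ in range(trials)) / trials
bound = ((2**m - 1) / 2**m) ** (N // m)  # the disjoint-block upper bound above
print(f"P(no '{pattern}' in {N} tosses): empirical {empirical:.4f} <= bound {bound:.4f}")
```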

Murphy's law is a tongue-in-cheek folk saying that states that anything that can go wrong, will go wrong. Through the above, this is true for, well, coins, assuming some finite sequence makes things go wrong for you, and provided you're willing to hang around for a while.

  • stochasticboy321 - I didn't understand your $m$-digits logic. Could you explain it in simpler terms? – Quasar Nov 21 '15 at 02:33
  • Suppose the finite-length sequence you were interested in had $m$ digits, and that you were considering an experiment of $N$ coin tosses. The number of positions where the $m$-digit sequence could occur is $N - m + 1$; I'm considering only $\lfloor N/m \rfloor$ of these, which is how you get the upper bound above. The bound can be seen intuitively by granting the $m$-letter sequence a probability $2^{-m}$ and flipping a coin such that the probability of 'heads' is $2^{-m}$: the probability that the original sequence avoids the pattern is at most the probability that these $\lfloor N/m \rfloor$ flips show no heads, which is the expression given above – stochasticboy321 Nov 21 '15 at 04:23
  • How is the proof correct? $P(A_n)$ is wrong and that is not a sigma-algebra – BCLC Nov 22 '15 at 11:34
  • I presume you are the downvote? If you're talking about the second bit of the proof, I don't need to consider the whole sigma algebra. The expression is merely an upper bound on the probability that the sequence does not occur, and this goes to $0$ as the length of the experiment goes to $\infty$. That's good enough to show that any sequence occurs. – stochasticboy321 Nov 22 '15 at 11:37
  • If your point is about the proof presented in the question, OP considers disjoint events and shows that their union a.s. occurs. How is that not good enough? – stochasticboy321 Nov 22 '15 at 11:38
  • stochasticboy321 1 The $\mathscr F$ given by OP is not a $\sigma$-algebra. 2 $P(A_n) = 1/2$ not $(1/2)^n$ 3 The summation-union equality is justified from the fact that the $A_n$'s are pairwise disjoint not merely disjoint. 4 $A_n$ is not helpful. We want the event that the first n-1 flips are tails and then the nth flip is a head. That is not $A_n$. That is $A_1^c \cap A_2^c \cap ... \cap A_{n-1}^c \cap A_n$ 5 I downvoted because of 'Your proof is correct' – BCLC Nov 22 '15 at 19:09
  • @stochasticboy321 Frankly, I didn't see where you were coming from. You could have literally started with $N-m+1$ of said sequences. It got me thinking and made me realize you have been using permutation the whole time, otherwise it would be multiplication instead of $\frac{2^m-1}{2^m}$ with that $1$ being the $m$ sequence. My question to you is, since the permutation of $\lfloor\frac{N}{m}\rfloor$ is easier as opposed to the overlapping permutation of $N-m+1$, you opted for the former, which effectively serves as one of the lower bounds? – Andes Lam Sep 15 '21 at 13:49
  • Indeed - as you said, there are $N-m+1$ possible places for the bad sequence to begin, and I've picked out some $\lfloor N/m \rfloor$ of them in order to say that the probability that the bad sequence doesn't occur is at most the probability that the bad sequence doesn't occur when starting from these particular places. The important thing here is that this choice of these starting sites divides the coin sequence into disjoint chunks, and one can thus easily control the latter probability using the independence of the coins (contd) – stochasticboy321 Sep 15 '21 at 15:08
  • A more thorough calculation would need to handle the dependencies between overlapping length-$m$ subsequences (in a way that will interact with the actual 'bad subsequence'), which is unnecessary work for this question. I don't actually see the role of any permutation here, btw. Could you elaborate? – stochasticboy321 Sep 15 '21 at 15:09

I don't think your proof is right because of the definitions of $A_n$ and $\mathscr F$. Are the flips independent?


Consider a probability space $(\Omega, \mathscr F, \mathbb P)$ where

  1. $\Omega = \{H,T\}^{\mathbb N}$

So we have $\omega = (\omega_1, \omega_2, ...)$ where $\omega_n \in \{H, T\} \ \forall n \in \mathbb N$

  2. $\mathscr{F} = \sigma(\{\omega_n = W\} : W \in \{H, T\},\ n \in \mathbb N)$ (because I guess $\mathscr{F} = 2^{\Omega}$ doesn't work)

  3. $P(\omega_n = H) = P(\omega_n = T) = 1/2$

where

$(\omega_n = H) = \{\omega \in \Omega : \omega_n = H\}$

$(\omega_n = T) = \{\omega \in \Omega : \omega_n = T\}$


In your case, $(\omega_n = H) = A_n$.

So $P(A_n) = 1/2$, not $\frac{1}{2^n}$


Let $H_1, H_2, \ldots$ be the events where $H_n$ = {the $n$th flip is heads and the 1st, ..., $(n-1)$th flips are tails}.

Thus, we have $$H_n = A_1^C \cap A_2^C \cap ... \cap A_{n-1}^C \cap A_n$$

$$= \Bigl(\bigcap_{k=1}^{n-1} A_k^C\Bigr) \cap A_n$$

Assuming independence of the flips, i.e. independence of the $A_n$'s, $$P(H_n) = \Bigl(\prod_{k=1}^{n-1} P(A_k^C)\Bigr) \times P(A_n) = \frac{1}{2^n}$$
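As a quick check of this formula, here is a throwaway simulation (the trial count and the cap on flips are arbitrary choices):

```python
import random
from collections import Counter

def first_head_index(max_tosses: int = 64) -> int:
    """Flip a fair coin until the first head; return its 1-based index (capped)."""
    for i in range(1, max_tosses + 1):
        if random.random() < 0.5:
            return i
    return max_tosses  # all tails within the cap: astronomically unlikely

trials = 200_000
counts = Counter(first_head_index() for _ in range(trials))
for n in range(1, 6):
    print(f"P(H_{n}) ~ {counts[n] / trials:.4f}  (theory {2**-n:.4f})")
```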


Now for Q1, we want to show that, almost surely, at least one of the flips will be heads

Or:

$$P(\bigcup_{n=1}^{\infty} A_n) = 1$$

Or:

Almost surely, $\forall \omega \in \Omega$,

$$\omega \in \bigcup_{n=1}^{\infty} A_n$$

Or:

$$P(\exists z \ge 1 \text{ s.t. } \omega \in A_z) = 1$$

Or:

Almost surely, $\forall \omega \in \Omega$,

$$\exists z \ge 1 \text{ s.t. } \omega \in A_z$$


  1. One route: Now we can do what you attempted earlier because the $H_n$'s are pairwise disjoint (*): $$P(\bigcup_{n=1}^{\infty} H_n) = \sum_{n=1}^{\infty} P(H_n) = 1 \ \text{if the flips are independent}$$

Hence almost surely, $\forall \omega \in \Omega$,

$$\omega \in \bigcup_{n=1}^{\infty} H_n$$

$\to \exists! \ q \in \mathbb N$ s.t. $\omega \in H_q$

Observe that $H_q \subseteq A_q$.

Hence, $$\omega \in A_q \subseteq \bigcup_{n=1}^{\infty} A_n \ QED$$

Or prove that $$\bigcup_{n=1}^{\infty} H_n = \bigcup_{n=1}^{\infty} A_n$$


(*) They are pairwise disjoint because, $\forall m > n$: suppose

$$\omega \in H_n \cap H_m$$

Then $\omega \in H_n$ and $\omega \in H_m$.

Now, $$\omega \in H_n \to \omega \in A_n$$

However, since $m > n$, $$\omega \in H_m \to \omega \in A_n^c ↯ \ QED$$


  2. Another route:

$$P(\bigcup_{n=1}^{\infty} A_n) = 1 - P(\bigcap_{n=1}^{\infty} A_n^C)$$

$$= 1 - \prod_{n=1}^{\infty} P(A_n^C) \ \text{if the flips are independent}$$

$$= 1 - \prod_{n=1}^{\infty} (1/2)$$

$$= 1 - \lim_{m \to \infty} \prod_{n=1}^{m} (1/2)$$

$$= 1 - \lim_{m \to \infty} (1/2)^m = 1 - 0 = 1 \ QED$$


  3. Yet another route:

$$\because \sum_{n=1}^{\infty} P(A_n) = \infty,$$

by Borel–Cantelli Lemma 2 (BCL2), if the flips are independent, we have $P(\limsup A_n) = 1$

Observe that $$\limsup A_n \subseteq \bigcup_{n=1}^{\infty} A_n$$

Hence, by monotonicity of probability, $$P(\bigcup_{n=1}^{\infty} A_n) = 1 \ QED$$

Or:

Hence, almost surely, $\forall \omega \in \Omega, \forall m \ge 1, \exists n \ge m$ s.t.

$$\omega \in A_n \subseteq \bigcup_{n=1}^{\infty} A_n$$
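For reference, since they are only named above, the two Borel–Cantelli lemmas being invoked are the standard ones:

$$\sum_{n=1}^{\infty} P(E_n) < \infty \to P(\limsup_n E_n) = 0 \quad \text{(BCL1)}$$

$$E_n \text{ independent and } \sum_{n=1}^{\infty} P(E_n) = \infty \to P(\limsup_n E_n) = 1 \quad \text{(BCL2)}$$

where $\limsup_n E_n = \bigcap_{m \ge 1} \bigcup_{n \ge m} E_n$ is the event that infinitely many of the $E_n$ occur.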


For Q2, let $B_{n,r}$ be a block of length $r$ starting at the $n$th flip, where

$$B_{n,r} = \bigcap_{i=n}^{n+r-1} A_i^*$$

where $A_i^* = A_i$ or $A_i^C$, according to the given pattern of heads and tails

$$\because \sum_{j=0}^{\infty} P(B_{jr+1,r}) = \sum_{j=0}^{\infty} \frac{1}{2^r} = \infty,$$

using BCL2 again gives us (restricting to the blocks $B_{1,r}, B_{r+1,r}, B_{2r+1,r}, \ldots$, which are independent because they involve non-overlapping flips)

$$P(\limsup_j B_{jr+1,r}) = 1$$

This means that almost surely $\forall \omega \in \Omega, \forall m \ge 1, \exists n \ge m$ s.t. $\omega \in B_{n,r} \ QED$
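To see the block argument in action, here is a minimal Python sketch (the pattern and sequence lengths are arbitrary) counting how often the pattern lands in the disjoint blocks:

```python
import random

def disjoint_block_hits(pattern: str, n_tosses: int) -> int:
    """Count occurrences of `pattern` among the disjoint blocks starting at
    positions 1, r+1, 2r+1, ... of a simulated fair-coin sequence."""
    r = len(pattern)
    seq = "".join(random.choice("HT") for _ in range(n_tosses))
    return sum(seq[i:i + r] == pattern for i in range(0, n_tosses - r + 1, r))

pattern = "HHT"  # arbitrary illustrative pattern
for N in (100, 1_000, 10_000):
    expected = (N // len(pattern)) / 2 ** len(pattern)
    print(f"N={N:6d}: {disjoint_block_hits(pattern, N)} hits (expect about {expected:.0f})")
```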

To see why the statement doesn't hold for a block of infinite length, define $$B_{n, \infty} := \lim_{r \to \infty} B_{n, r} = \bigcap_{r \ge 1} B_{n, r}$$

By continuity of probability from above (the $B_{n,r}$ decrease in $r$), $P(B_{n, \infty}) = \lim_{r \to \infty} P(B_{n, r}) = \lim_{r \to \infty} \frac{1}{2^r} = 0$

$$\because \sum_{n=1}^{\infty} P(B_{n,\infty}) = \sum_{n=1}^{\infty} 0 < \infty,$$

BCL1 gives us

$$P(\limsup B_{n,\infty}) = 0$$


For Q3, Murphy's Law is 'Anything that can go wrong, will go wrong', which is technically false:

Flipping 10 coins is a 'thing'. If we define 'go wrong' to be 'at least one head', then we may have 10 tails.

Mathematically,

we can have $P(\bigcup_{n=1}^{10} E_n) \ne 1$ even if $P(E_n) > 0$ for each $n$

or

$P(\bigcup_{n=1}^{\infty} E_n) \ne 1$ where $E_k = \emptyset$ for $k \ge 11$ even if $P(E_n) > 0$ for $n = 1, 2, ..., 10$

Even with infinitely many flips, if $P(E_k) = 0$ for $k \ge 11$, we still may not have $P(\bigcup_{n=1}^{\infty} E_n) = 1$
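Concretely, with $E_n = \{\text{the } n\text{th flip is heads}\}$ and ten independent fair flips,

$$P(\bigcup_{n=1}^{10} E_n) = 1 - (1/2)^{10} = \frac{1023}{1024} < 1,$$

so every $E_n$ has positive probability and yet, with probability $1/1024$, nothing 'goes wrong'.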

To get Murphy's Law #2, let us add the condition that it is not the case that all but a finite number of the $E_n$'s have zero probability (i.e., the $E_n$'s have positive probability infinitely often).

To put Murphy's Law #2 mathematically,

$$P(E_n) > 0 \ \text{i.o.} \to P(\bigcup_n E_n) = 1$$

Is Murphy's Law #2 true? If not, what are some sufficient conditions for Murphy's Law #2?

Case 0: $\exists z \in \mathbb N \text{ s.t. } P(E_z) = 1$

Obviously, Murphy's Law #2 holds.

Case 1: $E_n$'s are independent with $\sum_n P(E_n) = \infty$, $P(E_n) < 1$

$$P(\bigcup_{n=1}^{\infty} E_n) = 1 - P(\bigcap_{n=1}^{\infty} E_n^C)$$

$$= 1 - \prod_{n=1}^{\infty} P(E_n^C) = 1$$

(the infinite product is $0$ precisely because $\sum_n P(E_n) = \infty$; if the sum converges, the product is positive, as in the counterexample after Case 5)

Case 2: $E_n$'s are not independent but disjoint and $\sum_n P(E_n) = 1$, $P(E_n) < 1$

$$P(\bigcup_{n=1}^{\infty} E_n) = \sum_n P(E_n) = 1$$

Case 3: $E_n$'s are disjoint, not independent but $\sum_n P(E_n) = \infty$, $P(E_n) < 1$

Impossible: for pairwise disjoint events, $\sum_n P(E_n) = P(\bigcup_n E_n) \le 1$.

Case 4: $E_n$'s are not independent but $\sum_n P(E_n) = \infty$, $P(E_n) < 1$

$$\sum_n P(E_n) = \infty \to P(\limsup E_n) = 1 \to P(\bigcup_{n=1}^{\infty} E_n) = 1$$

Just kidding. BCL2 needs independence.

Case 5: $E_n$'s are not independent but $1 < \sum_n P(E_n) < \infty$, $P(E_n) < 1$

Here, $\sum_n P(E_n) < \infty$, so BCL1 gives $P(\limsup E_n) = 0$, i.e. $P(\liminf E_n^C) = 1$. So for some $m$, $\omega \in E_m^C, E_{m+1}^C, \ldots$.

$\omega$ may or may not be in $\bigcup_{n=1}^{m-1} E_n$.

So Murphy's Law #2 does not hold.
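For a concrete counterexample in the spirit of Case 5 (the $E_n$ here happen to be independent, which shows that not even independence rescues Murphy's Law #2 once the sum is finite): take $P(E_n) = 1/2$ for $n \le 3$ and $P(E_n) = 4^{-n}$ for $n \ge 4$, so that $1 < \sum_n P(E_n) < \infty$ and $P(E_n) > 0$ for all $n$. Then

$$P(\bigcup_n E_n) = 1 - \prod_n (1 - P(E_n)) \le 1 - \frac{1}{8}\Bigl(1 - \sum_{n \ge 4} 4^{-n}\Bigr) = 1 - \frac{1}{8} \cdot \frac{191}{192} < 1.$$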


To sum up:

Case 0 is obvious. Case 1 corresponds to Q1. Case 2 corresponds to Q2. Case 3 is impossible. Cases 4 and 5 suggest counterexamples.

BCLC
  • I like your rigorous style. In my proof, $A_{n}$ is the event that a tail appears $n-1$ times, followed by a head. – Quasar Nov 21 '15 at 02:19
  • @Quasar Is it? You said that $A_n$ is the event where nth flip is heads, right? Hence $P(A_n) = 1/2$ not $\frac{1}{2^n}$ – BCLC Nov 21 '15 at 02:20
  • I should have been more specific, especially when it comes to probability and math. I am getting back to math after a while and need to train myself to think mathematically. Is there a way I could add you as a friend here, or on Facebook? – Quasar Nov 21 '15 at 02:22
  • @Quasar Weird. Didn't get notified about edit. You can use StackEye :)) Any particular reason why? I found you on linkedin. I guess I could message you – BCLC Nov 22 '15 at 15:03
  • Two quick things: 1) Could you elaborate in words how you mathematically proved that $H_{m}$ and $H_{n}$ are pairwise disjoint? 2) I haven't yet studied the Borel–Cantelli lemma, so your proof of the second part of the question went over my head. – Quasar Nov 24 '15 at 03:50
  • @Quasar 1 Instead of $n$ and $m$, try $n$ and $n+1$. If $\omega$ is in $H_{n+1}$, it is in $A_{n+1}$ but not in $A_n$. But if it is in $H_n$, it is in $A_n$. Hence $\omega$ being in $H_n$ and $H_{n+1}$ means it is both not in and in $A_n$. 2 I think you'll get to Borel–Cantelli soon; it's an early topic in probability theory. I remember probability spaces, then events, then limsup/liminf, then random variables and distribution functions, then independence, then Borel–Cantelli. Wait, are you talking about Q2? The only Borel–Cantelli part is deducing the probability of the limsup. Which part don't you understand? The (*)? – BCLC Nov 24 '15 at 05:57