
Suppose one flips a coin $N$ times, and every single time it comes out heads (H). For some sufficiently large $N = N_0$ (10? 20? 100?), even someone who knows nothing of probability theory would reach the conclusion that the coin is not fair.$^1$

For the sake of this question, let's say that $N_0 = 20$.

Suppose that one wanted to use probability theory to justify the conclusion that the coin is not fair. How would one do it?

The argument I am familiar with goes something like this: the probability of getting $N_0 = 20$ heads out of $N_0 = 20$ flips of a fair coin is $(\frac{1}{2})^{N_0} = (\frac{1}{2})^{20} \approx 9.5\times 10^{-7}$. In other words, this result is too improbable, and therefore we conclude that the coin must not be fair.

One problem with this argument, at least as I have worded it, is that it would apply to any sequence of 20 coin flips. For example, if the sequence of results had been the very respectable-looking HHTTHTTTHTHHTTTHTHTT instead, one could argue exactly as before: the probability of getting this sequence out of $N_0 = 20$ flips of a fair coin is $(\frac{1}{2})^{N_0} = (\frac{1}{2})^{20} \approx 9.5\times 10^{-7}$; this result is too improbable, and therefore we conclude that the coin must not be fair.

Clearly, my reasoning, or at least its wording, can't be right, since it leads to the absurd conclusion that no coin can be fair.

What is the correct way to use probability theory to justify the conclusion that a coin that produces heads in every one of $N_0$ flips (for some sufficiently large $N_0$) must not be fair?


$^1$Here and elsewhere in this post, strictly speaking, instead of "the coin is not fair", I should have written something like "it is very unlikely that the coin is fair," but I ultimately decided that this additional precision may end up derailing the discussion. If you feel that the wording I rejected is actually essential to reasoning properly through the situation, please feel free to use it.

kjo

2 Answers


You can do this in a Bayesian framework. To keep things simple, suppose there are three kinds of coins: fair, always heads, and always tails. You can write down a prior probability distribution describing how likely you think your coin is to be one of these three types, knowing nothing about it; say you think there's a 90% prior probability that the coin is fair, a 5% probability that it always lands heads, and a 5% probability that it always lands tails.

Then you can update this prior using Bayes' theorem to a posterior probability distribution which reflects how likely you now think your coin is to be fair after flipping $20$ heads in a row. This calculation is easiest to explain in terms of odds ratios; the prior odds are $90 : 5 : 5$, or $18 : 1 : 1$, and the posterior odds are obtained by multiplying by the likelihood $1 : 2 : 0$ (the probability of a single head is $\frac{1}{2}$, $1$, and $0$ under the three hypotheses, respectively) twenty times, which gives

$$18 : 2^{20} : 0$$

so the posterior probability that the coin is fair is now $\frac{18}{18 + 2^{20}} \approx 1.7 \times 10^{-5}$. If your prior probability that the coin is fair is higher then it takes more heads to convince you that the coin is likely not fair, which is pretty intuitive.
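For concreteness, here is a minimal Python sketch of this discrete update; the dictionary layout and the use of exact fractions are my own choices, and only the $90:5:5$ prior and the twenty heads come from the answer.

```python
from fractions import Fraction

# Three hypotheses about the coin, with the prior used above:
# 90% fair, 5% always heads, 5% always tails.
prior = {"fair": Fraction(90, 100),
         "always_heads": Fraction(5, 100),
         "always_tails": Fraction(5, 100)}

# Probability of a single head under each hypothesis.
p_head = {"fair": Fraction(1, 2),
          "always_heads": Fraction(1),
          "always_tails": Fraction(0)}

n_heads = 20

# Bayes' theorem: posterior is proportional to prior times likelihood of the data.
unnormalized = {h: prior[h] * p_head[h] ** n_heads for h in prior}
total = sum(unnormalized.values())
posterior = {h: unnormalized[h] / total for h in prior}

print(float(posterior["fair"]))  # ~1.7e-05, i.e. 18 / (18 + 2**20)
```

Exact fractions are used only so the arithmetic reproduces the odds ratio exactly; floating-point numbers would do just as well.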

This simple prior cannot handle the case where you flip, say, $19$ heads out of $20$: the single tail rules out both the always-heads and the always-tails hypotheses, so the posterior would put all of its mass on the fair coin. In that case you might instead suspect that the true bias of the coin is something like 95% heads. You can model this more complicated situation using a continuous prior, say a 90% prior probability that the coin is fair together with a 10% probability that the bias of the coin is uniformly distributed in $[0, 1]$. Then you can calculate the posterior probability distribution as before, although the calculation requires some integrals.
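A rough sketch of the mixture-prior calculation for the all-heads case; the 90%/10% split is the one proposed above, while the use of SciPy's numerical integration is my own choice (the closed form $1/(n+1)$ mentioned in the comments below gives the same number).

```python
from scipy.integrate import quad

n, k = 20, 20  # 20 flips, 20 heads observed

# Likelihood of the data if the coin is fair.
lik_fair = 0.5 ** n

# Marginal likelihood under the uniform-bias component:
# the integral over p in [0, 1] of p^k (1 - p)^(n - k) dp.
# (For k = n this is exactly 1 / (n + 1), by the rule of succession.)
lik_uniform, _ = quad(lambda p: p ** k * (1 - p) ** (n - k), 0, 1)

# Posterior probability that the coin is fair under the 90% / 10% mixture prior.
post_fair = 0.9 * lik_fair / (0.9 * lik_fair + 0.1 * lik_uniform)
print(post_fair)  # ~1.8e-04: still overwhelming evidence against fairness
```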

Qiaochu Yuan
    You could further ask: suppose you flipped a sequence like 10 heads in a row, then 10 tails. You might still suspect something wonky is going on, but it's not wonkiness expressible in terms of the coin being biased: instead you might suspect that the coin flips are not independent. This would be a stranger situation but in principle could also be modeled by a more complicated prior over the possible ways the coin flips could be generated if they aren't independent. Taking this train of thought far enough gets you to what is called Solomonoff induction, which formalizes Occam's razor. – Qiaochu Yuan Jan 31 '23 at 23:08
    Wikipedia link: https://en.wikipedia.org/wiki/Solomonoff%27s_theory_of_inductive_inference – Qiaochu Yuan Jan 31 '23 at 23:08
    +1. Actually, the uniform prior on $[0,1]$ for the bias doesn't require any integrals – in that case all counts of heads/tails are equiprobable by symmetry – see How to prove the rule of succession without calculus?. – joriki Feb 01 '23 at 00:03
    @joriki: ah, right, silly me. In fact I even gave this argument recently... https://math.stackexchange.com/questions/4621337/asking-for-an-intuitive-explanation-of-a-probability-problem/4621346#4621346 – Qiaochu Yuan Feb 01 '23 at 00:14
  • I see :-) That was actually a duplicate of this one: https://math.stackexchange.com/questions/3514574. I think we should close one as a duplicate of the other? – joriki Feb 01 '23 at 00:21
    Sure, let's close the more recent one, I just voted. – Qiaochu Yuan Feb 01 '23 at 00:32

Suppose that one wanted to use probability theory to justify the conclusion that the coin is not fair. How would one do it?

The binomial test is well equipped to answer this question, since by construction the binomial distribution is order-agnostic. We model the number of heads $X$ as following a binomial distribution: $X \sim \text{Bin}(20, \pi)$. The binomial distribution models the number of successes in $n$ trials of an independent experiment, each with probability of success $\pi$.

Let the null hypothesis be $H_0 \colon \pi = 0.5$. Our expectation is that in tossing such a coin $20$ times, we will see $20 \cdot 0.5 = 10$ heads. We calculate the $p$-value for this test---roughly speaking, this is the probability that, if the coin really is fair, we'd see a result at least as extreme as the one we saw---and then use the size of the $p$-value to decide how we feel about whether the coin really was fair or not.

Since we saw $20$ heads in $20$ tosses, our $p$-value can be computed as follows:

\begin{align*} p &= \mathbf P[X \geq 20]\\ &= \sum_{k = 20}^{20} \mathbf P[X = k]\\ &= \sum_{k = 20}^{20} \binom{20}{k}\left(\frac{1}{2}\right)^k \left(\frac{1}{2}\right)^{20 - k}\\ &= \binom{20}{20}\left(\frac{1}{2}\right)^{20} \left(\frac{1}{2}\right)^{20 - 20}\\ &= \frac{1}{2^{20}} \end{align*}

So if the coin really were fair, the result we saw would be roughly a one-in-a-million occurrence. Such a result is, strictly speaking, perfectly possible with a fair coin, but we may instead interpret it as evidence that the coin is not actually fair.

It is common to set a magic threshold for the $p$-value at 0.05, so $p$-values below this level are considered "significant". This is arbitrary.
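As a quick numerical check (a sketch of my own, not part of the original answer), the sum above can be re-evaluated with the Python standard library:

```python
from math import comb

n, observed = 20, 20

# One-sided p-value P[X >= observed] under H0: X ~ Bin(20, 0.5),
# summed term by term exactly as in the display above.
p_value = sum(comb(n, x) * 0.5 ** x * 0.5 ** (n - x)
              for x in range(observed, n + 1))

print(p_value)           # 9.5367431640625e-07, i.e. 1 / 2**20
print(p_value < 0.05)    # True: "significant" at the conventional 0.05 threshold
```

If SciPy is available, `scipy.stats.binomtest(20, 20, p=0.5, alternative='greater')` should report the same $p$-value.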

NB: This kind of question is really in the wheelhouse of the statistics folks at CrossValidated.

Novice