
This question relates to the YouTube video "How Lucky is Too Lucky?" by Matt Parker. In it, he poses the following question. Parker has published a list of $100$ tosses of a fair coin. A "malicious actor" wants to claim that in fact, Parker made the tosses up. He searches through the published tosses, and finds $12$ consecutive tosses comprising $2$ tails and $10$ heads. The probability of $2$ or fewer tails in $12$ tosses of a fair coin is less than $2\%$, so the actor claims that the tosses cannot possibly be legitimate.
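The tail probability quoted above can be checked directly. The following is a minimal sketch using exact fractions; the helper name `tail_probability` is an illustrative choice, not something from the video.

```python
from fractions import Fraction
from math import comb

def tail_probability(n, k):
    """P(at most k tails in n tosses of a fair coin), as an exact fraction."""
    return Fraction(sum(comb(n, j) for j in range(k + 1)), 2 ** n)

p = tail_probability(12, 2)
print(p, float(p))  # 79/4096 0.019287109375
```

The value is below $2\%$ as stated, but slightly above $0.019$, so whether this particular run counts as unlikely depends on whether the cutoff is taken as exactly $0.019$ or as $1.9\%$ after rounding.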

Now Parker addresses the probability that such an anomalous string occurs. There are $5050=\binom{100}2+100$ contiguous sub-runs of length $1$ to $100$. What is the probability that at least one of them is unlikely? If we have $n$ tosses comprising $t$ tails and $h$ heads, then the run is unlikely if the probability of getting $t$ or fewer tails is $\leq.019$ or the probability of getting $h$ or fewer heads is $\leq.019$.
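As a small sanity check on the counting, the sketch below tallies the contiguous sub-runs and finds the shortest length at which a run can be unlikely at all. The constant names are illustrative, and the cutoff of $0.019$ is the one quoted above.

```python
from math import comb

TOTAL_FLIPS = 100
CUTOFF = 0.019  # threshold quoted in the question

# A contiguous sub-run is fixed by its length n and its starting position.
sub_runs = sum(TOTAL_FLIPS - n + 1 for n in range(1, TOTAL_FLIPS + 1))
print(sub_runs, comb(TOTAL_FLIPS, 2) + TOTAL_FLIPS)  # 5050 5050

# Even an all-heads run of length n has probability 2**-n, so runs shorter
# than this length can never fall at or below the cutoff.
shortest_unlikely = next(
    n for n in range(1, TOTAL_FLIPS + 1) if 0.5 ** n <= CUTOFF
)
print(shortest_unlikely)  # 6
```

So although all $5050$ sub-runs are nominally in play, only those of length $6$ or more can ever qualify.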

I should stress that the malicious actor does not choose $n$ in advance. He looks for an unlikely string of any length.

At about $16:25$ in the video, Parker says that the exact probability is $88.3\%$, but gives no indication of how this number is arrived at. The problem, of course, is that the substrings overlap, so the events are not independent.

Of course, he says throughout that he won't go into the technical details of the math, but I haven't figured out how this number was arrived at. It's easy to confirm by simulation, but I don't think Parker would have used the phrase "exact value" without a theoretical calculation to back it up.
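To judge how closely a simulation can pin down a figure like $88.3\%$, the usual binomial standard error $\sqrt{p(1-p)/N}$ gives a rough guide. This is a generic sketch, not a reconstruction of whatever Parker did; the function names are illustrative.

```python
from math import ceil, sqrt

def standard_error(p, trials):
    """Standard error of a simulated proportion near p after `trials` runs."""
    return sqrt(p * (1 - p) / trials)

def trials_needed(p, target_se):
    """Smallest number of trials whose standard error is at most target_se."""
    return ceil(p * (1 - p) / target_se ** 2)

print(standard_error(0.883, 10 ** 6))  # about 0.00032
print(trials_needed(0.883, 0.0001))
```

With a million trials the standard error is about $0.0003$, so the third significant figure is only just resolved; more trials would be needed to be confident in it.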

$2^{100}$ is on the order of $10^{30}$, so generating and counting the admissible runs is infeasible. I've thought about trying to write a recurrence relation, but it seems hopeless. Usually, in this sort of problem, if $a_n$ is the number of admissible strings of length $n$, we have to break the recurrence up depending on the last characters of an admissible string of length $n-1$, $n-2$, and so on. There seem to be too many possibilities in this case.

An approximate calculation with a small error bound would be fine. Can you point me in the right direction?

Just in case I haven't described the problem comprehensibly, I append my simulation script:

from math import factorial
from random import choices

def choose(n,m): return factorial(n)//(factorial(m)*factorial(n-m))

epsilon = .019

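# critical[n] = largest count m of the scarcer side such that
# P(at most m in n fair tosses) <= epsilon
critical = { }
for n in range(6,101):
    prob = 0
    mu = 2**(-n)
    for m in range(n+1):
        prob += choose(n,m)*mu
        if prob > epsilon:
            critical[n] = m-1
            break

def test(trials):
    # fraction of simulated 100-toss sequences containing an unlikely run
    success = 0
    for _ in range(trials):
        flips = choices(range(2), k=100)
        success += anomalous(flips)
    return success/trials

def anomalous(flips):
    # check every contiguous run of every length m >= 6
    for m in range(6, 101):
        for s in range(101-m):
            run = flips[s:s+m]
            tails = run.count(0)
            if min(tails, m-tails) <= critical[m]:
                return True
    return False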

saulspatz
  • "... sub-runs of length 1 to 100. What is the probability that at least one of them is unlikely?" That (including all subsequences, even of length 1) does not make sense to me. Kindly tell me when a subsequence of length 1 should be considered "unlikely"? – leonbloy Feb 16 '21 at 02:00
  • @leonbloy "the run is unlikely if the probability of getting $t$ or fewer tails is $\leq.019$ or the probability of getting $h$ or fewer heads is $\leq.019$." – saulspatz Feb 16 '21 at 02:01
  • Then, restricting to the runs of length 12, all the runs with $0,1,2, 10,11,12$ tails would qualify as "unlikely"? If so, the probability should be way higher than 88.3% – leonbloy Feb 16 '21 at 03:26
  • @leonbloy My simulation bears out Parker's claim very strongly. What evidence have you for your assertion? Also note that when Henry restricted the runs to length 12, he got a probability substantially lower than $88.3\%$. – saulspatz Feb 16 '21 at 04:49

1 Answer


My own attempt at simulation in R suggests a probability of about $0.41$ for the one-tailed version of $2$ or fewer tails in $12$ consecutive flips and about $0.68$ for the two-tailed version, rather smaller than $88.3\%$. Testing it on $6$ flips, runs of $4$ and extremes of $1$ or fewer tails/heads, it comes close enough to the exact $\frac{31}{64}$ one-tailed and $\frac{58}{64}$ two-tailed values to suggest the larger simulation was reasonable.

simlookingforextreme <- function(totalflips, runlength, extreme){
  flips <- sample(c(0,1), totalflips, replace=TRUE)
  runs <- diff(c(0,cumsum(flips)), runlength)
  c(min(runs) <= extreme , max(runs) >= runlength - extreme)
  }

giving

set.seed(2021)
sims <- replicate(10^6, simlookingforextreme(100, 12, 2))
c(low=mean(sims[1,]), high=mean(sims[2,]), two=mean(sims[1,] | sims[2,]))
#      low     high      two 
# 0.414857 0.414524 0.677665

set.seed(1)
sims <- replicate(10^6, simlookingforextreme(6, 4, 1))
c(low=mean(sims[1,]), high=mean(sims[2,]), two=mean(sims[1,] | sims[2,]))
#      low     high      two 
# 0.484576 0.484273 0.906638
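The small case is tiny enough to enumerate outright, which gives the exact fractions the simulation is being compared against. This is a brute-force sketch in Python (the language of the question's script); the function name `brute_force` is illustrative.

```python
from fractions import Fraction
from itertools import product

def brute_force(total_flips, run_length, extreme):
    """Exact (low, high, two-tailed) probabilities by listing every sequence."""
    low = high = two = 0
    for flips in product((0, 1), repeat=total_flips):
        # Number of ones in each window of run_length consecutive flips.
        sums = [sum(flips[s:s + run_length])
                for s in range(total_flips - run_length + 1)]
        is_low = min(sums) <= extreme
        is_high = max(sums) >= run_length - extreme
        low += is_low
        high += is_high
        two += is_low or is_high
    total = 2 ** total_flips
    return Fraction(low, total), Fraction(high, total), Fraction(two, total)

print(brute_force(6, 4, 1))
```

For $6$ flips, runs of $4$ and extremes of $1$, this yields $\frac{31}{64}$, $\frac{31}{64}$ and $\frac{58}{64}=\frac{29}{32}$. The same enumeration is out of reach for $100$ flips.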

This encouraged me to try to check my simulations. There is an exact approach which does not count $2^{100}$ binary strings, but instead tracks the $2^{12}=4096$ possible strings of the most recent $12$ flips, spotting which are extreme, lopping off an end digit and sticking a new digit at the other end, and cycling through the remaining flips. Again in R:

exactprobextreme <- function(totalflips, runlength, extreme, 
                             extremetail){ # "low", "high", "two"
  runscores <- 0
  for (i in 1:runlength){ 
    runscores <- c(runscores, runscores+1)
    }  
  scoreprobs <- rep(1/2^runlength, 2^runlength)
  extremeprob <- 0
  for (n in runlength:totalflips){
    if(extremetail == "low" | extremetail == "two"){
      extremeprob <- extremeprob + 
                     sum(scoreprobs[runscores <= extreme])
      scoreprobs[runscores <= extreme] <- 0
      }
    if(extremetail == "high" | extremetail == "two"){
      extremeprob <- extremeprob + 
                     sum(scoreprobs[runscores >= runlength - extreme])
      scoreprobs[runscores >= runlength - extreme] <- 0
      }      
    scoreprobs[1:(2^(runlength-1))] <- (
       scoreprobs[2*(1:(2^(runlength-1)))] + 
       scoreprobs[2*(1:(2^(runlength-1)))-1] )/2
    scoreprobs[(2^(runlength-1)+1):(2^runlength)] <- 
        scoreprobs[1:(2^(runlength-1))] 
    }
    extremeprob
  }

giving (faster than the simulation)

exactprobextreme(totalflips=100,runlength=12,extreme=2, extremetail="low")
# 0.4145669
exactprobextreme(totalflips=100,runlength=12,extreme=2, extremetail="high")
# 0.4145669
exactprobextreme(totalflips=100,runlength=12,extreme=2, extremetail="two")
# 0.6770409
exactprobextreme(totalflips=6,  runlength=4, extreme=1, extremetail="low")
# 0.484375
exactprobextreme(totalflips=6,  runlength=4, extreme=1, extremetail="high")
# 0.484375
exactprobextreme(totalflips=6,  runlength=4, extreme=1, extremetail="two")
# 0.90625

strengthening my opinion that the $88.3\%$ should be $67.7\%$.
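For readers following the Python script in the question, the same sliding-window calculation can be sketched in Python. This is a loose port of the idea rather than a line-by-line translation: the function name, the bit-mask state encoding and the `tail` argument are choices made for this sketch.

```python
def exact_prob_extreme(total_flips, run_length, extreme, tail="two"):
    """Probability that some window of run_length flips is extreme.

    A window is "low" if it has at most `extreme` ones and "high" if it has
    at least run_length - extreme ones; `tail` is "low", "high" or "two".
    """
    size = 1 << run_length          # one state per possible window content
    mask = size - 1

    def is_extreme(state):
        ones = bin(state).count("1")
        low = ones <= extreme
        high = ones >= run_length - extreme
        return {"low": low, "high": high, "two": low or high}[tail]

    extreme_state = [is_extreme(s) for s in range(size)]
    probs = [1.0 / size] * size     # distribution of the first full window
    absorbed = 0.0                  # probability already caught as extreme

    for n in range(run_length, total_flips + 1):
        # Move the mass sitting on extreme windows into the absorbed total.
        for s in range(size):
            if extreme_state[s] and probs[s]:
                absorbed += probs[s]
                probs[s] = 0.0
        if n == total_flips:
            break
        # Slide the window: drop the oldest flip, append a fair new flip.
        new_probs = [0.0] * size
        for s, p in enumerate(probs):
            if p:
                base = (s << 1) & mask
                new_probs[base] += p / 2
                new_probs[base | 1] += p / 2
        probs = new_probs
    return absorbed

for tail in ("low", "high", "two"):
    print(tail, exact_prob_extreme(6, 4, 1, tail))
```

On the small case this gives $0.484375$, $0.484375$ and $0.90625$, matching $\frac{31}{64}$ and $\frac{58}{64}$. Calling it with `(100, 12, 2)` should reproduce the fixed-length figures above. It does not address runs of arbitrary length, where the state would have to remember far more than the last $12$ flips.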

Or maybe I have misunderstood. Looking at the video after writing this and then rereading your question, it seems that the runs did not have to be of length $12$, and unlikely runs of any length could have been considered.

Henry
  • Or maybe I have misunderstood. Looking at the video, it seems that the runs did not have to be of length $12$, and unlikely runs of any length could have been considered. – Henry Feb 16 '21 at 00:51
  • I don't know R, but it seems to me that you are only considering the event of two or fewer heads or tails in a run of $12$. As I tried to explain in the question, Parker considers the possibility of finding a run of any length $n$ where the probability of having no more occurrences than those of the scarcer side is less than $.019.$ (The code at the beginning of my script calculates this critical value for $n=6,\dots,100$.) I'll try to make this more explicit in my question. – saulspatz Feb 16 '21 at 00:55
  • @saulspatz - indeed I spotted that point and edited my answer just before your comment. If you have confirmed his value by simulation, then I suspect he may also have used a large enough simulation to get close enough to a particular value to be confident that it is exact to $3$ significant figures – Henry Feb 16 '21 at 00:58
  • That hadn't occurred to me. Thanks. – saulspatz Feb 16 '21 at 01:04