21

Toss a fair coin $N$ times and record the results as a sequence such as $HTHTHHTT\dots~$, where '$H$' denotes 'heads' and '$T$' denotes 'tails' for each individual toss.

What is the probability that the length of the longest streak of consecutive heads is greater than or equal to $k$? (That is, the result contains a substring $HHHH\dots~$ of length at least $k$.)

I came up with a recursive solution (though I'm not quite sure it's correct), but I cannot find a closed-form solution.

Here is my solution.

Denote by $P(N,k)$ the probability that, when tossing the coin $N$ times, the longest run of consecutive heads has length greater than or equal to $k$. Then (for $N>k$)

$$ P(N,k)=P(N-1,k)+\Big(1-P(N-k-1,k)\Big)\left(\frac{1}{2}\right)^{k+1} $$
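The recurrence is easy to check numerically. Here is a Python sketch (the function names are mine, for illustration) that evaluates it with the base cases $P(n,k)=0$ for $n<k$ and $P(k,k)=1/2^k$, and confirms it against brute-force enumeration over all $2^N$ sequences:

```python
from functools import lru_cache
from itertools import product

def P(N, k):
    """The recurrence P(N,k) = P(N-1,k) + (1 - P(N-k-1,k)) / 2**(k+1),
    with base cases P(n,k) = 0 for n < k and P(k,k) = 1 / 2**k."""
    @lru_cache(maxsize=None)
    def p(n):
        if n < k:
            return 0.0
        if n == k:
            return 1.0 / 2**k
        return p(n - 1) + (1.0 - p(n - k - 1)) / 2**(k + 1)
    return p(N)

def P_brute(N, k):
    """Exact check: enumerate all 2**N toss sequences (small N only)."""
    hits = sum('H' * k in ''.join(seq) for seq in product('HT', repeat=N))
    return hits / 2**N

# the recurrence agrees with brute-force enumeration:
assert all(abs(P(n, k) - P_brute(n, k)) < 1e-12
           for k in (1, 2, 3) for n in range(10))
```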

joriki
  • 238,052
Benson
  • 519
  • Your example sequence reminded me of this brilliant Simpsons scene :-) – joriki Nov 10 '12 at 08:58
    I'm getting $(N-k+1)/2^k$ for the closed form. Would you like for me to post my solution, or do you want to think about it some more before I spoil the beans? – Braindead Nov 10 '12 at 08:59
    @Braindead: That can't be right; it's $\gt1$ for small $k$. – joriki Nov 10 '12 at 09:00
  • Duh, you are right. I overcounted. – Braindead Nov 10 '12 at 09:08
    I tried to clarify some of the formulations; please check whether I preserved the intended meaning. In particular, I assumed that you had merely accidentally written "greater" once instead of "greater or equal". – joriki Nov 10 '12 at 09:12
  • @joriki, yes, you've preserved the intended meaning. 'at least' should mean greater or equal – Benson Nov 10 '12 at 09:16
  • @Benson: It does, but you had only "greater" in one place. – joriki Nov 10 '12 at 09:17
  • The approach I took was to divide into cases where the exact number of heads is known. Supposing that there is a $k$ block of heads in $N$ tosses, the number of heads can be exactly $k$, $k+1$,..., $N$. In my first failed attempt at the solution, I overcounted the number of possible arrangements for $k$-block with $r$ heads. I fixed this, and right now I'm checking to see if I overlooked anything... – Braindead Nov 10 '12 at 09:29
  • Can't this be simplified to $P(N,k)=P(N-1,k)+\frac{1}{2}·P(N-1,k-1)$? – GregRos Nov 10 '12 at 13:09
  • how do i calculate $P(N-1,k)$? @GregRos – Mining Sep 23 '21 at 13:43

5 Answers

11

We can derive an explicit formula for the probability $P(N,k)$ based upon the Goulden-Jackson Cluster Method.

We consider the set of words $\mathcal{V}^{\star}$ of length $N\geq 0$ built from an alphabet $$\mathcal{V}=\{H,T\}$$ and the bad word $\underbrace{HH\ldots H}_{k \text{ elements }}=:H^k$, which is not allowed to occur in the words we are looking for. We derive a function $f_k(s)$ whose coefficient of $s^N$ is the number of wanted words of length $N$.

The wanted probability $P(N,k)$ can then be written as \begin{align*} P(N,k)=1-\frac{1}{2^N}[s^N]f_k(s) \end{align*}

According to the paper (p. 7) of Goulden and Jackson, the generating function $f_k(s)$ is \begin{align*} f_k(s)=\frac{1}{1-ds-\text{weight}(\mathcal{C})}\tag{1} \end{align*} with $d=|\mathcal{V}|=2$ the size of the alphabet, and with $\mathcal{C}$ the cluster of the bad word, whose weight-numerator is \begin{align*} \text{weight}(\mathcal{C})=\text{weight}(\mathcal{C}[H^k]) \end{align*} We calculate according to the paper \begin{align*} \text{weight}(\mathcal{C}[H^k])&=-\frac{s^k}{1+s+\cdots+s^{k-1}}=-\frac{s^k(1-s)}{1-s^k} \end{align*}

We obtain the generating function $f_k(s)$ for the words built from $\{H,T\}$ which don't contain the substring $H^k$: \begin{align*} f_k(s)&=\frac{1}{1-2s+\frac{s^k(1-s)}{1-s^k}}\\ &=\frac{1-s^k}{1-2s+s^{k+1}}\tag{2}\\ \end{align*}


Note: For $k=2$ we obtain \begin{align*} f_2(s)&=\frac{1-s^2}{1-2s+s^{3}}\\ &=1+2s+3s^2+5s^3+8s^4+13s^5+21s^6+34s^7+\mathcal{O}(s^8) \end{align*}

The coefficients of $f_2(s)$ are a shifted variant of the Fibonacci numbers stored as A000045 in OEIS.

Note: For $k=3$ we obtain \begin{align*} f_3(s)&=\frac{1-s^3}{1-2s+s^{4}}\\ &=1+2s+4s^2+7s^3+13s^4+24s^5+44s^6+81s^7+\mathcal{O}(s^8) \end{align*}

The coefficients of $f_3(s)$ are a shifted variant of the so-called Tribonacci numbers stored as A000073 in OEIS.
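The expansions above can be reproduced programmatically from (2) by power-series division (the denominator has constant term $1$, so no division by a leading coefficient is needed). A small Python sketch, with the function name chosen for illustration:

```python
def no_run_counts(k, N):
    """Coefficients [s^n] f_k(s) for n = 0..N, where
    f_k(s) = (1 - s^k)/(1 - 2s + s^(k+1)) counts the H/T words of
    length n containing no run of k consecutive heads."""
    num = {0: 1, k: -1}               # numerator 1 - s^k
    den = {0: 1, 1: -2, k + 1: 1}     # denominator 1 - 2s + s^(k+1)
    a = []
    for n in range(N + 1):
        # power-series division: a_n = num_n - sum_{j>=1} den_j * a_{n-j}
        c = num.get(n, 0) - sum(den[j] * a[n - j] for j in den if 0 < j <= n)
        a.append(c)
    return a

print(no_run_counts(2, 7))   # [1, 2, 3, 5, 8, 13, 21, 34]  (shifted Fibonacci)
print(no_run_counts(3, 7))   # [1, 2, 4, 7, 13, 24, 44, 81] (shifted Tribonacci)
```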


We use the series representation of $f_k(s)$ in (2) to derive an explicit formula for the coefficients.

\begin{align*} [s^N]f_k(s)&=[s^N](1-s^k)\sum_{m=0}^{\infty}(2s-s^{k+1})^m\\ &=[s^N](1-s^k)\sum_{m=0}^{\infty}s^m(2-s^k)^m\\ &=[s^N](1-s^k)\sum_{m=0}^{\infty}s^m\sum_{j=0}^m\binom{m}{j}(-1)^js^{kj}2^{m-j}\\ &=([s^N]-[s^{N-k}])\sum_{m=0}^{\infty}s^m\sum_{j=0}^m\binom{m}{j}(-1)^js^{kj}2^{m-j}\tag{3}\\ &=\sum_{m=0}^{N}([s^{N-m}]-[s^{N-k-m}])\sum_{j=0}^m\binom{m}{j}(-1)^js^{kj}2^{m-j}\tag{4}\\ &=\sum_{{m=0}\atop{m\equiv N(\bmod k)}}^{N}\binom{m}{\frac{N-m}{k}}(-1)^{\frac{N-m}{k}}2^{m-\frac{N-m}{k}}\\ &\qquad-\sum_{{m=0}\atop{m\equiv N(\bmod k)}}^{N-k}\binom{m}{\frac{N-m}{k}-1}(-1)^{\frac{N-m}{k}-1}2^{m-\frac{N-m}{k}+1}\\ &=\sum_{{m=0}\atop{m\equiv N(\bmod k)}}^{k-1}\binom{m}{\frac{N-m}{k}}(-1)^{\frac{N-m}{k}}2^{m-\frac{N-m}{k}}\tag{5}\\ &\qquad+\sum_{{m=k}\atop{m\equiv N(\bmod k)}}^{N} \left(\binom{m}{\frac{N-m}{k}}-\frac{1}{2^k}\binom{m-k}{\frac{N-m}{k}}\right)(-1)^{\frac{N-m}{k}}2^{m-\frac{N-m}{k}}\\ \end{align*}

Comment:

  • In (3) we use the linearity of the coefficient-of operator and the rule $[s^N]s^mf(s)=[s^{N-m}]f(s)$.

  • In (4) we change the upper limit of the left-hand sum from $\infty$ to $N$, since the coefficient $[s^{N-m}]$ vanishes for $m>N$. Because of the factors $s^{kj}$, only summands with $m\equiv N(\bmod k)$ contribute in the following.

  • In (5) we reorganise the sums from the line above by splitting off the summands with $m<k$ from the left-hand sum, shifting the index of the right-hand sum by $k$, and combining the two sums.

We conclude: An explicit representation of the probability $P(N,k)$ $(N\geq 0)$ is according to (5) \begin{align*} P(N,k)&=1-\sum_{{m=0}\atop{m\equiv N(\bmod k)}}^{k-1}\binom{m}{\frac{N-m}{k}}(-1)^{\frac{N-m}{k}}2^{-\frac{(k+1)(N-m)}{k}}\\ &\qquad-\sum_{{m=k}\atop{m\equiv N(\bmod k)}}^{N} \left(\binom{m}{\frac{N-m}{k}}-\frac{1}{2^k}\binom{m-k}{\frac{N-m}{k}}\right)(-1)^{\frac{N-m}{k}}2^{-\frac{(k+1)(N-m)}{k}} \end{align*}
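As a sanity check, here is a Python sketch comparing the explicit representation with the recurrence from the question. Note that the second sum has to run up to $m=N$, so that the summand with $m=N$ (i.e. $j=0$) is included:

```python
from math import comb

def P_explicit(N, k):
    """Explicit formula: 1 minus a single sum over m = N mod k, N mod k + k,
    ..., N, with j = (N - m)/k; the 1/2**k correction applies for m >= k."""
    total = 0.0
    for m in range(N % k, N + 1, k):       # m ≡ N (mod k)
        j = (N - m) // k
        c = comb(m, j) - (comb(m - k, j) / 2**k if m >= k else 0)
        total += c * (-1)**j * 2.0**(-(k + 1) * j)
    return 1.0 - total

def P_rec(N, k):
    """The recurrence from the question, for comparison."""
    p = {n: 0.0 for n in range(k)}
    p[k] = 2.0**-k
    for n in range(k + 1, N + 1):
        p[n] = p[n - 1] + (1.0 - p[n - k - 1]) / 2**(k + 1)
    return p[N]

# both agree for a range of N and k:
assert all(abs(P_explicit(n, k) - P_rec(n, k)) < 1e-9
           for k in (1, 2, 3, 4) for n in range(25))
```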

Markus Scheuer
  • 108,315
5

Your recurrence relation is correct. I don't think you can do much better than that for general $k$, but you can find a closed form for specific values of $k$. For the first non-trivial value, $k=2$, the recurrence relation is

$$ p_n=p_{n-1}+(1-p_{n-3})/8\;. $$

Substituting the ansatz $p_n=1+\lambda^n$ yields the characteristic equation $\lambda^3-\lambda^2+1/8=0$. One solution, $\lambda=1/2$, can be guessed, and factoring then yields $(\lambda-1/2)(\lambda^2-\lambda/2-1/4)$ with the further solutions $\lambda=(1\pm\sqrt5)/4$. Thus the general solution is

$$p_n=1+c_1\left(\frac12\right)^n+c_2\left(\frac{1+\sqrt5}4\right)^n+c_3\left(\frac{1-\sqrt5}4\right)^n\;.$$

The initial conditions $p_0=0$, $p_1=0$, $p_2=1/4$ determine $c_1=0$, $c_2=-(1+3/\sqrt5)/2$ and $c_3=-(1-3/\sqrt5)/2$, so the probability is

$$ \begin{align} p_n &=1-\frac{1+3/\sqrt5}2\left(\frac{1+\sqrt5}4\right)^n-\frac{1-3/\sqrt5}2\left(\frac{1-\sqrt5}4\right)^n\\ &=1-\frac4{\sqrt5}\left(\left(\frac{1+\sqrt5}4\right)^{n+2}-\left(\frac{1-\sqrt5}4\right)^{n+2}\right)\;. \end{align} $$

Thus, for large $n$ the probability approaches $1$ geometrically with ratio $(1+\sqrt5)/4\approx0.809$.
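The closed form can be verified against the recurrence directly; a short Python sketch:

```python
from math import sqrt

def p_closed(n):
    """The closed form above for k = 2 (a head run of length >= 2)."""
    r1, r2 = (1 + sqrt(5)) / 4, (1 - sqrt(5)) / 4
    return 1 - (4 / sqrt(5)) * (r1**(n + 2) - r2**(n + 2))

def p_rec(n):
    """The recurrence p_n = p_{n-1} + (1 - p_{n-3})/8,
    with p_0 = p_1 = 0 and p_2 = 1/4."""
    p = [0.0, 0.0, 0.25]
    for i in range(3, n + 1):
        p.append(p[i - 1] + (1 - p[i - 3]) / 8)
    return p[n]

# closed form and recurrence agree (up to floating-point error):
assert all(abs(p_closed(n) - p_rec(n)) < 1e-9 for n in range(30))
```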

joriki
  • 238,052
  • 2
    I've added an explicit formula for the $P(N,k)$ which might be of interest to you. Note, that your $p_n=1-\frac{1}{2^n}F_{n+2}$ with $F_n$ the Fibonacci numbers. Regards, – Markus Scheuer Jan 18 '16 at 21:40
3

Feller considers this problem in section XIII.7 of An Introduction to Probability Theory and Its Applications, Volume 1, Third Edition. He shows that the probability of having no head run of length $k$ in $N$ throws is asymptotic to $$\frac{1-(1/2)\;x}{(k + 1 - kx)\; (1/2)} \cdot \frac{1}{x^{N+1}}$$ where $x$ is the least positive root of $$1 - x +(1/2)^{k+1} x^{k+1} = 0$$

(I have changed Feller's notation to agree with the problem statement and have only considered the case of a fair coin; Feller considers the more general case of a biased coin. For more information see equation 7.11, p. 325 in the referenced document.)
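Feller's approximation is easy to evaluate numerically. A Python sketch (the root is found by bisection on $(1,2)$, since $x=2$ is always a root of the polynomial and the least positive root lies below it for $k\geq 2$; the exact probability used for comparison comes from a simple dynamic program over the current run length):

```python
def feller_asymptotic(N, k):
    """Feller's asymptotic for the probability of NO head run of
    length k in N tosses of a fair coin (k >= 2)."""
    f = lambda x: 1 - x + (x / 2)**(k + 1)
    lo, hi = 1.0, 2.0 - 1e-9          # f(lo) > 0, f(hi) < 0 for k >= 2
    for _ in range(200):              # bisection for the least positive root
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    x = (lo + hi) / 2
    return (1 - x / 2) / ((k + 1 - k * x) / 2) / x**(N + 1)

def no_run_exact(N, k):
    """Exact no-run probability via DP over the current head-run length."""
    probs = [1.0] + [0.0] * (k - 1)   # probs[r] = P(current run = r, no k-run yet)
    for _ in range(N):
        new = [0.0] * k
        new[0] = sum(probs) / 2       # tails resets the run
        for r in range(1, k):
            new[r] = probs[r - 1] / 2 # heads extends the run
        probs = new
    return sum(probs)

# the asymptotic is within 1% of the exact value already at N = 20:
assert abs(feller_asymptotic(20, 2) / no_run_exact(20, 2) - 1) < 0.01
assert abs(feller_asymptotic(20, 3) / no_run_exact(20, 3) - 1) < 0.01
```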

awkward
  • 14,736
2

A bit more practical focus than previous answers:

Main idea:

  1. We build a "staircase":

  2. $p$ chance to get to next step, $1-p$ chance to fall down and have to restart (or "reflip"). Except for the highest step where we always stay (whenever we manage to get there).

  3. We need $k+1$ states to "remember" where on the staircase we are while tracking up to $k$ heads in a row.

We can use this to build a matrix for a Markov chain:

We can build a block matrix:$$\frac{1}{2}\left[\begin{array}{ccc}2&1&\bf 0^T\\\bf 0&\bf0&\bf I_{k-1}\\0&1&\bf 1^T \end{array}\right]$$

For the special case $k = 5$, this gives the column-stochastic matrix:

$${\bf P} = \frac{1}{2}\left[\begin{array}{cccccc}2&1&0&0&0&0\\0&0&1&0&0&0\\0&0&0&1&0&0\\0&0&0&0&1&0\\0&0&0&0&0&1\\0&1&1&1&1&1\end{array}\right]$$

Now our answer will simply be $$[1,0,{\bf 0}]\, {\bf P}^N\, [{\bf 0},0,1]^T$$

which is a bilinear form with matrix ${\bf P}^N$.

Now, to calculate this in practice one could probably (for general $p$) benefit from a canonical transformation of $\bf P$, but that is doubtful in this case, as the matrix is already sparse: literally only sums, permutations and bit shifts are required on the vector elements.
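The staircase chain can be sketched in Python. This is my reading of the construction above, assuming the chain is applied once per toss (so the matrix power has exponent $N$) and writing $p$ for the heads probability; the function name is illustrative:

```python
def run_prob_markov(N, k, p=0.5):
    """Probability of a head run of length >= k in N tosses, via the
    'staircase' chain: state 0 = success (absorbing), state k = the
    bottom/start.  Columns of M are stochastic, so one step is v -> M v."""
    n = k + 1
    M = [[0.0] * n for _ in range(n)]
    M[0][0] = 1.0                 # success is absorbing
    for j in range(1, n):
        M[j - 1][j] = p           # heads: climb one step
        M[n - 1][j] = 1 - p       # tails: fall back to the bottom
    v = [0.0] * n
    v[n - 1] = 1.0                # start at the bottom of the staircase
    for _ in range(N):            # one matrix-vector product per toss
        v = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
    return v[0]

print(run_prob_markov(3, 2))   # 0.375, i.e. 3/8
```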

mathreadler
  • 25,824
0

One solution uses a Markov chain. Let state $i=0,1,\dots,k-1$ be the state in which we have not yet seen any run of $k$ heads and the sequence currently ends in $i$ consecutive heads. Let state $k$ be the state in which we have already had $k$ heads in a row somewhere.

Then the transition matrix is $(k+1)\times(k+1)$:

$$M_k=\begin{pmatrix} 1/2&1/2&0&0&\cdots&0&0\\ 1/2&0&1/2&0&\cdots&0&0\\ 1/2&0&0&1/2&\cdots&0&0\\ \vdots&\,&&&&&\vdots\\ 1/2&0&0&0&\cdots&1/2&0\\ 1/2&0&0&0&\cdots&0&1/2\\ 0&0&0&0&\cdots&0&1 \end{pmatrix} $$

Then the probability you want is:

$$p_{N,k}=\begin{pmatrix}1&0&0&\cdots&0&0\end{pmatrix}M_k^N\begin{pmatrix}0\\\vdots\\0\\1\end{pmatrix}$$

We can diagonalize $M_k$ to get $p_{N,k}$ in terms of the eigenvalues of $M_k,$ but in general we can't get nice closed forms for those eigenvalues or the related constants. We can compute this value exactly by "repeated squaring," taking $O(\log_2 N)$ matrix multiplications. The characteristic polynomial of $M_k$ might let us avoid a lot of matrix multiplications if $N$ is much larger than $k.$
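The repeated-squaring computation can be sketched in Python (plain nested lists; the helper names are mine):

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(M, e):
    """M**e by repeated squaring: O(log2 e) matrix multiplications."""
    n = len(M)
    R = [[float(i == j) for j in range(n)] for i in range(n)]
    while e:
        if e & 1:
            R = mat_mul(R, M)
        M = mat_mul(M, M)
        e >>= 1
    return R

def p_run(N, k):
    """p_{N,k} = e_0^T M_k^N e_k for the chain described above."""
    n = k + 1
    M = [[0.0] * n for _ in range(n)]
    for i in range(k):
        M[i][0] = 0.5        # tails: the current run resets
        M[i][i + 1] = 0.5    # heads: the run grows from i to i+1
    M[k][k] = 1.0            # a run of k heads has occurred: absorbing
    return mat_pow(M, N)[0][k]

print(p_run(3, 2))   # 0.375, i.e. 3/8
```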

$1$ is the largest eigenvalue, and the second-largest real eigenvalue is a value $\lambda_k$ which dominates the other terms, so you get something like:

$$p_{N,k}\sim 1-c_k\lambda_k^N$$ for some constant $c_k$. Here $\lambda_k$ is half of the largest real root of

$$x^k-(x^{k-1}+x^{k-2}+\cdots+x+1),$$ but it is a little less obvious what $c_k$ is.

Thomas Andrews
  • 177,126