
(I posted this on CV, but I think here would be faster. After I get the answer, I will merge the two posts somehow...)

I am curious about the distribution of the (maximum) run length in $k$ independent trials when $p(X=1)=p_1, p(X=2)=p_2, \dots, p(X=n)=p_n$.

For example, for 3 independent tosses of a fair coin,
$p(X="H")=1/2, p(X="T")=1/2$:

$p(mrl=3)=2*(1/2)^3 \mbox{ for } HHH, TTT $

$p(mrl=2)=4*(1/2)^3 \mbox{ for } HTT, THH, HHT, TTH $

$p(mrl=1)=1-p(mrl=3)-p(mrl=2)=2*(1/2)^3 \mbox{ for } HTH, THT $.
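The three-toss example can be checked by brute-force enumeration (a quick sketch in Python; the helper name `max_run_length` is mine):

```python
from itertools import product

def max_run_length(seq):
    """Length of the longest run of identical consecutive symbols."""
    best = cur = 1
    for a, b in zip(seq, seq[1:]):
        cur = cur + 1 if a == b else 1
        best = max(best, cur)
    return best

# Enumerate all 2^3 equally likely sequences of 3 fair-coin tosses.
counts = {}
for seq in product("HT", repeat=3):
    m = max_run_length(seq)
    counts[m] = counts.get(m, 0) + 1

probs = {m: c / 8 for m, c in sorted(counts.items())}
print(probs)  # {1: 0.25, 2: 0.5, 3: 0.25}
```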

But what happens for general $n$ and $k$?

My guess would be $E(mrl)\approx\log_n k$ for the uniform distribution.

A similar question:

Probability for the length of the longest run in $n$ Bernoulli trials

Given the link and the answer in it, I figured this problem was solved, because we can define $X=i$ as success and any other value as failure. So to calculate $p(mrl=k)$ we can use Bernoulli trials with different success probabilities.

But then again, I am wrong, because there could be a longer run among the FAILURES... If $X=1$ is success, the failure portion of a sequence could look like 23233432, but it could also be 2222222, a run that the Bernoulli reduction never sees!
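For small $n$ and $k$, the exact distribution of the maximal run length can be computed directly by summing over all $n^k$ outcomes, counting runs of every symbol (so a failure block like 2222222 is handled correctly). This is only a brute-force sketch, feasible for tiny cases; the function names are mine:

```python
from itertools import product
from math import prod

def max_run_length(seq):
    """Length of the longest run of identical consecutive symbols."""
    best = cur = 1
    for a, b in zip(seq, seq[1:]):
        cur = cur + 1 if a == b else 1
        best = max(best, cur)
    return best

def mrl_distribution(p, k):
    """Exact P(mrl = m) over all len(p)^k outcome sequences,
    weighting each sequence by the product of its symbol probabilities."""
    dist = {}
    for seq in product(range(len(p)), repeat=k):
        weight = prod(p[i] for i in seq)
        m = max_run_length(seq)
        dist[m] = dist.get(m, 0.0) + weight
    return dist

print(mrl_distribution([0.5, 0.3, 0.2], 4))
```

For the fair-coin case `mrl_distribution([0.5, 0.5], 3)` reproduces the probabilities worked out above.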

KH Kim

1 Answer


In the uniform case, let $L_k$ denote the length of the current run at trial $k$, that is, the number of consecutive trials up to and including trial $k$ that show the same result as trial $k$. Then $(L_k)_{k\geqslant1}$ is a Markov chain starting from $L_1=1$ whose transition probabilities are $\ell\to\ell+1$ with probability $p=\frac1n$ and $\ell\to1$ with probability $q=1-p$, for every integer $\ell\geqslant1$.

The maximum run length $M_k$ after $k$ trials satisfies $$[M_k\geqslant m]=[T_m\leqslant k],$$ for every $m\geqslant1$, where $$T_m=\inf\{i\geqslant1\mid L_i=m\}.$$

The distribution of the hitting times $T_m$ is well known, since the usual one-step decomposition of generating functions yields, for every $|s|\leqslant1$, $$E(s^{T_m})=\frac{s(ps)^{m-1}(1-ps)}{1-s+qs(ps)^{m-1}}.$$ This, in turn, is enough to show that $qp^{m-1}T_m$ converges in distribution to the standard exponential distribution. This convergence and the duality identity recalled above show that $M_k/\log_n(k)\to1$ in probability, and this last convergence can be refined to show that $E(M_k)/\log_n(k)\to1$.

In the non-uniform case, $E(M_k)/\log k\to1/\log\varrho$, where $1/\varrho=\max\{p_i\mid 1\leqslant i\leqslant n\}$. The idea is that, with overwhelming probability, the longest run is a run of some state $\nu$ such that $p_\nu=1/\varrho$, hence its length is similar to the length of the longest run in the uniform case with $\varrho$ states.
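The non-uniform limit $E(M_k)/\log k\to1/\log\varrho$ can likewise be sanity-checked by simulation (again a rough sketch with $O(1)$ corrections; here $p_{\max}=0.5$, so $\varrho=2$):

```python
import math
import random

def sample_max_run(p, k, rng):
    """Longest run in k i.i.d. trials with symbol probabilities p."""
    seq = rng.choices(range(len(p)), weights=p, k=k)
    best = cur = 1
    for a, b in zip(seq, seq[1:]):
        cur = cur + 1 if a == b else 1
        best = max(best, cur)
    return best

rng = random.Random(1)
p, k, reps = [0.5, 0.3, 0.2], 10_000, 500
rho = 1 / max(p)  # rho = 2: the longest run is typically a run of the most likely state
est = sum(sample_max_run(p, k, rng) for _ in range(reps)) / reps
print(est * math.log(rho) / math.log(k))  # should be near 1 for large k
```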

Edit: To find $E(s^{T_m})$ in the uniform case, consider $u_i$, the generating function of the hitting time of $m$ by the Markov chain starting at $i$. Thus $u_i=s(pu_{i+1}+qu_1)$ for every $1\leqslant i\leqslant m-1$ and $u_m=1$. Solving this linear (Cramer) system yields the value of $E(s^{T_m})=su_1$ given above.
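The closed form for $E(s^{T_m})$ can be cross-checked numerically by evolving the chain on the transient states $1,\dots,m-1$ and summing $\sum_t s^t\,P(T_m=t)$ directly (a verification sketch, not part of the proof; function names are mine):

```python
def gf_closed_form(s, p, m):
    """E(s^{T_m}) from the formula in the answer."""
    q = 1 - p
    return s * (p * s) ** (m - 1) * (1 - p * s) / (1 - s + q * s * (p * s) ** (m - 1))

def gf_numeric(s, p, m, tmax=2000):
    """Sum s^t P(T_m = t) by forward evolution of the run-length chain."""
    q = 1 - p
    if m == 1:
        return s  # L_1 = 1, so T_1 = 1
    dist = [0.0] * m  # dist[l] = P(L_t = l, chain not yet absorbed at m)
    dist[1] = 1.0     # the chain starts from L_1 = 1
    total = 0.0
    for t in range(2, tmax + 1):
        absorbed = p * dist[m - 1]      # step m-1 -> m hits the target at time t
        new = [0.0] * m
        for l in range(1, m - 1):
            new[l + 1] = p * dist[l]    # the run extends: l -> l+1
        new[1] = q * sum(dist[1:])      # the run breaks: l -> 1
        total += s ** t * absorbed
        dist = new
    return total

print(abs(gf_closed_form(0.9, 0.5, 3) - gf_numeric(0.9, 0.5, 3)))  # ~0
```

For $m=2$ the formula reduces to $ps^2/(1-qs)$, matching the direct computation $P(T_2=t)=q^{t-2}p$ for $t\geqslant2$.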

Did