Probability of typing a number such as $119$

Question

Consider a user typing numbers on a calculator (or $10$-digit keypad). Since the user can enter a digit string of any length, a given target should eventually appear among the consecutive digits pressed. I'd like to know how to quantify the chances.

Suppose for definiteness that the user makes 5000 random key presses on the digit buttons ($0,1,2,3,4,5,6,7,8,9$) of the calculator. What is the chance that the user presses $1,1,9$ consecutively, anywhere in that stream of digits?

could you add more details? as it is now try all possible numbers divided by 1 — Willemien, Oct 24 '15 at 09:30
The latest version of the problem can be solved by a Markov chain approach, in which random digits are pressed, one after the other until 5000 presses are completed. — hardmath, Oct 24 '15 at 13:18
@hardmath I didn't understand. How will I do that with Markov chain, say, for 100 key presses? — saqib shafin, Oct 24 '15 at 14:54
Your problem is of a type often treated with finite state probability transition matrices (aka Markov chains). Related: Probability of a substring occurring in a string. Your problem is somewhat simpler than that one. — hardmath, Oct 24 '15 at 17:22

hardmath · Accepted Answer · 2015-10-28T17:48:47.710

To compute the probability that the consecutive digits 119 appear at least once anywhere in a sequence of 5000 random digits (each press independent and uniform over the ten digits), we formulate a Markov chain with one absorbing state, namely having attained 119.

With that in mind it should be fairly clear the transient states ought to be based on how many digits we have correct "so far":

State 0: We have no preceding correct digits.

State 1: We have exactly one preceding correct digit, i.e. the last digit was 1 but the digit before it was not.

State 2: We have two preceding correct digits, i.e. the last two digits were both 1 (any 1s before them do not change the state).

State 3: We have reached the absorbing state, i.e. 119 has already appeared.

Their respective transition probabilities are easily described:

Chance of going from state 0 to state 1 is $0.1$ (pressing 1), and otherwise we stay in state 0 with probability $0.9$.

Chance of going from state 1 to state 2 is $0.1$ (pressing another 1), and otherwise we go back to state 0 with probability $0.9$.

Chance of going from state 2 to state 3 is $0.1$ (pressing 9), of staying in state 2 is also $0.1$ (pressing yet another 1, so the last two digits are still both 1), and otherwise we go back to state 0 with probability $0.8$.

Finally, state 3 is "absorbing", meaning that once it is reached, we stay in state 3 with probability $1.0$.

Conventionally we assemble these transition probabilities into a $4\times 4$ matrix:

$$ M = \begin{bmatrix} 0.9 & 0.1 & 0 & 0 \\ 0.9 & 0 & 0.1 & 0 \\ 0.8 & 0 & 0.1 & 0.1 \\ 0 & 0 & 0 & 1.0 \end{bmatrix} $$

so that if the "current" vector of state probabilities is a row $v = [p_0, p_1, p_2, p_3]$, then the "next" state probabilities are $v M$.
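The state descriptions and transition probabilities above can also be generated mechanically. The following is a minimal sketch, not part of the original answer: the function name `transition_matrix` is hypothetical, and it assumes every digit is equally likely. For each state (number of pattern digits matched so far) and each pressed digit, it finds the longest suffix of the text that is still a prefix of the pattern.

```python
from fractions import Fraction

def transition_matrix(pattern, digits="0123456789"):
    """Return the (m+1) x (m+1) transition matrix for matching `pattern`.

    State k means "the last k digits pressed equal the first k digits of
    the pattern"; state m = len(pattern) is absorbing.
    Assumption: each digit is pressed with equal probability.
    """
    m = len(pattern)
    p = Fraction(1, len(digits))
    M = [[Fraction(0)] * (m + 1) for _ in range(m + 1)]
    M[m][m] = Fraction(1)  # once the pattern has appeared, we stay there
    for state in range(m):
        for d in digits:
            text = pattern[:state] + d
            # Longest k such that the last k characters of `text` are a
            # prefix of the pattern (k = 0 always qualifies).
            nxt = next(k for k in range(len(text), -1, -1)
                       if text.endswith(pattern[:k]))
            M[state][nxt] += p
    return M

for row in transition_matrix("119"):
    print([str(x) for x in row])
```

For the pattern 119 this reproduces the rows of $M$ as exact fractions (with $0.8$ shown as $4/5$), and each row sums to $1$.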

At the outset, before any digits are "pressed", we are in state 0 with probability $1.0$. So let $v_0 = [1,0,0,0]$. After randomly pressing $N$ digits, the probability distribution among those four states is given by $v_N = v_0 M^N$.
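As a rough illustration of $v_N = v_0 M^N$, the sketch below simply applies $v \mapsto vM$ repeatedly in floating point. The helper names `step` and `distribution_after` are my own and not from the original answer.

```python
# Transition matrix M from the answer (rows = current state, columns = next state).
M = [
    [0.9, 0.1, 0.0, 0.0],
    [0.9, 0.0, 0.1, 0.0],
    [0.8, 0.0, 0.1, 0.1],
    [0.0, 0.0, 0.0, 1.0],
]

def step(v, M):
    """One key press: return the row vector v M."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

def distribution_after(n, M):
    """State probabilities after n presses, starting in state 0."""
    v = [1.0, 0.0, 0.0, 0.0]
    for _ in range(n):
        v = step(v, M)
    return v

v3 = distribution_after(3, M)
print(v3)  # the last entry is the chance that the first three presses are 1, 1, 9
```

After three presses the absorbing state can only have been reached by pressing exactly 1, 1, 9, so the last entry is $0.1^3 = 0.001$ up to rounding, which makes a convenient sanity check.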

It isn't all that difficult to compute $v_{100}$ or $v_{5000}$ by using binary exponentiation. If exact arithmetic were needed, the matrix $M$ could be scaled to integer entries, computing $v_N$ with extended integer precision.
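The binary exponentiation and integer scaling mentioned here might look like the following sketch. It assumes we scale $M$ by $10$ to get an integer matrix `A`, so that $M^N = A^N / 10^N$; the names `A`, `mat_mul`, `mat_pow` and `hit_probability` are illustrative only.

```python
from fractions import Fraction

# A = 10 * M has integer entries, so A**n can be computed exactly
# with Python's arbitrary-precision integers.
A = [
    [9, 1, 0, 0],
    [9, 0, 1, 0],
    [8, 0, 1, 1],
    [0, 0, 0, 10],
]

def mat_mul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(X, n):
    """X**n by binary exponentiation (square and multiply)."""
    size = len(X)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    while n > 0:
        if n & 1:
            result = mat_mul(result, X)
        X = mat_mul(X, X)
        n >>= 1
    return result

def hit_probability(n):
    """Exact probability that 119 appears within the first n presses."""
    # Entry (0, 3) of A**n counts the length-n digit strings that reach state 3.
    return Fraction(mat_pow(A, n)[0][3], 10 ** n)

print(hit_probability(3))            # exact fraction
print(float(hit_probability(100)))
print(float(hit_probability(5000)))
```

Binary exponentiation needs only about $\log_2 N$ squarings (13 for $N = 5000$), and the exact fractions are converted to floating point only for display.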

Also, we could omit the explicit computation of the fourth entry $p_3$. It can be recovered implicitly from:

$$ p_3 = 1 - (p_0 + p_1 + p_2) $$

at any step of the process. Note that the fourth entry of $v_N$ is the cumulative probability of matching 119 by the $N$th step.

Example N=100

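Let us truncate $M$ to its $3\times 3$ leading principal submatrix (the rows and columns of the transient states 0, 1, 2), as suggested above; the powers written below are powers of this truncated matrix.

By exponentiating through an addition chain, using bluebit's Online Matrix Multiplication with $15$-digit precision, I get: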

$$ M^{100} = \begin{bmatrix} 0.815694920176088 & 0.081651307201444 & 0.009082479364799 \\ 0.807521599731399 & 0.080833155363081 & 0.008991472283052 \\ 0.725870292529952 & 0.072659834918392 & 0.008082313362943 \end{bmatrix} $$

Multiplying $[1\; 0\; 0]\; M^{100}$ gives the top row of the above, and thus the probability of "119" occurring in the first one hundred digits is:

$$ p_3 = 1 - (p_0 + p_1 + p_2) = 1 - 0.906428706742331 = 0.093571293257669 $$
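The truncated computation can be cross-checked without an online tool. This is a small sketch in ordinary double precision, using plain repeated multiplication rather than an addition chain; `Q` and `miss_probability` are hypothetical names for the truncated matrix and the sum $p_0 + p_1 + p_2$.

```python
# Q is the 3x3 block of M for the transient states 0, 1, 2.
Q = [
    [0.9, 0.1, 0.0],
    [0.9, 0.0, 0.1],
    [0.8, 0.0, 0.1],
]

def miss_probability(n):
    """p_0 + p_1 + p_2 after n presses, i.e. the chance 119 has NOT appeared."""
    v = [1.0, 0.0, 0.0]  # start in state 0
    for _ in range(n):
        v = [sum(v[i] * Q[i][j] for i in range(3)) for j in range(3)]
    return sum(v)

p3_100 = 1.0 - miss_probability(100)
print(p3_100)
```

Up to floating-point rounding, the printed value should agree with the $p_3$ figure above; calling `miss_probability(5000)` checks the larger example the same way.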

Example N=5000

Continuing exponentiation from the previous addition chain, I similarly get:

$$ M^{5000} = \begin{bmatrix} 0.005999910042978 & 0.000600592802508 & 0.000066806912496 \\ 0.005939790522546 & 0.000594574820405 & 0.000066137502537 \\ 0.005339197720037 & 0.000534455299972 & 0.000059450110473 \end{bmatrix} $$

Extracting the top row $[1\; 0\; 0]\; M^{5000}$ from the above, the probability of "119" occurring in the first five thousand digits is:

$$ p_3 = 1 - (p_0 + p_1 + p_2) = 1 - 0.006667309757982 = 0.993332690242018 $$

In other words, with so many additional key press repetitions we've gone from less than a $10\%$ chance of success to more than a $99\%$ chance.
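As an independent sanity check on the Markov chain figures, one could also simulate random key presses directly. The sketch below is a Monte Carlo estimate under the same assumption of independent, uniformly random digits; the function names, the trial count and the seed are arbitrary choices of mine, and the estimate only approximates the exact value.

```python
import random

def contains_119(n, rng):
    """Press n random digits; return True if 1, 1, 9 occurs consecutively."""
    state = 0  # number of trailing 1s that count toward the pattern (0, 1 or 2)
    for _ in range(n):
        d = rng.randrange(10)
        if state == 2 and d == 9:
            return True
        if d == 1:
            state = min(state + 1, 2)
        else:
            state = 0
    return False

def estimate(n, trials, seed=0):
    """Fraction of simulated runs of n presses that contain 119."""
    rng = random.Random(seed)
    hits = sum(contains_119(n, rng) for _ in range(trials))
    return hits / trials

print(estimate(100, 20000))   # should land near 0.094
print(estimate(5000, 2000))   # should land near 0.993
```

With 20000 trials the standard error for $N = 100$ is roughly $0.002$, so the estimate should sit within a few thousandths of the exact probability.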
