The question concerns the probability $P$ of getting a lead of at least $10$ at any point during a game that lasts only $300$ flips.

I tried to apply the formula given in an answer to a similar question here

$P\left(X_n \ge \frac{n+10}{2}\right)$

but that works out to summing $300 + 10$ and dividing by $2$, which gives $155$.

Following that previous answer, it would seem that the $P$ of a lead of $10$ or greater at some point in a $300$-flip game would be only $.155$, and that does not fit the results I get when I simply use a simulator to repeatedly sample games of $300$ flips and count the times that a lead of at least $10$ appears.

Is there something wrong in the way I am applying this formula? Or is it the wrong formula for the question I am asking?
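
To make the arithmetic concrete: with $n=300$, the quantity $(n+10)/2 = 155$ is a threshold on the number of Heads, not a probability, and the formula asks for the binomial tail beyond that threshold. A minimal Python sketch follows (the helper name `tail_prob` is made up for illustration). Note that even this tail is only the probability that Heads leads by at least $10$ at the end of the game, not at some point during it.

```python
from fractions import Fraction
from math import comb

def tail_prob(n: int, k: int) -> Fraction:
    """Exact P(X_n >= k) for X_n ~ Binomial(n, 1/2)."""
    return Fraction(sum(comb(n, j) for j in range(k, n + 1)), 2 ** n)

n, lead = 300, 10
threshold = (n + lead) // 2       # 155 Heads: a count, not a probability
p_end = tail_prob(n, threshold)   # P(Heads is ahead by >= 10 after flip 300)
print(threshold, float(p_end))
```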

Pseudoego
  • Don't think I have time to do this out in full but the key is to consider this as a first passage time problem. The probability that the random walk does not go above 10 is the same as the probability that the first passage time to 10 is greater than 300. So if you know or can find the first passage time distribution for a simple random walk (see eg Feller) then you have an answer. – spaceisdarkgreen Jan 11 '17 at 01:31
  • Just looking at the old question, it seems to only consider the probability of getting $10$ more Heads than Tails, whereas you consider both cases. Roughly speaking you can double the answer (very rough since you are double counting the scenarios in which both $H$ and $T$ build up big leads at some point). – lulu Jan 11 '17 at 01:34
  • Mind you, I'm not sure I follow the argument given for that question...the author appears to correctly compute the probability that $H-T≥10$ after the $n^{th}$ trial but that's not enough. After all, they could be tied after $300$ but $H$ could have gone up by a lot at some point. Of course, I am just reading it quickly and might be missing the point. – lulu Jan 11 '17 at 01:38
  • The formula (from the linked post) that you are using is for the probability of having a $\ge 10$ lead after the $n$th flip, not at any point during the game. – angryavian Jan 11 '17 at 01:38
  • Yes, I don't care when the Lead of 10 occurs, could be at any point, and I don't think the original question posited any specific Team...it referred to any team achieving a lead of 10 games at some point in the season. – Pseudoego Jan 11 '17 at 01:41
  • It was reading Feller that prompted me to ask this kind of question, but he is way over my head with his formulae and I am intellectually challenged as far as anything beyond simple Probability Arithmetic goes....which is why I thought the original Post was something within my reach. – Pseudoego Jan 11 '17 at 01:46
  • Antoni Parellada: I am simply simulating 300 flips at a time, and counting how often either Heads or Tails gets ahead by at least 10. It is well over 50% of the time, based on a limited and tiresome to collect sample. I am using this page: http://syzygy.virtualave.net/multicointoss.htm – Pseudoego Jan 11 '17 at 02:09
  • Also, you are reading the old question wrong. You need to use the normal distribution to get the probability that the total number of Heads at the end exceeds $155$, that gives $0.281851431$. – lulu Jan 11 '17 at 02:14
  • Worth noting; the reflection principle tells us that the probability that $H's$ lead gets to $10$ at some point during the trial is twice the probability that it ends up over $10$...hence that is $2\times 0.281851431=0.563702862$. Accordingly, I think your answer should be quite high. – lulu Jan 11 '17 at 02:18
  • @AntoniParellada Your answer looks too low. Are you getting a probability of $.02$? so practically $0$? But the probability that $H-T>10$ at the end of $300$ trials is the probability that you have at least $155$ Heads and the normal approximation already gives that at around $.28$...clearly the answer here is a lot higher than that. – lulu Jan 11 '17 at 02:22
  • lulu- That is what I am seeing by simply repeatedly sampling 300 coin tosses: well over 50% of the time one side or the other leads by 10 or more at some point before the end. – Pseudoego Jan 11 '17 at 02:22
  • Ok, but my simulation is coming out to nearly $1$. I mean, I just about always get one lead or the other by at least $10$. Mind you, I just threw the code together...bugs are more than likely. – lulu Jan 11 '17 at 02:23
  • @lulu - to be honest, I am seeing at least 90% of the time a lead of 10 or more...but I did not want to say that for fear of being laughed at. :-) – Pseudoego Jan 11 '17 at 02:25
  • No fears...I think $>.9$ is correct. – lulu Jan 11 '17 at 02:25
  • @AntoniParellada So now your answer is $1$? That's a lot closer to what I believe. – lulu Jan 11 '17 at 02:26
  • Worth remarking: the reflection principle works wonderfully if you have one barrier...like I say, it gives $.56$ if you just want $H$ to get a lead. Two barriers is more painful. – lulu Jan 11 '17 at 02:27
  • @AntoniParellada I still don't have a proof though. Options theory has ways of dealing with two barriers...but it's painful. Back of the envelope says it is $2\times .56 - P(both)$. And I think the probability that both barriers get touched is not that high...hence my confidence that the answer is fairly near $1$ (supported by simulation). – lulu Jan 11 '17 at 02:30
  • @Antoni Parellada - Yeah, funny how hindsight is often 20/20! – Pseudoego Jan 11 '17 at 02:31
  • more back of envelope work...the probability that $H's$ lead exceeds $20$ at the end is about $.12$ so the probability that $H's$ lead gets to $20$ at some point is about $.24$. Therefore given that $H$ gets a lead of $10$, the probability that $T$ can claw back to a lead of $10$ is considerably less than $.24$ (because it will have a lot less time to do it). Thus our answer here is greater than $2\times .56 - .24 = .88$. – lulu Jan 11 '17 at 02:38
  • @Antoni : the next question becomes how long the game must continue in order for there to be P > 90% of seeing a Lead of at least 10? I am guesstimating 200 flips. – Pseudoego Jan 11 '17 at 02:41
  • @lulu : I am not clear that requiring the other side to "claw back" after the other side has reached the barrier is the same Question/Problem. The idea is simply that either side will at any point reach 10. Not both, unless I am misreading you. – Pseudoego Jan 11 '17 at 02:51
  • my point: I am trying to do a rough estimate of the probability. I want to argue that the probability that you get both a Heads lead of $10$ and a Tails lead of $10$ is not too big. To support that claim, I point out that to get first the H lead and then the T lead requires Tails to claw back $20$ and that's not easy. It's a handwaving argument, for sure. But I am trying to avoid the ugly integrals. – lulu Jan 11 '17 at 02:55
  • @lulu thanks for all of your help! – Pseudoego Jan 11 '17 at 04:58
  • @Antoni Parellada - Thank you for persevering with your simulations. – Pseudoego Jan 11 '17 at 04:59
  • So, what happened to @Antoni Parellada's Extended Comment and results of his simulation? Was it deleted by him or someone else? There was some confusion about what a "Lead" is. Please inspect this screenshot to see a simulation which shows a Lead of 10 Heads after 30 Coin Flips. http://prntscr.com/du509r Spacedarkgreen suggested that the distribution of trials/flips/votes until the first passage of 10 is what we are talking about, and i tend to agree, but I lack the information about how to calculate that. – Pseudoego Jan 11 '17 at 07:09
  • @spaceisdarkgreen if you have the time, could you elaborate more on this solution? If we let $T = \inf\{n \geq 0 : X_n = 10\}$ where $\{X_n\}$ is a random walk, we restate the problem as finding the probability that $T \leq 300$. How can we find the distribution of this stopping time? – Daniel Xiang Jan 11 '17 at 07:21
  • @lulu Aside from careless errors in the code, my take on the question included consecutive runs. This is not what was intended in the OP. In any event, I think this could be a simulation: set.seed(0); a = replicate(10^5, sample(c(-1, 1), 300, replace = T)); momentary_score = apply(a, 2, function(x) cumsum(x)); mean(apply(momentary_score, 2, function(x) sum(abs(x) > 9) > 0)) [1] 0.97006. That is, about $97\%$. – Antoni Parellada Jan 11 '17 at 07:41
  • @DanielXiang Maybe tomorrow. I'm surprised it hasn't been solved given the length of this comment thread. I'd add that I thought it was the probability of heads getting the lead (not heads or tails) so I interpreted it as the standard first passage problem that has a widely-known solution. Two barriers I wouldn't be as sure where to look up. – spaceisdarkgreen Jan 11 '17 at 08:05
  • @spaceisdarkgreen If you look at it as a variant of the stereotypical Polling Question, which Feller suggests, surely it does not matter which of Two candidates gets the lead....the question is how often will one of them jump out to a 10 vote lead - at any point - during the first 300 votes, assuming support is split roughly 50-50. Any clarity you could bring about how to calculate this kind of thing is appreciated...especially if it can be extrapolated to varying Ns of Samples and heights of barriers. – Pseudoego Jan 11 '17 at 10:14
  • @Pseudoego I thought the question was the probability of one of them getting the lead, not how often. At least that's what it says at the top. – spaceisdarkgreen Jan 12 '17 at 04:50
  • @Pseudoego But originally I even misread the top and thought you meant the probability of heads getting the lead. For that I suggested using the first passage time distribution for a simple RW to 10. The probability of heads not ever getting the lead in 300 flips is the same as the probability that the first passage time to 10 is more than 300. These lecture notes derive it from the reflection principle at the top of page 7: http://galton.uchicago.edu/~lalley/Courses/312/RW.pdf – spaceisdarkgreen Jan 12 '17 at 04:55
  • @Pseudoego They get $P(\tau(m)>n) = 1 - P(S_n=m)-2P(S_n>m)$ where $m$ is 10 and $n$ is 300. The two things can be computed from the binomial distribution. However, even though there's symmetry this won't give you the probability of either heads or tails getting the lead by 10. For that you'd also need to know the probability of both of them getting the lead at some point. – spaceisdarkgreen Jan 12 '17 at 04:59
  • @Pseudoego Do you mean that the 10-point-lead, once gained at any point, should be maintained until the end? – massimo Jan 12 '17 at 06:42
  • @massimo - No, I stated several times that it does not matter who gets the lead, when it is obtained, or whether it is maintained. I simply wish to know how to find out the Probability that one side or the other will get 10 wins ahead of the other at any point in the game/vote on at least ONE occasion. So, yeah, people are correct who say that the answer can be inferred by calculating the Probability that a Random Walk never reaches 10 during 300 flips, and then subtracting that from P=1.0 – Pseudoego Jan 12 '17 at 07:02
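
The one-barrier formula quoted in the comments, $P(\tau(m)>n) = 1 - P(S_n=m) - 2P(S_n>m)$, can be evaluated exactly from the binomial distribution. Below is a minimal Python sketch under that formula; the function names are made up for illustration. The two-sided lower bound at the end is only a crude estimate in the spirit of the "claw back" argument: it assumes $P(\text{both}) \le 2\,P(\text{one side reaches } 10)\,P(\text{one side reaches } 20)$, which follows from restarting the walk at the first barrier it touches.

```python
from fractions import Fraction
from math import comb

def walk_pmf(n: int, s: int) -> Fraction:
    """Exact P(S_n = s) for a simple symmetric random walk started at 0."""
    if (n + s) % 2 or abs(s) > n:
        return Fraction(0)
    return Fraction(comb(n, (n + s) // 2), 2 ** n)

def hit_one_sided(n: int, m: int) -> Fraction:
    """P(walk reaches +m within n steps) = P(S_n = m) + 2 P(S_n > m)."""
    above = sum(walk_pmf(n, s) for s in range(m + 1, n + 1))
    return walk_pmf(n, m) + 2 * above

n, m = 300, 10
p_one = hit_one_sided(n, m)        # Heads alone gets a lead of 10 at some point
p_swing = hit_one_sided(n, 2 * m)  # a swing of 20 in one direction within n flips

# P(either) = 2 * p_one - P(both); bounding P(both) by 2 * p_one * p_swing
# gives a crude lower bound, not the exact two-barrier probability.
lower_bound = 2 * p_one * (1 - p_swing)
print(float(p_one), float(lower_bound))
```

The exact two-barrier value is not produced by this sketch; it only shows that the one-sided probability is a little over one half and that the two-sided probability must be well above that.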

1 Answer

With this problem it is already a challenge to provide a numerical answer that can be used to check the results from probabilistic methods. This can in fact be done, and I will show how. Suppose we have $n$ flips and we are looking to count outcomes where a lead of at least $q$ was obtained at some point. The idea is to use a Markov chain with states $T$ and $A_p$ where $-(q-1)\le p\le q-1.$ The state $A_p$ represents a current lead of $p$ (with no lead of $q$ seen so far), with the obvious transition rules that this implies: each flip moves the chain from $A_p$ to $A_{p-1}$ or $A_{p+1}$. Finally $T$ is an absorbing state where the chain remains once a lead of $q$ has been seen. We solve the resulting system of equations for the generating functions and obtain $T.$ We get for the present problem which has $q=10$

$$\bbox[5px,border:2px solid #00A000]{ T(z) = {\frac {2{z}^{10}}{ \left( 2\,{z}^{10}-25\,{z}^{8} +50\,{z}^{6}-35\,{z}^{4}+10\,{z}^{2}-1 \right) \left( 2\,z-1 \right) }}.}$$

It remains to compute

$$\frac{[z^{300}] T(z)}{2^{300}}.$$

Extracting the coefficient we get ${ 1.9744763278096917789\times 10^{90}}$

which yields for the probability of having seen a lead of at least ten at some point during $300$ flips the value

$$\bbox[5px,border:2px solid #00A000]{ 0.96928888382356097067.}$$
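
The coefficient $[z^{300}]T(z)$ counts the flip sequences of length $300$ in which a lead of ten occurs, out of $2^{300}\approx 2.04\times 10^{90}$ equally likely sequences, so the probability is the coefficient divided by $2^{300}$. As a cross-check that does not need Maple, the coefficient can be extracted by ordinary power-series division with exact integers. The following is a minimal Python sketch, assuming the closed form of $T(z)$ shown above; the helper names are made up for illustration.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def series_coeffs(num, den, n):
    """Coefficients t_0..t_n of num(z)/den(z); requires den[0] == 1."""
    assert den[0] == 1
    t = []
    for k in range(n + 1):
        acc = num[k] if k < len(num) else 0
        for j in range(1, min(k, len(den) - 1) + 1):
            acc -= den[j] * t[k - j]
        t.append(acc)
    return t

# Denominator factors of T(z) for q = 10, in ascending powers of z.
d1 = [-1, 0, 10, 0, -35, 0, 50, 0, -25, 0, 2]
d2 = [-1, 2]
num = [0] * 10 + [2]                    # numerator 2 z^10

t = series_coeffs(num, poly_mul(d1, d2), 300)
prob = Fraction(t[300], 2 ** 300)       # favourable sequences / all sequences
print(t[300])
print(float(prob))
```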

Observe that we used the Maple series command to extract the coefficient. This can be replaced if desired by converting $T(z)$ numerically into a partial fraction decomposition and computing the coefficients from a geometric series (I have tested this).

The Maple code for this, including an enumeration routine to check the result from the Markov chain, is as follows. We can of course solve for $T$ manually, but here it has retained the format from the system of equations.

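```maple
# X(q): generating function T(z) of the absorbing state for barrier q.
X :=
proc(q)
    option remember;
    local sys, pos, sol, eq;

    # Boundary states can only be entered from their single inner neighbour.
    sys := [A[-(q-1)] = z * A[-(q-2)],
            A[q-1] = z * A[q-2]];

    # Interior states; the -1 at pos=0 encodes the start at lead zero.
    for pos from -(q-2) to q-2 do
        sys :=
        [op(sys),
         A[pos] + `if`(pos=0, -1, 0) =
         z * A[pos-1] + z * A[pos+1]];
    od;

    sol := solve(sys, [seq(A[p], p=-(q-1)..q-1)]);

    # T is absorbing (both flips stay in T) and is entered from A[q-1] or A[-(q-1)].
    eq := T = 2*z*T +
    z * subs(op(1, sol), A[-(q-1)] + A[q-1]);
    solve(eq, T);
end;

# Q(n, q): number of sequences of n flips in which a lead of q occurs.
Q := (n, q) ->
coeff(series(X(q), z=0, n+1), z, n);

# ENUM(n, q): the same count by brute-force enumeration of all 2^n sequences.
ENUM :=
proc(n, q)
    option remember;
    local ind, res, lead, d, pos;

    res := 0;
    for ind from 2^n to 2^(n+1)-1 do
        # The low n binary digits of ind encode one sequence of flips.
        d := convert(ind, base, 2);

        lead := 0;

        for pos to n do
            if d[pos] = 1 then
                lead := lead + 1;
            else
                lead := lead - 1;
            fi;

            if lead = q or lead = -q then
                break;
            fi;
        od;

        # pos = n+1 only if the loop ran to completion without a break.
        if pos < n+1 then
            res := res + 1;
        fi;
    od;

    res;
end;
```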

This method is computationally intensive. We hope to see the numerics verified by a future post.
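
One way to check the numerics without symbolic algebra is to run the Markov chain forward step by step, keeping for each lead $p$ with $|p|<q$ the number of flip prefixes that have not yet reached a lead of $q$. The Python sketch below does this with exact integers and compares against brute-force enumeration for small $n$; it is an independent check written for illustration (function names are made up), not a translation of the Maple code.

```python
from fractions import Fraction
from itertools import product

def count_hits(n: int, q: int) -> int:
    """Number of +-1 sequences of length n whose running lead reaches +q or -q."""
    size = 2 * q - 1
    alive = [0] * size          # alive[p + q - 1]: prefixes at lead p, barrier not yet hit
    alive[q - 1] = 1            # start at lead 0
    for _ in range(n):
        nxt = [0] * size
        for i, c in enumerate(alive):
            if c:
                if i > 0:
                    nxt[i - 1] += c
                if i + 1 < size:
                    nxt[i + 1] += c
        alive = nxt             # prefixes stepping onto +-q are dropped (absorbed)
    return 2 ** n - sum(alive)

def count_hits_brute(n: int, q: int) -> int:
    """Same count by enumerating all 2^n sequences; only feasible for small n."""
    total = 0
    for flips in product((1, -1), repeat=n):
        lead = 0
        for f in flips:
            lead += f
            if abs(lead) == q:
                total += 1
                break
    return total

small_ok = all(count_hits(n, q) == count_hits_brute(n, q)
               for q in (1, 2, 3) for n in range(0, 11))
prob = Fraction(count_hits(300, 10), 2 ** 300)
print(small_ok, float(prob))
```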

What we have here is closely related to the DFA method.

There is, among others, this entry at the OEIS, OEIS A216212.

Marko Riedel
  • Thanks for the answer, but I am not understanding how we get from multiplying 1.9744763278096917789 × (10 to the 90th power) to a Probability of 0.96928888382356097067. Shouldn't the product of that multiplication be a huge number? – Pseudoego Jan 12 '17 at 06:24
  • +1 The final result coincides with the simulation set.seed(0); a = replicate(10^5, sample(c(-1, 1), 300, replace = T)); momentary_score = apply(a, 2, function(x) cumsum(x)); mean(apply(momentary_score, 2, function(x) sum(abs(x) > 9) > 0)) [1] 0.97006. – Antoni Parellada Jan 12 '17 at 17:59
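
For readers without R, here is a rough Python counterpart of the simulation quoted in the comments, using only the standard library. The seed and the number of trials are arbitrary choices for illustration, so the estimate will differ slightly from the R figure and should only be expected to land near the exact value from the answer.

```python
import random

def simulate(n_flips: int = 300, lead: int = 10,
             trials: int = 5000, seed: int = 0) -> float:
    """Estimate P(some lead of at least `lead` occurs within `n_flips` fair flips)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        score = 0
        for _ in range(n_flips):
            score += 1 if rng.random() < 0.5 else -1
            if abs(score) >= lead:   # either side is ahead by `lead`
                hits += 1
                break
    return hits / trials

estimate = simulate()
print(estimate)
```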