3

What is the probability that a 75% free throw shooter, given the assumptions listed below, can make at least $5$ in a row of $10$ shots? So in effect he must make $5$, $6$, $7$, $8$, $9$, or all $10$ in a row. He only gets $10$ shots total.

The main difference between this question and others I've seen on this site is that mine asks for "in a row", which changes the math needed to solve it. I can solve this with placeholders and a divide and conquer method, but I was told a recurrence relation can be used, so I would like to see that solution method please.

Some assumptions:

A) The 75% free throw (f.t.) percentage remains constant during the 10 shots. So things like fatigue, distractions... are not of any concern for this question. The shooter is a professional and the 75% is his average over a long period of time as a professional so it can be considered a very accurate prediction of his future f.t. performance.

B) The shooter makes an honest attempt to make as many shots as he can (in other words, he doesn't intentionally miss any shots).

C) When I say at least $5$ in a row, I am talking about the longest streak only of the made shots so for example, using $M$ for make and $x$ for miss, this is NOT considered $5$ in a row $MMMxxxMMxx$ because the longest streak of makes is only $3$ in a row. However, $xxMMMMMxxx$ is $5$ in a row. Note that $MMMMxMMMMx$, even though it is $8$ out of $10$ made shots, is only considered $4$ in a row for this question (kinda like in bowling, $2$ strikes in a row, then an open frame, then a single strike is not considered a "turkey" ($3$ strikes in a row)).

D) Note that when I ask for the "longest" streak of makes, it could actually be the only streak (such as all $10$ in a row) or it could be the longer streak (if there are only $2$ streaks of makes), so I wanted to clarify that. Also, cases such as $MMxMMxMMxx$ have no single longest make streak (they are all of length $2$), but $2$ is considered the "longest" in this case.
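Assumptions C and D amount to computing the longest run of makes in a sequence. A minimal sketch in Python, where the function name `longest_make_streak` and the string encoding (`M` for make, `x` for miss) are illustrative choices rather than anything fixed by the question:

```python
def longest_make_streak(shots: str) -> int:
    """Length of the longest run of consecutive 'M' (makes) in a shot string."""
    best = run = 0
    for shot in shots:
        run = run + 1 if shot == "M" else 0  # a miss resets the current run
        best = max(best, run)
    return best

# The example sequences from assumptions C and D.
for s in ["MMMxxxMMxx", "xxMMMMMxxx", "MMMMxMMMMx", "MMxMMxMMxx"]:
    print(s, longest_make_streak(s))
```

A sequence counts as a success for the main question when this value is $5$ or more.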

A bonus question related to this main question is which is the shooter more likely to get, $8$ in a row or all $10$ in a row?

Thank you.

David
  • 1,702
  • 1
    My solution here http://math.stackexchange.com/questions/59738/probability-for-the-length-of-the-longest-run-in-n-bernoulli-trials/59749#59749 uses generating functions, not recurrences but it will give you the answer. For $n=10$, $p=3/4$, and $m=5$ we get the chance $2187/4096$ or about $.534$. –  Sep 13 '14 at 14:04
  • I noticed that $2187/4096$ is simply $3 * (3/4)^6$. I wonder if this is a coincidence or if it has some significance to the solution to this problem. – David Sep 22 '14 at 09:44

5 Answers

2

We can get an explicit expression by considering the cases where the consecutive successful free throws occur between two immediately flanking failures, or they do not (as in the case where the shooter begins or ends the streak at the beginning or end of the 10 trials).

In the first case, we see that we must calculate the sum $$\sum_{m=0}^3 (4-m) p^{5+m} (1-p)^2 = p^5(4-5p+p^5).$$ In the second case, we must calculate $$p^{10} + \sum_{m=0}^4 2p^{5+m}(1-p) = p^5 (2-p^5).$$ Thus our desired probability for a general $p \in [0,1]$ is given by $$p^5(4-5p+p^5)+p^5(2-p^5) = p^5(6-5p),$$ which for $p = 0.75$ is $2187/4096$.
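The two sums and the closed form above can be checked with exact rational arithmetic. The following is only a sketch; the function names `flanked_case`, `edge_case`, and `closed_form` are made up for illustration and do not come from the answer.

```python
from fractions import Fraction

def flanked_case(p):
    # Streak of 5+m makes with a miss on both sides; it fits in (4 - m) positions.
    return sum((4 - m) * p**(5 + m) * (1 - p)**2 for m in range(4))

def edge_case(p):
    # All 10 made, or a streak of 5+m makes touching the first or last shot
    # with a single miss on its inner side.
    return p**10 + sum(2 * p**(5 + m) * (1 - p) for m in range(5))

def closed_form(p):
    return p**5 * (6 - 5 * p)

p = Fraction(3, 4)
total = flanked_case(p) + edge_case(p)
print(total, closed_form(p))  # both print 2187/4096
```

Using `Fraction` instead of floats keeps the comparison exact, so the agreement is an identity check rather than a rounding coincidence.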


In case one is interested, the following Mathematica code calculates the above polynomial via direct enumeration:

Total[Times @@ # & /@ Cases[Tuples[{p, 1-p}, 10],{___,p,p,p,p,p,___}]] // Simplify

And the subsequent command

% /. p -> 3/4

gives the probability for $p = 0.75$. Note that the speed of this command scales poorly with the number of trials, since for each additional trial there is a doubling of the length of the list generated by Tuples[]. However, generalizing the above sums is easy.

David
  • 1,702
heropup
  • 135,869
  • Answers like this are why I changed from a math major to a computer major. Solving this question on a computer is a "simple" simulation, just counting up the number of "good" events. I wouldn't even have to know anything about probability. I would just nest some loops, generating a string of length 10, and look for certain substrings of certain length and then just count 'em up. Since today's computers are fairly fast, the simulation should only take a few seconds to run, I would think. This math formula is rather "involved" for what seems to be a fairly "simple" question semantically. – David Sep 13 '14 at 17:07
  • @David It's ironic that you say that, since the way I evaluated the double sum, and checked it for accuracy, was with a computer. The sum itself is not that hard to understand from a mathematical perspective, nor is it difficult to grasp the intuition behind it. For instance, in the first case, regard a run of $5+m$ successes and the $2$ failures (one preceding, one following) as fixed, leaving $10-(5+m)-2 = 3-m$ random outcomes, which are binomially distributed. – heropup Sep 13 '14 at 19:25
  • I was just saying that a simple computer simulation using a string of length 10 is a simple way to solve this problem for someone that doesn't know much (or anything) about probability. I would think that many college students would not get this correct unless maybe they were advanced math majors. Of course a computer simulation using simple loops to simulate all the outcomes is only practical if the number of outcomes is reasonable and 10 shots has a reasonable # of outcomes. If I had instead said at least 50 out of 100, then a simple simulation described previously might not be practical. – David Sep 13 '14 at 21:05
  • My simple computer simulation would be 10 nested loops, each running from 0 to 3. 0 = miss and 1 thru 3 are makes. This is an easy way for me to simulate the 75% f.t. make probability per shot. Then simply count up the maximum length run of non zeros in a row which represents the make streak. If it is 5 or more then count that as a "good" event. When all done counting, divide by 1,048,576 (the number of events and the number of total innermost loop iterations) and you should get the correct answer which is 559,872 out of 1,048,576 which is about 53.39%. – David Sep 13 '14 at 21:14
  • In statistics often this kind of simulation is used but how do you know how accurate your solution is? (yes, you can also calculate standard errors). In general one tries to get exact solutions and if it becomes too difficult then one resorts to simulations. – user103828 Sep 14 '14 at 07:22
  • To me, solving a specific problem by using a computer simulation of it is reassuring and gives me a basis for crosschecking with some other method. One can question any single method of computation so it helps to be able to crosscheck using 2 or more solutions to see if the answers match. My claim is that the computer simulation is rather simple and that checks can be put in place to help verify the correctness of the output. In my simulation, the first "winner" of at least 5 makes in a row would be the string 0000011111 since 0 is a miss and 1..3 are all makes. The last would be 3333333333. – David Sep 14 '14 at 07:55
  • That derived formula of $p^5(6-5p)$ is impressive because it seems to work for this problem and is very concise, especially when compared to the amount of work I had to do manually on a piece of paper. Kudos to the person that came up with this formula. I accepted the answer. One interesting thing though is the answer comes as 2187/4096 but there are only really 1024 possible events since the shooter can either make or not make each of the 10 shots, however, if the denominator was 1024 which is 2^10, then the answer would have to be 546.75/1024 which is not an integer number of good events. – David Sep 14 '14 at 18:36
  • Continuation of previous comment... I guess that has something to do with 75% freethrow percentage whereas if it was 50% instead, perhaps the correct number would be an integer numerator over 1024 since then each shot is basically a coinflip equivalent and you would then be counting the equivalent of 5 or more heads in a row for example. – David Sep 14 '14 at 18:45
  • @David Loosely speaking, yes. Even though there are only $1024$ elementary outcomes, they do not occur with equal probability when $p \ne 0.5$. And since the probability of the desired event must take into account the individual probabilities of seeing each possible outcome, the result will not, in general, be an integer divided by $1024$. – heropup Sep 14 '14 at 20:04
  • I figured the non 50% freethrow percentage was the culprit for not having a 1024 denominator. In my calculations I made it so that the 75% freethrow percentage was simulated by having 3 make balls and 1 miss ball for each shot, thus giving 4 "virtual" outcomes for each actual shot (3 are always makes and 1 is always a miss). This then allows me to simply run a computer simulation and just count up the number of good outcomes. In this case 559,872/1,048,576 which simplifies to 2187/4096 which is the correct answer. I didn't actually run the computer simulation but I suspect it would work fine. – David Sep 14 '14 at 20:53
  • I did the calculations on paper and derived the same formula but there are a lot of intermediate results not shown in the summation above. When done out on paper, the p^7, p^8, p^9, and p^10 terms cancel out and we are only left with p^5 and p^6 terms. Another way of expressing the answer is 6(p^5)-5(p^6) which is what I got on paper. It seems hard for the reader to see how the final expression is derived just from those 2 summations. If I knew how to do the fancy formatting I would illustrate it here but I am a rookie so I don't. Just break it into cases (5,6,7...) and it is easier to solve. – David Sep 18 '14 at 21:33
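The counting loop described in the comments above (4 equally likely "virtual" outcomes per shot, 0 for a miss and 1 through 3 for makes) can be sketched in Python as follows. This is one possible rendering, not code anyone in the thread posted: `itertools.product` stands in for the 10 nested loops, and the helper name `has_streak` is hypothetical.

```python
from itertools import product

def has_streak(shots, need=5):
    """True if the outcome tuple contains at least `need` consecutive makes."""
    run = 0
    for s in shots:
        if s != 0:          # 1, 2, 3 are makes
            run += 1
            if run >= need:
                return True
        else:               # 0 is a miss and resets the run
            run = 0
    return False

total = 4 ** 10  # 1,048,576 equally likely virtual outcomes
good = sum(1 for shots in product(range(4), repeat=10) if has_streak(shots))
print(good, total, good / total)
```

Because every virtual outcome has the same probability, `good / total` is an exact enumeration rather than a random simulation, which is why it can be compared directly with $2187/4096$.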
1

The probability of the shooter getting at least five in a row is the sum of the probabilities of getting 5, 6, 7, 8, 9, or 10 in a row. These probabilities assume that any shot that is not part of the make sequence is missed. This is not what the original author asked so please be aware of that. \begin{align*} p_5 &= 6* (0.75)^5 * (0.25)^5 \\ p_6 &= 5* (0.75)^6 * (0.25)^4\\ p_7 &= 4* (0.75)^7 * (0.25)^3\\ p_8 &= 3* (0.75)^8 * (0.25)^2\\ p_9 &= 2* (0.75)^9 * 0.25\\ p_{10} &= (0.75)^{10} \end{align*} $P= p_5+p_6+p_7+p_8+p_9+p_{10} = 0.126$ or 12.6%.

Bonus: $p_8= 3*0.75^8 * 0.25^2= 0.018$ and $p_{10}= 0.75^{10} = 0.056$. So, under this restricted model, the probability of shooting 10 straight is higher.
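For reference, the arithmetic of this restricted model (every shot outside the streak is a miss, which is not the event the question asks about) can be reproduced with a few lines of Python. The variable names are illustrative only.

```python
p, q, n = 0.75, 0.25, 10

# Restricted model only: one streak of k makes and all other shots missed.
# A streak of length k can start at (n - k + 1) different positions.
restricted = {k: (n - k + 1) * p**k * q**(n - k) for k in range(5, n + 1)}

total = sum(restricted.values())
print(round(total, 3))
```

The result is far below $2187/4096 \approx 0.534$ because the question allows the remaining shots to be makes or misses.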

David
  • 1,702
  • This seems correct but the author asks for a recursive proof (I think he implies that he already has found your ''divide and conquer method''). – user103828 Sep 13 '14 at 14:04
  • 2
    This assumes he misses any shot that is not part of the streak. – Empy2 Sep 13 '14 at 14:10
  • These calculations are wrong. Look at P(8) for example. To get 8 in a row, the shooter can actually sink 9 shots so the formula you use is wrong because you are assuming he will make exactly 8 and miss exactly 2 but that is not required. For example, he can make the first 8 shots, miss the 9th shot, and make the 10th shot. There he made 9 shots total out of 10 but the longest streak of makes is 8 in a row. Also you have P(8) = 0.018 and P(10) = 0.056 so how could P(5) + P(6) + P(7) + P(8) + P(9) + P(10) = 0.07 if P(8) + P(10) is already 0.074? Your P(10) is right but the others seem wrong. – David Sep 13 '14 at 16:20
  • Yes this seems like you are calculating the probability of getting a streak of 5, 6... but missing all of the other shots. That is not what the original question asked for. Note that even for 5 in a row, the shooter can sink (make) 9 shots out of the 10. For example, make the first 5, miss the 6th, make the last 4. Another variation is miss the first 2 shots, make the next 5, miss the 8th shot, make the 9th, miss the 10th. You solved a much simpler question but not the one I asked. – David Sep 13 '14 at 16:46
0

$P(5)=\text{Prob}(\text{hits 5 but misses the previous and following shots})$
$$=0.75^5\cdot 0.25+4\cdot 0.25\cdot 0.75^5\cdot 0.25+0.25\cdot 0.75^5=\left(\tfrac34\right)^5\left(\tfrac14+\tfrac4{16}+\tfrac14\right)=\frac{729}{4096}$$
$$P(6)=\left(\tfrac34\right)^6\left(\tfrac14+\tfrac3{16}+\tfrac14\right)$$
and so on.
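The "and so on" can be spelled out as a short Python sketch. The function name `exact_streak_prob` is hypothetical, and the formula is only claimed for the case $n \le 2k$ (true here, with $n=10$ and $k \ge 5$), where the shots outside the streak are too few to form another streak of that length.

```python
from fractions import Fraction

def exact_streak_prob(k, n=10, p=Fraction(3, 4)):
    """P(longest make streak is exactly k); assumes n <= 2*k."""
    q = 1 - p
    if k == n:
        return p**n
    ends = 2 * p**k * q                    # streak touches the first or last shot
    interior = (n - k - 1) * p**k * q**2   # streak has a miss on both sides
    return ends + interior

probs = {k: exact_streak_prob(k) for k in range(5, 11)}
for k, v in probs.items():
    print(k, v, float(v))
print(sum(probs.values()))  # at least 5 in a row
```

Summing the six values gives the probability of at least $5$ in a row.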

Empy2
  • 50,853
0

Let $p$ be the probability of scoring and let $P_{m,n}$ be the probability of making at least $m$ in a row from $n$ shots. Conditioning on the first throw, $$ P_{m,n} = (1-p)P_{m,n-1} + p[(1-p)P_{m,n-2}+p(1-p)P_{m,n-3}+\ldots+p^{m-2}(1-p)P_{m,n-m}+p^{m-1}] $$ with initial conditions $$ P_{m,m} = p^m \qquad P_{m,m+1} = p^m +(1-p)p^{m} $$ Hence when $p=0.75$, $m=5$, and $n=10$, $$ P_{5,5} \approx 23.7\% \qquad P_{5,6} \approx 29.7\% \qquad P_{5,7} \approx 35.6\% \qquad P_{5,8} \approx 41.5\% \qquad P_{5,9} \approx 47.5\% \qquad P_{5,10} \approx 53.4\% $$ For bonus: $$ P_{10,10} \approx 5.6\% \\ P_{8,8} \approx 10.0\% \qquad P_{8,9} \approx 12.5\% \qquad P_{8,10} \approx 15.0\% $$
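A minimal Python sketch of this recurrence, written as $P_{m,n} = p^m + \sum_{j=1}^{m} p^{j-1}(1-p)\,P_{m,n-j}$ with $P_{m,k}=0$ for $k<m$ (the same conditioning on the position of the first miss). The function name `prob_streak` and the use of memoization are my own choices, not part of the answer.

```python
from fractions import Fraction
from functools import lru_cache

def prob_streak(m, n, p):
    """P(at least m makes in a row within n shots), make probability p."""
    q = 1 - p

    @lru_cache(maxsize=None)
    def P(k):
        if k < m:
            return 0 * p          # too few shots left for a streak of m
        total = p**m              # the first m shots are all makes
        for j in range(1, m + 1): # first miss occurs on shot j
            total += p**(j - 1) * q * P(k - j)
        return total

    return P(n)

p = Fraction(3, 4)
for n in range(5, 11):
    print(n, float(prob_streak(5, n, p)))
print(float(prob_streak(8, 10, p)), float(prob_streak(10, 10, p)))
```

Passing `p` as a `Fraction` keeps the results exact; a float works too if approximate percentages are enough. Unlike the case-by-case counts, this handles larger inputs such as 12 shots and 6 in a row without any changes.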

David
  • 1,702
user103828
  • 2,368
0

I also got the equivalent of 2187 / 4096 which is about 53.4% using a "brute force" placeholder divide and conquer method on paper. I actually got 559,872 / 1,048,576 which simplifies to 2187 / 4096. This problem (because of its small number of shots), can also be solved by using a computer simulation rather easily and quickly to help verify the correct answer.

For the bonus question, if my math is correct, getting 8 in a row is exactly as likely as getting all 10 in a row! Someone might think getting all 10 in a row is harder than just getting 8 in a row but to me they seem exactly as likely. I used a rather unusual math method to solve this in which I assume that since the f.t. % is 75%, that is kinda like having 4 basketballs, 3 orange ones and 1 multicolor (red, white and blue). It can be assumed that he will always make the orange basketball shot and always miss the multicolored shot and that he is handed random basketballs. By making this assertion, it is simpler for me to solve cuz I then don't have to work with fractions such as (3/4) and powers of them.

So using my "simplified" math, there are 4 possible outcomes for each shot... namely, he makes any of the 3 possible "orange shots" or he misses the "multicolored shot" so for 10 shots, the total number of possible outcomes is 4^10 = 1,048,576 although any of the 3 orange shots made can be interpreted as identical but they still count separately as part of the 1,048,576 possible total outcomes.

Now using this method, let's analyze what happens when we look for 10 makes in a row using placeholders and the following symbol placeholder definitions:

\* = make = 3 chances (out of 4 possible outcomes).

X = miss = 1 chance (out of 4 possible outcomes).

\- = any outcome = don't really care = 4 chances (out of 4 possible outcomes).

P(10) = ********** and since there are 3 orange balls for each shot, we get 3^10 = 59,049 good outcomes, which is about 5.63% of 1,048,576.

P(8) I broke down into 3 subcases namely:

Case 1 is ********X- which is make the first 8, miss the 9th, and don't care about the 10th (it can either be a make or a miss but it doesn't change the fact that this is 8 in a row).

The math for this is (3^8) * 1 * 4 = 26,244

Case 2 is X********X which is miss the first and last only.

The math for this is 1 * (3^8) * 1 = 6,561.

Case 3 is -X******** which is the reversal of case 1.

The math for this is 4 * 1 * (3^8) = 26,244

26,244 + 6,561 + 26,244 = 59,049. This is EXACTLY the same number of good outcomes as P(10) which is 3^10 so they are equally likely!

It just so happens that (3^10) = ((3^8) * 4) + 3^8 + ((3^8) * 4) because what you get on the right side of the equal sign is (3^8) * 9 but 9 is 2 additional powers of 3 so that is the same as 3^10.

It seems a little strange to me that getting 10 in a row and 8 in a row is equally likely but it kinda makes sense cuz he is much more likely to make a shot than to miss it but there is only one way to make all 10 but there are several ways to make 8 in a row so it "balances" out. I must say it is a coincidence they are exactly equal.

Note that the original question is not trivial at all, unlike, say, the probability of making any 5 shots of the 10. Adding the "in a row" clause makes the math much more involved.

Also note that P(5), P(6), P(7), and P(9) can be computed using this same method and easily on a single sheet of paper (drawing out the placeholders helped me a lot in solving it). Of course P(x) here means exactly x shots in a row made.

P(5) has 6 different subcases:

*****X----

X*****X---

-X*****X--

--X*****X-

---X*****X

----X*****

Remember a "-" is a placeholder meaning "don't care" so it can either be a make (*) or a miss (X). Note that P(5) is about 17.8% and P(5) means 5 in a row (maximum make streak length).
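The placeholder bookkeeping above can be automated with a short Python sketch. It is only an illustration of the same counting idea: the names `WEIGHT`, `patterns`, and `count` are invented here, and the pattern generator assumes the streak length is at least half the number of shots, so the "don't care" slots cannot hide another streak of that length.

```python
WEIGHT = {"*": 3, "X": 1, "-": 4}  # make, miss, don't care (ways out of 4 balls)

def patterns(k, n=10):
    """Placeholder patterns whose longest make streak is exactly k (assumes n <= 2*k)."""
    if k == n:
        return ["*" * n]
    out = []
    for start in range(n - k + 1):
        left = "-" * (start - 1) + "X" if start > 0 else ""
        right_len = n - start - k
        right = "X" + "-" * (right_len - 1) if right_len > 0 else ""
        out.append(left + "*" * k + right)
    return out

def count(pattern):
    """Number of 4-ball outcomes matching a placeholder pattern."""
    total = 1
    for ch in pattern:
        total *= WEIGHT[ch]
    return total

counts = {k: sum(count(pat) for pat in patterns(k)) for k in range(5, 11)}
for k, c in counts.items():
    print(k, patterns(k), c)
print(sum(counts.values()), 4 ** 10)
```

Dividing any count by 4^10 = 1,048,576 gives the corresponding probability.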

David
  • 1,702
  • I was wondering if someone could comment on my solution. I tried to make it simple by breaking it into smaller subproblems (such as solve for P(5), P(6)...). It is interesting to see that P(5) is the most likely and P(9) is the least likely. P(8) and P(10) are equally as likely which is also interesting. One might think that making all 10 shots in a row is the "hardest" but in reality it is not. Making 9 in a row is the hardest cuz there are only 2 ways to do it (miss the first or miss the last) and make all the others. The chances of that happening are 2/3rds that of making all 10. Cool. – David Sep 13 '14 at 17:20
  • It looks okay but it's ad hoc (what would you do if instead of 10 shots and 5 in a row, you had 12 shots and 6 in a row?!). See my solution if you want to use a recursive formula like your teacher suggested that generalizes. – user103828 Sep 14 '14 at 07:13
  • For the 12 shot 6 in a row variation, I would just recompute using my same simplistic method. As long as the numbers are reasonably manageable, the "ad hoc" approach works. If the problem was out of 100 shots, what are the chances of getting at least 50 in a row, my method would not be practical. Sometimes a specific solution is all one needs but I agree a generic solution is more robust/flexible. – David Sep 14 '14 at 08:05
  • Yes, with small numbers you can just use your method but even with smaller numbers it is easy to miss a case. – user103828 Sep 14 '14 at 08:17
  • Also, my "quadrupling" of the # of outcomes per shots works in this case because the f.t. % is 75% which can be simulated by changing one 75% outcome each shot to 4 outcomes, 3 that are makes and 1 that is a miss. If it had been some "oddball" % like 74.5% I could not use that simplification to help remove fractions, however the other formulas here could be used with any probability so my technique is very restrictive that way. – David Sep 14 '14 at 14:40