17

I just received a probability problem from a friend via a text and by the time I took to read it, I was sent a solution as well - which is confusing.. The question goes like this..

A person shoots basketball 100 times. First time he scores a point and second time he misses it. For the following shots, the probability of him scoring a point is equal to number of points scored before this shot divided by number of shots taken before this shot, for ex: if he is into his 21st shot and he has scored 13 points in the first 20 shots then the probability of him scoring in 21st shot is 13/20. What is the probability of scoring 66 points in the 100 shots (including the first two)

The confusing solution:

$\dfrac1{(n-1)} \ge \dfrac1{99}$

Can someone please explain the logic behind this?

HymnZzy
  • 273
  • 3
    I think what your friend is hinting at is that all possible point scores, from $1$ to $99,$ are equally likely. – bof Sep 12 '17 at 04:28
  • 1
    In general, if $n\ge2$ shots are taken, all possible scores from $1$ to $n$ are equally likely, i.e., each has probability $\frac1{n-1}.$ Proof by induction on $n$ seems like a good idea. – bof Sep 12 '17 at 04:31
  • 1
    Agree.. but even arriving at (1/(n-1)) requires some logic.. I'm failing to understand it.. – HymnZzy Sep 12 '17 at 04:38
  • 2
    Have you worked out the probability distribution for $n=3$? for $n=4$? for $n=5$? Have you tried mathematical induction? Or have you actually worked out a mathematical proof, and you are asking for an intuitive explanation? It's not clear what you're asking. – bof Sep 12 '17 at 04:42
  • 2
    To realize that $1/(n-1)$ is likely the answer, you should explicitly work out the solutions for $3$ throws, $4$ throws, and $5$ throws, like @bof suggested. Then, once you are reasonably suspicious that $1/(n-1)$ is true for every case, you should try to prove it using induction. – Zubin Mukerjee Sep 12 '17 at 05:00
  • @bof.. Yes.. I checked it out, it works I was more looking into how one can arrive at that solution.. In my prime (read schooling) days I would have solved it but haven't touched on probability for quite sometime.. – HymnZzy Sep 12 '17 at 05:24
  • @ZubinMukerjee. You answer is sound. – HymnZzy Sep 12 '17 at 05:24
  • 1
    Reference: This is Putnam exam 2002 problem B1. You can search for additional solutions online. –  Sep 12 '17 at 13:17

7 Answers7

21

Let $P(x, n)$ be the probability that you have $x$ points after $n$ throws. We are given that $$P(1,2) = 1$$

You are also given information about the probability of scoring at any step, if you know the number of points previously scored. In particular:

$$P(\text{scoring from } x-1, n-1 ) = \frac{x-1}{n-1}$$

$$P(\text{scoring from } x, n-1 ) = \frac{x}{n-1}$$


Now, note that in order to have $x$ points after $n$ throws, we can either have $x-1$ points after $n-1$ throws, then score a point, or have $x$ points after $n-1$ throws, then not score a point.

We can write this as follows:

$$P(x,n) = P(x-1,n-1)P(\text{scoring from } x-1, n-1 ) + P(x,n-1)(1 - P(\text{scoring from } x, n-1 ))$$

This is the same as

$$P(x,n) = P(x-1,n-1)\left(\frac{x-1}{n-1}\right) + P(x,n-1)\left(1 -\left(\frac{x}{n-1}\right)\right)$$ $$P(x,n) = P(x-1,n-1)\left(\frac{x-1}{n-1}\right) + P(x,n-1)\left(\frac{n-1 - x}{n-1}\right)$$

Call this last result "potato", because we will refer to it later as such.


Statement:

For all $n > 1$ and $1 \leq x \leq n-1$, $$P(x,n) = \frac{1}{n-1}$$

Proof:

Our base case is $n=2$, for which we are given that $$P(1,2)=1$$

For the inductive step, we assume that for some $k \geq 2$, the following is true for all $x$ between $1$ and $k-1$, inclusive: $$P(x,k) = \frac{1}{k-1}$$

Now suppose $y$ satisfies $2 \leq y \leq k-1$, and we wish to find $P(y, k+1)$. By the previous result "potato", we can write

$$P(y,k+1) = P(y-1,k)\left(\frac{y-1}{k}\right) + P(y,k)\left(\frac{k - y}{k}\right)$$

But we know from our inductive assumption that $P(y,k) = P(y-1,k) = 1/(k-1)$. Therefore,

$$P(y,k+1) = \left(\frac{1}{k-1}\right)\left(\frac{y-1}{k}\right) + \left(\frac{1}{k-1}\right)\left(\frac{k - y}{k}\right)$$

$$P(y,k+1) =\frac{y-1}{k(k-1)} + \frac{k-y}{k(k-1)}$$

$$P(y,k+1) = \frac{k-1}{k(k-1)}$$

$$P(y,k+1) = \frac{1}{k}$$

We are finished for the cases when $2 \leq y \leq k-1$, but we still have to consider the edge cases $y=1$ and $y=k$.

After $k+1$ throws, in order to have only $1$ point, we must miss every throw from the third to the $(k+1)^\text{th}$. The probability of missing the third throw is $1/2$. Given this, the probability of missing the fourth throw is $2/3$. Given this, the probability of missing the fifth throw is $3/4$, and so on. We get a telescoping product:

$$P(1, k+1) = \prod_{j=1}^{k-1} \frac{j}{j+1} = \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4} \cdots \frac{k-2}{k-1} \cdot \frac{k-1}{k} = \frac{1}{k}$$

Similarly, after $k+1$ throws, in order to have exactly $k$ points, we must get a point on every throw from the third to the $(k+1)^\text{th}$. The probability of getting a point on the third throw is $1/2$. Given this, the probability of getting a point on the fourth throw is $2/3$. Given this, the probability of getting a point on the fifth throw is $3/4$, and so on. We once again have a telescoping product:

$$P(k, k+1) = \prod_{j=1}^{k-1} \frac{j}{j+1} = \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4} \cdots \frac{k-2}{k-1} \cdot \frac{k-1}{k} = \frac{1}{k}$$

This completes the inductive step, so we've finally shown that for all $n >1$ and $1 \leq x \leq n-1$, $$\boxed{P(x,n) = \frac{1}{n-1}\,}$$


In particular, your question asked for $$P(66, 100) = \frac{1}{99}$$

  • I feel as though such a simple final answer should have a slick direct proof, as opposed to this awfully lengthy proof by induction ... – Zubin Mukerjee Sep 12 '17 at 05:16
  • It's not really that lengthy! The pain comes from two places. First, doing the calculations very carefully step by step; that's kind to the reader but you could easily miss out a lot of the explicit calculation and make it look shorter :-). Second, you have those edge cases -- but they don't need to be nearly so edgy. You can compute them the same way as the non-edge cases! "But then e.g. when you compute P(1,k+1) there'll be a factor P(0,k) which isn't 1/(k-1)." Yes, but it is multiplied by zero so that doesn't matter. – Gareth McCaughan Sep 12 '17 at 10:11
  • There should be only one edge case (y=1) and it is not much of an edge case nor particularly tricky (since it has the same recurrence rule, just one of the lower probabilities is zero). You do not need an infinite product at any stage of the proof. You know by the inductive hypothesis what P(1, k) is. The remaining complexity is that some of the algebra is quite laboured (not a bad thing) but you also repeat yourself a bit (eg P(x-1, n-1) probably doesn't need mentioning at the start). @GarethMcCaughan has anticipated some of my remarks, but he won't mind. – Francis Davey Sep 12 '17 at 11:20
  • 1
    I find it strange that you would box an equation but not label an equation with \tag{1} when you refer to it later on in the post. – Kyle Kanos Sep 12 '17 at 12:39
  • @KyleKanos Thank you! I didn't know about \tag, I will definitely make use of that in the future. I knew that in TeX files, \begin{equation} gives the equation a label by default, whereas \begin{equation*} does not, but I had never used \tag – Zubin Mukerjee Sep 12 '17 at 13:58
10

Among the 98 shots from 3rd to 100th we need 65 scoring shots. These can be chosen in $\binom{98}{65}$ ways. Let us compute the probability of one of such arrangements of scoring shots. The numerator contributed by the scoring shots contains all the numbers from 1 to 65 and the denominator is $99!$. Consider the numerator contributed by the non scoring shots. These are of the type $k - m_k$ where $m_k$ is the score up to the $k-1$ shots. The maximum of these numbers is 33. Note that no two of these can be equal. Thus we have 33 slots and we have 33 numbers to fill and hence all the numbers from 1 to 33 are contributed by the non scoring shots. Thus the probability of any such arrangement is the same and equals $\frac{33!65!}{99!}$. Since there are $\binom{98}{65}$ such arrangements, the required probability is $\binom{98}{65} \frac{33!65!}{99!} = \frac{1}{99}$

  • Can you explain this "The numerator contributed by the scoring shots contains all the numbers from 1 to 65 and the denominator is 99!" – HymnZzy Sep 12 '17 at 06:23
  • The numerator for a scoring shot is the score obtained so far. Hence it will be a number between 1 and 65. Note that all the 65 numbers will appear (since the score increases by 1 at a scoring shot) –  Sep 12 '17 at 06:24
  • The denominator at the $k$th shot is $(k-1)$, $k = 3, 4, \ldots, 100$ –  Sep 12 '17 at 06:44
6

I think the existing answers are fine, but here's one that's shorter than the induction proof in Zubin Mukerjee's answer and maybe more elementary than the counting argument in Muralidharan's, and that (I think) gives more sense of why the thing is true.

The idea is to make a new thing that's "obviously" distributed the same way as the number of hits after $n$ shots, and that also is "obviously" equal to $1/(n-1)$.

So, suppose we have an infinite sequence of (independent) random numbers $x_1,x_2,x_3,\dots$ -- they can come from any nice smooth distribution you like, because all we'll care about is which ones are bigger than which other ones, but for concreteness let's say uniform between 0 and 1. And let's write $a_n$ for the position of $x_1$ among the first $n-1$ of these numbers when written in increasing order: so $a_n=1$ if $x_1$ is the smallest of them, and $a_n=n-1$ if $x_1$ is the largest. Equivalently: $a_n$ is the number of $\{\,x_1,\dots,x_{n-1}\,\}$ that are $\leq x_1$. Note that the probability that any two $x_j$ are equal is zero, and I'll just ignore that possibility in what follows.

First: clearly $a_n$ is equally likely to be any number from $1$ to $n-1$, because these numbers are picked independently from the same distribution: there's no reason why any of them should be more likely to be (say) 37th than any other.

Second: I claim that $a_n$ is distributed the same way as the number of hits after $n$ shots, for $n\geq2$. When $n=2$ these are both always 1. When going from $n$ to $n+1$, $a_n$ either doesn't change (if $x_n>x_1$) or increases by 1 (if $x_n<x_1$). And the probability of the latter is exactly $a_n/n$, which is the same as the probability that the number of hits increases by 1. So, by induction, the two always match.

(Why is that probability $a_n/n$? Because there are $n$ equally likely positions for $x_n$ relative to the first $n-1$ numbers, and $a_n$ of them make $x_n<x_1$ by definition of $a_n$.)

This proof is "really" the same as Zubin Mukherjee's: all the calculations are the same. (Except for the edge cases, which can be done more easily than in Zubin's answer as it currently stands: I'll leave a comment there to point out how.) Its real advantage (in my eyes) is that once you understand it, it makes it obvious that the answer is what it is.

4

This answer is probably too advanced for OP or anyone who doesn't know Bayesian analysis, but it's an interesting thing to notice, and to me (and probably to many others) the first thing that comes to mind, so I shall give it here for completeness.

For the following shots, the probability of him scoring a point is equal to number of points scored before this shot divided by number of shots taken before this shot

So if the player has scored $a$ shots and missed $b$ shots, the probability of hitting the next one is $\frac{a}{a+b}$. This is actually what you would get if you did Bayesian inference with prior distribution $\mathrm{Beta}(0,0)$ for the probability that the player scores. After the first two shots, the distribution is $\mathrm{Beta}(1,1)$, that is, the uniform distribution. The question then becomes:

The player scores with unknown probability $p$. The prior distribution for $p$ is uniform on $[0,1]$. What is the probability that he scores exactly $65$ times on the next $98$ tries?

In other words, if $X$ is the number of points scored, then $X \sim \mathrm{Bin}(98,p)$, where $p$ has a uniform prior distribution.

It is an interesting fact that when $X \sim \mathrm{Bin}(n,p)$ with a uniform prior on $p$, the predictive distribution on $X$ is uniform in $\{0,1,\dots,n\}$. There are many ways to see this, but the most straightforward is just to calculate the probability.

$P(X=65) = \int_0^1 P(X=65|p) \, dp = \int_0^1 \binom{98}{65} p^{65} (1-p)^{98-65} dp $

Now we can recognize the beta function here or plug this to WolframAlpha or similar to get $1/99$.

JiK
  • 5,815
  • 19
  • 39
  • Glad that you mentioned it is too advanced for me.. :) As I mentioned earlier in the comments, I last interacted with probability (with mind) was about 12 years ago.. Muralidharan's first answer jogged my memory :D – HymnZzy Sep 12 '17 at 13:01
1

Whatever is the score, the probability is still $\dfrac{1}{99}$. This can be seen as follows:

If the score is $k$, all numbers from 1 to $k-1$ appear in the numerator. Now consider the non scoring shots. Suppose that the second non scoring shot occurs as the $p$th shot. This means, up to $p-1$ shots, the score is $p-2$ and the $p$th shot is non scoring. Thus the probability is $\dfrac{(p-1) - (p-2)}{p-1}= \dfrac{1}{p-1}$ and this contributes the integer 1 to the numerator. Now consider the third non scoring shot. If this is the $q$th shot, up to $q-1$ shots, we have $q-3$ scoring shots and hence this shot contributes the number 2 to the numerator. Thus at the $m$th non scoring shot, we get the number $m-1$. Hence the numerator is $(k-1)!(99-k)!$. Thus the probability is $$\frac{\binom{98}{k-1}(k-1)!(99-k)!}{99!} = \dfrac{1}{99}$$

1

Here is a slightly shorter version of Zubin Mukerjee's solution, with a (I believe) simpler argument for the edge cases $P(1,N)$ and $P(N-1,N)$. The arguments are effectively the same up to that point.

We denote the probability that we have a score of $x$ after $N$ shots as $P(x,N)$. The problem states that $$ P(1,2)=1,$$ hence $$ P(0,2) = P(2,2) = 0.$$

We also have that the probability that we score on the $N$-th shot with a previous score $x$ is $\dfrac{x}{N-1}$. We can write $P(x,N)$ as the sum of the chance of scoring on the $N$-th shot with a previous score of $x-1$, and the chance of not scoring the $N$-th shot with a previous score of $x$, or $$ P(x,N) = P(x-1,N-1) \frac{x-1}{N-1} + P(x,N-1)\left(1 - \frac{x}{N-1}\right). $$


We conjecture that, for $N>1$, $$ P(x,N) = \begin{cases} \frac1{N-1} \quad & 1\leq x \leq N-1,\\ \,\,\, 0 & \text{otherwise}. \end{cases} $$

Base Case

Above, we see that $P(1,2) = 1$ and $P(0,2) = P(2,2) = 0$, which satisfies the conjecture.

Induction

Assuming the conjecture, we have $$ P(x,N-1) = \begin{cases} \frac1{N-2} \quad & 1\leq x \leq N-2,\\ \,\,\, 0 & \text{otherwise}. \end{cases} $$ For $2\leq x \leq N-2$, $$ \begin{aligned} P(x,N) &= P(x-1,N-1) \frac{x-1}{N-1} + P(x,N-1)\frac{N-x-1}{N-1}\\ &= \frac1{N-2}\cdot\frac{x-1}{N-1} + \frac1{N-2}\cdot\frac{N-x-1}{N-1} \\ &= \frac1{N-1}. \end{aligned} $$ Edge cases (alternative argument) - for $x=1$, $$ \begin{aligned} P(1,N) &= P(0,N-1) \frac{1-1}{N-1} + P(1,N-1)\frac{N-1-1}{N-1}\\ &= 0 + \frac1{N-2}\cdot\frac{N-2}{N-1}\\ &= \frac1{N-1}. \end{aligned} $$ For $x=N-1$, $$ \begin{aligned} P(N-1,N) &= P(N-2,N-1) \frac{N-2}{N-1} + P(N-1,N-1)\frac{N-(N-1)-1}{N-1}\\ &= \frac1{N-2}\cdot \frac{N-2}{N-1} + 0\\ &= \frac1{N-1}. \end{aligned} $$

Finally, it should be clear that $P(0,N) = P(N,N) = 0$ (since for $N=2$ we have already scored one and missed one).

Steve
  • 127
1

By way of context, this problem maps into the Polya Urn scheme.

In the Polya scheme you start with an urn containing one red ball and one blue. At each step you pick a ball at random from the urn, note its color, and return two balls of this color into the urn. Use arguments similar to those in this question's other answers to conclude that after $n$ steps, the number of red balls in the urn has a uniform distribution over $\{1,\ldots, n+1\}$.

To map the basketball scheme into the urn scheme, regard a scored shot as a red ball and a missed shot as blue. Since the urn starts with a red and a blue, the basketball scheme with $n$ shots corresponds to an urn scheme with $n-2$ steps. So the probability that you've scored $k$ points after after $n$ shots is the probability that there are $k$ red balls in the urn after $n-2$ steps, which is $1/(n-1)$.

grand_chat
  • 38,951