
Let X be the smallest value obtained when k numbers are randomly chosen from the set 1,...,n. Find E[X] by interpreting X as a negative hypergeometric random variable.

This is Self-Test Exercise 7.7 of Sheldon Ross's A First Course in Probability.

The way I approached this is to first consider (for example) P(X = 1). Using the suggestion to take X as a negative hypergeometric random variable, we have

$$P(X = 1) = \frac{\binom{1}{1}\binom{n - 1}{k - 1}}{\binom{n}{k}}$$

My idea was that there is one way to pick the element $1$, and the other $k - 1$ elements come from the remaining $n - 1$ elements. Similarly, for $P(X = 2)$, there is one way of getting the minimum element $2$, and $\binom{n - 2}{k - 1}$ ways of getting the other elements, since they must all be larger than $2$ (so $1$ is excluded as well).

Similarly, we can deduce that

$$P(X = i) = \frac{\binom{n - i}{k - 1}}{\binom{n}{k}}$$

As a result, I got

$$E(X) = \sum_{i = 1}^n i \times \frac{\binom{n - i}{k - 1}}{\binom{n}{k}}$$

The answer in the book is $\frac{n + 1}{k + 1}$, which is just so much more elegant than what I came up with, and I'm not seeing how my answer reduces to theirs.
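As a sanity check that the sum above really does equal $\frac{n + 1}{k + 1}$, here is a small exact computation. It is only a sketch: the function name `expected_min_pmf` is made up for illustration, and it assumes Python 3.8+ for `math.comb` (which returns 0 when the lower index exceeds the upper one, so the terms with $i > n - k + 1$ vanish on their own).

```python
from fractions import Fraction
from math import comb

def expected_min_pmf(n, k):
    """E[X] computed from P(X = i) = C(n-i, k-1) / C(n, k), as an exact fraction."""
    # Sum i * C(n-i, k-1) over i = 1..n; terms with n-i < k-1 contribute 0.
    total = sum(i * comb(n - i, k - 1) for i in range(1, n + 1))
    return Fraction(total, comb(n, k))

# Compare against the book's closed form for a few (n, k) pairs.
for n, k in [(5, 2), (10, 3), (20, 7)]:
    print(n, k, expected_min_pmf(n, k), Fraction(n + 1, k + 1))
```

The two printed fractions agree for each pair, which suggests the formula is right and only the simplification is missing.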

My questions are:

  • Is my approach correct? If not, what mistake(s) did I make in my analysis?
  • If my answer is correct, how do I get to the expected result?
  • You can factor out the $\frac{1}{\binom{n}{k}}$ as that remains constant. Similarly, $k$ and $n$ remain constant in the upper term. What you have then is a sum of binomial coefficients resembling a hockey stick in Pascal's triangle, which has a well-known simplification. – JMoravitz Feb 13 '24 at 18:55
  • The gaps should have equal (expected) length. There are $k+1$ gaps and there are $n-k$ unselected numbers, hence each gap has expected length $\frac {n-k}{k+1}$. Add $1$ to that to get the expected minimum. – lulu Feb 13 '24 at 19:25
  • The techniques used in this question apply here as well. – lulu Feb 13 '24 at 19:28
  • @JMoravitz The trouble I have is with the i (there was a small typo in E(X) which I've fixed) - does the hockeystick identity still work then? – Leaderboard Feb 14 '24 at 05:30

3 Answers

An easier way is to use the complementary CDF, $$P(X > i)=1-F_X(i),$$ to compute the expectation as follows:

$$\mathbb E(X) = \sum_{i = 0}^{n-k}P(X > i),$$

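which is the tail-sum formula $\mathbb E(X)=\sum_{i \ge 0}P(X > i)$, valid for any non-negative integer-valued random variable $X$; here the sum stops at $n-k$ because $P(X > i)=0$ for $i > n-k$. Indeed,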

$$P(X >i) = \frac{\binom{n - i}{k}}{\binom{n}{k}}.$$

Hence,

$$\mathbb E(X) =\sum_{i = 0}^{n-k}\frac{\binom{n - i}{k}}{\binom{n}{k}}=\frac{\binom{n +1}{k+1}}{\binom{n}{k}}=\frac{n +1}{k+1}.$$

To compute the summation, I used the hockey-stick identity.
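If you want to convince yourself of the formula for $P(X > i)$ and of the final sum, a brute-force enumeration over all $k$-subsets works for small $n$. This is a rough sketch only; the helper names `tail_probs_brute` and `tail_probs_formula` are my own, not anything from the book.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def tail_probs_brute(n, k):
    """P(X > i) for i = 0..n-k, by listing every k-subset of {1,...,n}."""
    subsets = list(combinations(range(1, n + 1), k))
    return [
        Fraction(sum(1 for s in subsets if min(s) > i), len(subsets))
        for i in range(n - k + 1)
    ]

def tail_probs_formula(n, k):
    """P(X > i) = C(n-i, k) / C(n, k) for i = 0..n-k."""
    return [Fraction(comb(n - i, k), comb(n, k)) for i in range(n - k + 1)]

n, k = 6, 3
assert tail_probs_brute(n, k) == tail_probs_formula(n, k)
# Summing the tail probabilities gives E(X), which should equal (n+1)/(k+1).
assert sum(tail_probs_formula(n, k)) == Fraction(n + 1, k + 1)
```

The enumeration grows like $\binom{n}{k}$, so it is only meant for small cases.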

Direct method:

You could directly calculate the expectation using the PMF, $P(X= i)$:

$$\mathbb E(X) = \sum_{i = 1}^{n-k+1} i \times P(X = i)=\sum_{i = 1}^{n-k+1} i \times \frac{\binom{n - i}{k-1}}{\binom{n}{k}}.$$

To manage this, use the following

$$\sum_{i = 1}^{n-k+1} i \times \binom{n - i}{k-1}=(n+1)\sum_{i = 1}^{n-k+1} \binom{n - i}{k-1} - \sum_{i = 1}^{n-k+1}(n+1-i) \times \binom{n - i}{k-1}$$

and then use the hockey-stick identity for each summation, but after applying the following to the second:

$$(n+1-i) \times \binom{n - i}{k-1}=k \times \binom{n - i+1}{k}.$$
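For completeness, here is how the remaining steps could be carried out (a worked sketch of the outline above). The hockey-stick identity gives

$$\sum_{i = 1}^{n-k+1} \binom{n - i}{k-1}=\binom{n}{k}, \qquad \sum_{i = 1}^{n-k+1} \binom{n - i+1}{k}=\binom{n+1}{k+1},$$

so that

$$\sum_{i = 1}^{n-k+1} i \times \binom{n - i}{k-1}=(n+1)\binom{n}{k} - k\binom{n+1}{k+1}.$$

Dividing by $\binom{n}{k}$ and using $\binom{n+1}{k+1}\big/\binom{n}{k}=\frac{n+1}{k+1}$,

$$\mathbb E(X) = (n+1) - k \times \frac{n+1}{k+1}=\frac{n+1}{k+1}.$$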

Amir

This approach uses the principle of symmetry to find the expected value.

Distribute the $n$ numbers in ascending order uniformly on the interval $[0,1]$, so they are at $\frac{m}{n+1}$ for $m=1,2,3,\dots,n$.

Similarly, the $k$ sampled numbers are at $\frac1{k+1}, \frac2{k+1}, \frac3{k+1}, \dots, \frac{k}{k+1}$ on average.

(The $n$ numbers partition the scale into $n+1$ equal segments and similarly the $k$ sampled numbers partition the scale into $k+1$ equal segments).

We have to find the point at which the minimum sampled point lies, i.e. $\frac{1}{k+1} = \frac{m}{n+1}$, which gives $$m = \frac{n+1}{k+1}.$$
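The symmetry claim can be checked on small cases by averaging the gaps directly, in the spirit of the gap argument in the comments. The sketch below is only illustrative (the name `mean_gaps` is hypothetical): it enumerates every $k$-subset and averages the number of unselected integers before the first sampled number, between consecutive sampled numbers, and after the last one.

```python
from fractions import Fraction
from itertools import combinations

def mean_gaps(n, k):
    """Average count of unselected integers in each of the k+1 gaps."""
    totals = [0] * (k + 1)
    count = 0
    for s in combinations(range(1, n + 1), k):
        # Pad with sentinels 0 and n+1 so the end gaps are handled uniformly.
        bounds = (0,) + s + (n + 1,)
        for j in range(k + 1):
            totals[j] += bounds[j + 1] - bounds[j] - 1
        count += 1
    return [Fraction(t, count) for t in totals]

n, k = 7, 3
gaps = mean_gaps(n, k)
# Every gap should have the same mean, (n-k)/(k+1).
assert all(g == Fraction(n - k, k + 1) for g in gaps)
# The minimum is the first gap plus one.
assert gaps[0] + 1 == Fraction(n + 1, k + 1)
```

All $k+1$ averages coincide, which is exactly the equal-spacing picture used above.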


This admittedly doesn't answer the question about your method, but here is a completely different approach. Fix $k$ and write $E_n$ for the expectation for $n$. We'll prove $E_n=\frac{n+1}{k+1}$ by induction on $n$; clearly it holds for $n=k$.

If $1$ was one of the integers chosen, which happens with probability $k/n$, the minimum is $1$.

If $1$ was not one of the integers chosen, which happens with probability $1-k/n$, then subtracting $1$ from each integer gives a uniformly random choice of $k$ integers from $1,\ldots,n-1$. The expected value of the minimum of these is $E_{n-1}$, and the minimum of the original set was $1$ more than the minimum of these.

Therefore we get $$E_n=\frac{k}{n}+\frac{n-k}{n}(E_{n-1}+1)=1+\frac{n-k}{n}E_{n-1}.$$ Since $E_{n-1}=\frac{n}{k+1}$ by the induction hypothesis, this gives $E_n=\frac{n+1}{k+1}$.
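If you'd like to see the recurrence in action, it can be iterated exactly starting from the base case $E_k = 1$. This is a minimal sketch with a made-up function name, `expected_min_recurrence`, using exact fractions so there is no rounding.

```python
from fractions import Fraction

def expected_min_recurrence(n, k):
    """Iterate E_m = 1 + ((m - k) / m) * E_{m-1} from E_k = 1 up to m = n."""
    e = Fraction(1)  # base case: with n = k every number is chosen, so the minimum is 1
    for m in range(k + 1, n + 1):
        e = 1 + Fraction(m - k, m) * e
    return e

for n, k in [(4, 3), (10, 3), (12, 5)]:
    assert expected_min_recurrence(n, k) == Fraction(n + 1, k + 1)
```

For example, with $k=3$ the first step gives $E_4 = 1 + \frac14 \cdot 1 = \frac54$, matching $\frac{4+1}{3+1}$.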