
The Wikipedia article on order statistics mentions the following result for the order statistics of a sample of $n$ i.i.d. exponential random variables with rate parameter $\lambda$, where $Z_1, Z_2, \ldots$ are i.i.d. standard exponentials:

$$X_{(i)} = \frac{1}{\lambda}\sum\limits_{j=1}^i \frac{Z_j}{n-j+1} \tag{1}$$

It provides no proof of this. How do I prove it?


My attempt:

We know that to get $X_{(i)}$ for a distribution with inverse CDF $F_X^{-1}$, we first take the corresponding order statistic of the uniform distribution, $U_{(i)}$, and then apply the inverse CDF to it.

We know that $U_{(i)} \sim B(i,n-i+1)$. And the inverse CDF of the exponential distribution is: $F_X^{-1}(x) = -\frac{\log(1-x)}{\lambda}$.

This means that the distribution of $X_{(i)}$ should be: $-\frac{\log(1-U_{(i)})}{\lambda}$

Also, $1-U_{(i)} \sim U_{(n-i+1)}$. So the distribution of the order statistic becomes:

$$X_{(i)}\sim -\frac{\log(U_{(n-i+1)})}{\lambda}$$

We have a Beta inside a logarithm. I don't see a path to equation (1), except maybe expressing the Beta in terms of Gammas and then noting that the Gamma is a sum of exponentials?
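(As a numerical sanity check, not a proof, here is a small Monte Carlo sketch in Python that I am adding; the choices of `n`, `i`, and `lam` are arbitrary. It compares the empirical $i$-th order statistic against the representation in $(1)$.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, i, lam, reps = 10, 4, 2.0, 100_000

# Direct: i-th order statistic (1-indexed) of n Exp(lam) samples, per row.
direct = np.sort(rng.exponential(1 / lam, size=(reps, n)), axis=1)[:, i - 1]

# Representation (1): sum_{j=1}^i Z_j / (lam * (n - j + 1)) with Z_j ~ Exp(1).
Z = rng.exponential(1.0, size=(reps, i))
weights = 1 / (lam * (n - np.arange(1, i + 1) + 1))
rep = Z @ weights

print(direct.mean(), rep.mean())  # the two sample means should agree closely
```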

    It can be formally shown using a transformation as used here, which is essentially a result due to Sukhatme: https://stats.stackexchange.com/a/272509/119261. Also discussed at https://math.stackexchange.com/a/509909/321264 and its linked posts. – StubbornAtom Dec 30 '21 at 18:41

2 Answers


This can be shown quite directly. Fix an $i$. Then note that $$ P(X_{(i+1)} -X_{(i)}> x| X_{(i)} = y) = P(X_{(i+1)} > x+y|X_{(i) } = y).$$

Now, the latter probability is the probability that the smallest of the remaining $n-i$ exponential random variables exceeds $x+y$, given that each is at least $y$. But, due to independence and the memorylessness of the exponential distribution, this works out to the probability that $n-i$ independent exponentials all exceed $x$, i.e. $$ P(X_{(i+1)} - X_{(i)} > x| X_{(i)} = y) = e^{-\lambda (n-i) x}. $$ But since the right-hand side is independent of $y$, we can integrate over the conditioning and conclude that $X_{(i+1)} - X_{(i)}$ has the distribution of an $\mathrm{Exp}( \lambda (n-i))$ random variable. Further, this random variable is independent of $X_{(i)}$.

In fact, yet more is true - instead of conditioning over only the value of $X_{(i)},$ we could have conditioned on the values of $X_{(0)} := 0, X_{(1)}-X_{(0)}, X_{(2)} - X_{(1)}$ and so on all the way to $X_{(i)} - X_{(i-1)}$ without changing the conclusion or the reasoning (again, this critically relies on the memorylessness of the exponential law). This means that

  1. $ \{ X_{(i+1)} - X_{(i)} \}_{i = 0}^{n-1} $ are mutually independent,
  2. $X_{(i+1)} - X_{(i)}$ is $\mathrm{Exp}(\lambda (n-i))$ distributed, which in turn is the same as the law of $Z/(\lambda(n-i))$ for a standard exponential $Z$.

But now the result is obvious: \begin{align} X_{(i)} &= \sum_{j = 0}^{i-1} \left( X_{(j+1)} - X_{(j)} \right) \\ &\equiv \sum_{j = 0}^{i-1} \frac{Z_j}{\lambda (n - j)},\end{align} which is exactly $(1)$ after the reindexing $j \mapsto j+1$.
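(Not part of the original answer, but the two claims above are easy to check by simulation. The following Python sketch, with arbitrary choices of `n` and `lam`, compares the empirical mean of each spacing with $1/(\lambda(n-i))$, and checks that consecutive spacings are uncorrelated, a necessary condition for independence.)

```python
import numpy as np

rng = np.random.default_rng(1)
n, lam, reps = 6, 1.5, 200_000

# Order statistics of n i.i.d. Exp(lam) samples, row by row.
X = np.sort(rng.exponential(1 / lam, size=(reps, n)), axis=1)

# Spacings X_(i+1) - X_(i), with X_(0) := 0.
gaps = np.diff(X, axis=1, prepend=0.0)

# Claim 2: the i-th spacing (0-indexed) is Exp(lam*(n-i)), so its mean is 1/(lam*(n-i)).
expected = 1 / (lam * (n - np.arange(n)))
print(np.round(gaps.mean(axis=0), 3))
print(np.round(expected, 3))

# Claim 1 (necessary condition): consecutive spacings are uncorrelated.
print(round(np.corrcoef(gaps[:, 0], gaps[:, 1])[0, 1], 3))
```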


I think I came up with an approach that might work. We've generated $n$ exponentials with rate $\lambda$. Let's start with $X_{(1)}$. It's the minimum of $n$ exponentials. We know this is just another exponential with rate $n \lambda$.

$$X_{(1)} \sim \frac{Z_1}{n \lambda}$$

Now, we're at $X_{(1)}$, and there are $n-1$ exponentials to go. By the memoryless property, the additional time it'll take for the next order statistic to arrive from here is simply the minimum of $n-1$ fresh exponentials. So,

$$X_{(2)}\sim \frac{Z_1}{n \lambda} + \frac{Z_2}{(n-1) \lambda}$$

where $Z_2$ is a fresh standard exponential, independent of $Z_1$.

And we can take this all the way to $X_{(i)}$, which gives equation (1).
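(A quick Python sketch of this construction, with arbitrary parameter choices, builds the order statistics as cumulative sums of the successive minima and compares them against directly sorted samples.)

```python
import numpy as np

rng = np.random.default_rng(2)
n, lam, reps = 8, 1.0, 100_000

# Build X_(1), ..., X_(n) as cumulative sums of gaps Z / (lam * (n - k)),
# where the k-th gap (k = 0, ..., n-1) is the minimum of n - k fresh exponentials.
gaps = rng.exponential(1.0, size=(reps, n)) / (lam * (n - np.arange(n)))
recursive = np.cumsum(gaps, axis=1)

# Compare with order statistics obtained by direct sorting.
direct = np.sort(rng.exponential(1 / lam, size=(reps, n)), axis=1)

print(np.round(recursive.mean(axis=0), 3))
print(np.round(direct.mean(axis=0), 3))
```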

    Oh - didn't notice this. I'll leave my answer up just because it has slightly more detail. BTW I think this argument is from an old paper of Renyi, which I recall being quite nice, you might want to look for it. – stochasticboy321 Dec 30 '21 at 04:41