
Suppose we have a random sample $X_1,X_2, \ldots, X_n \stackrel{\text{iid}}\sim f(x\mid\theta)=e^{-(x-\theta)}I(x >\theta)$. We want to prove $$2\sum X_i-2n X_{(1)} \sim \chi^2_{n-2}$$ where $X_{(1)}$ is the smallest order statistic.

I tried: $$2\sum X_i-2n X_{(1)} =2\left[\sum X_i-n X_{(1)}\right]=2\left[\sum X_{(i)}-n X_{(1)}\right]=2\left[\sum \left(X_{(i)}- X_{(1)}\right)\right]$$ And I was trying to find the distribution of $$X_{(i)}- X_{(1)}$$

And I found that $$X_{(i)}-X_{(i-1)} \sim \operatorname{Exp}\left(\frac{1}{n+1-i}\right) \text{ if } X_i \stackrel{\text{iid}}\sim \operatorname{Exp}(1).$$ Any ideas? Thank you~
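A quick numerical sanity check of that spacing fact (a minimal simulation sketch in R; the values of m, n, i are arbitrary illustrative choices):

m = 10^5;  n = 10;  i = 4                            # arbitrary illustrative values
ORD = t(apply(matrix(rexp(m*n), nrow=m), 1, sort))   # order statistics of m Exp(1) samples of size n
sp = ORD[, i] - ORD[, i-1]                           # the i-th spacing X_(i) - X_(i-1)
c(mean(sp), 1/(n+1-i))                               # sample mean vs. the claimed mean 1/(n+1-i)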

  • It should be $\chi^2_{2n-2},$ right? – spaceisdarkgreen Nov 29 '17 at 03:42
  • @spaceisdarkgreen I am not quite sure, it's just something my professor mentioned in class and he wrote this. – Matata Nov 29 '17 at 03:44
  • I know the following: $\chi ^{2}(2)$ is the same as an exponential distribution with rate $1/2$. It is also easy to derive that $X_{(1)}$ has an exponential distribution with rate $n$. However, I can't say that a sum of $\chi^{2}$-distributed variables has a $\chi ^{2}$ distribution. – kolobokish Nov 29 '17 at 03:48
  • First, by translation invariance of the quantity you're calculating, WLOG you can set $\theta =0.$ You're told the minimum $m = X_{(1)}$... so you know that you have $n-1$ others that are larger than this minimum. By the memoryless property, conditional on this information the others are exponential with location $m,$ so their differences from $X_{(1)}$ are standard exponential. So you have the sum of $n-1$ independent standard exponentials ($\Gamma(n-1,1)$). By the relationship between the Gamma and chi-squared distributions, two times this is $\chi^2_{2n-2}$ (spelled out just after these comments). (I comment rather than answer since this is far from rigorous.) – spaceisdarkgreen Nov 29 '17 at 03:49
  • @spaceisdarkgreen Thank you for your answer. I was trying to use the fact that a sum of exponentials is Gamma at first, but I have a little doubt: are they independent? – Matata Nov 29 '17 at 03:53
  • @Nan That's the not-so-tight part of the argument, but I think so. The key to tightening it up, I think, is doing a law of total probability decomposition over events of the form "$X_i$ is the minimum and $X_i =m.$" I think it's doable to show $(X_j-X_i)$ for $j\ne i$ are conditionally independent standard exponentials on this event. – spaceisdarkgreen Nov 29 '17 at 04:11
  • (Hopefully it's clear that I'm not saying that $X_{(j)} -X_{(1)}$ are independent in any sense... that's certainly not true.) – spaceisdarkgreen Nov 29 '17 at 04:22
  • @spaceisdarkgreen, I simulated the case $n = 10,\ \theta = 0$ to see if this seems to work at all and to confirm your correction of the degrees of freedom. $\mathsf{Chisq}(n-2)$ doesn't fit the histogram of the simulated values, but $\mathsf{Chisq}(2n-2)$ does. Of course, this doesn't prove anything, but (to me anyhow) it offers hope your argument might be made rigorous. – BruceET Nov 29 '17 at 05:39
  • Definitely it should be $\chi^2_{2(n-1)},$ not $\chi^2_{n-2}. \qquad$ – Michael Hardy Nov 29 '17 at 06:23
  • @spaceisdarkgreen Should that be $\Gamma(n-1,2)$ to make a $\chi^2_{2n-2}$? – Matata Nov 29 '17 at 15:09
  • Yes, and two times a $\Gamma(n-1,1)$ is a $\Gamma(n-1,2).$ (It's confusing because there are two common parametrizations. Here the second parameter is scale ("$\theta$"), not rate ("$\lambda$").) – spaceisdarkgreen Nov 29 '17 at 16:34
  • My post here should be relevant. – StubbornAtom Jun 28 '18 at 09:48
  • I would try to show that $$2\left(X_i - X_{(1)}\right) \sim \begin{cases} \delta_0 & \text{if } X_i = X_{(1)}, \\ \chi^2_2 & \text{otherwise}. \end{cases}$$ Then I would wonder how to show that $X_i-X_{(1)},\ X_j-X_{(1)}$ are independent. Then use standard properties of the chi-square distribution. $\qquad$ – Michael Hardy Sep 06 '18 at 16:32
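The Gamma-to-chi-square step used in these comments can be spelled out (a brief sketch; $\Gamma(\alpha,\theta)$ here denotes the shape–scale parametrization mentioned in the comment on parametrizations above):

$$W\sim\Gamma(n-1,1)\ \Longrightarrow\ 2W\sim\Gamma(n-1,2),\qquad \Gamma\!\left(\tfrac{k}{2},\,2\right)=\chi^2_{k}\ \Longrightarrow\ 2W\sim\chi^{2}_{2(n-1)}.$$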

3 Answers


Let $X_1,X_2,\ldots,X_n$ be a random sample from the exponential distribution with mean $1$ (taking $\theta=0$ WLOG, since the statistic $2\sum X_i-2nX_{(1)}$ is translation invariant). Then the joint probability density of the order statistics $X_{(1)},X_{(2)},\ldots,X_{(n)}$ is $$f_{X_{(1)},X_{(2)},\ldots,X_{(n)}}(x_1,x_2,\ldots,x_n)= n!\, e^{-\sum_{i=1}^{n}x_i}, \quad 0\leq x_1\leq x_2\leq \cdots \leq x_n < \infty.$$ Let us consider the transformation

$$Y_1=nX_{(1)}, Y_2=(n-1)(X_{(2)}-X_{(1)}), Y_3=(n-2)(X_{(3)}-X_{(2)}),\ldots,Y_n= X_{(n)}-X_{(n-1)}$$

$$\Rightarrow X_{(1)}=\frac{Y_1}{n}, X_{(2)}=\frac{Y_1}{n}+\frac{Y_2}{n-1},\ldots, X_{(n)}=\frac{Y_1}{n}+\frac{Y_2}{n-1}+\frac{Y_3}{n-2}+\cdots+Y_n$$

The Jacobian $\left|\frac{\partial(x_1,\ldots,x_n)}{\partial(y_1,\ldots,y_n)}\right|$ of the above transformation is $\frac{1}{n!}$.
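To see this (a brief check using the inverse map displayed above): the matrix of partial derivatives is lower triangular, so its determinant is the product of the diagonal entries,

$$\frac{\partial x_{(i)}}{\partial y_k}=\frac{1}{n-k+1}\ \ (k\le i),\qquad \left|\det\frac{\partial\left(x_{(1)},\ldots,x_{(n)}\right)}{\partial\left(y_1,\ldots,y_n\right)}\right|=\prod_{i=1}^{n}\frac{1}{n-i+1}=\frac{1}{n!}.$$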

So the joint probability density function of $Y_1,Y_2,\ldots,Y_n$ is given by

$f_{Y_1,Y_2,\ldots,Y_n}(y_1,y_2,\ldots,y_n)= e^{-\sum_{i=1}^n y_i}; \quad 0\leq y_1,y_2,\ldots,y_n < \infty $.

It follows, by factorization of this joint density, that $Y_1,Y_2,Y_3,\ldots,Y_n$ are independently and identically distributed as exponential variates with mean $1.$

$\Rightarrow Y_i=(n-i+1)(X_{(i)}-X_{(i-1)}) \stackrel{\text{iid}}{\sim} \operatorname{Exp}(1)$; $i=2,3,\ldots,n$.

Hence $\sum_{i=2}^{n} Y_i= \sum_{i=1}^n(X_i-X_{(1)})$ is a sum of $(n-1)$ independent $\operatorname{Exp}(1)$ variates, so $\sum_{i=1}^n(X_i-X_{(1)})\sim \operatorname{Gamma}(n-1,1)$, and therefore $$2\sum X_i-2nX_{(1)}=2\sum_{i=1}^n\left(X_i-X_{(1)}\right)\sim \chi^2_{2(n-1)}.$$
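The identity $\sum_{i=2}^{n}Y_i=\sum_{i=1}^{n}\left(X_i-X_{(1)}\right)$ used above can be verified by telescoping:

$$\sum_{i=1}^{n}Y_i=nX_{(1)}+\sum_{i=2}^{n}(n-i+1)\left(X_{(i)}-X_{(i-1)}\right)=\sum_{i=1}^{n}X_{(i)}=\sum_{i=1}^{n}X_i,$$ so $$\sum_{i=2}^{n}Y_i=\sum_{i=1}^{n}X_i-Y_1=\sum_{i=1}^{n}X_i-nX_{(1)}=\sum_{i=1}^{n}\left(X_i-X_{(1)}\right).$$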

Ref: "Order Statistics & Inference" by Balakrishnan & Cohen. https://www.amazon.com/Order-Statistics-Inference-Estimation-Methods/dp/149330738X

– rahul

Comment: @spaceisdarkgreen, I simulated the case $n = 10,\, \theta = 0$ to see if this seems to work at all and to confirm your correction of the degrees of freedom. The red curve is for $\mathsf{Chisq}(n-2)$ and the green for $\mathsf{Chisq}(2n-2).$ Of course, this doesn't prove anything, but (to me anyhow) it offers hope your argument might be made rigorous.

[Histogram of the simulated values of $2\sum X_i - 2nX_{(1)}$ with the $\mathsf{Chisq}(n-2)$ (red) and $\mathsf{Chisq}(2n-2)$ (green) densities overlaid]

R code in case it is of any use:

m = 10^5;  n = 10
x = rexp(m*n);  MAT = matrix(x, nrow=m)     # m samples of size n from Exp(1), i.e. theta = 0
t = rowSums(MAT);  v = apply(MAT, 1, min)   # per-sample sum and minimum
y = 2*t - 2*n*v                             # the statistic 2*sum(X_i) - 2*n*X_(1)
hist(y, prob=T, br=25, col="skyblue2", ylim=c(0,.12))
curve(dchisq(x, n-2), 0, 50, lwd=2, col="red", add=T)      # Chisq(n-2) density
curve(dchisq(x, 2*n-2), lwd=2, col="darkgreen", add=T)     # Chisq(2n-2) density
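As a further check (a sketch reusing the same simulated y and n as above), one can compare the simulated values with the $\mathsf{Chisq}(2n-2)$ distribution more formally:

ks.test(y, "pchisq", df = 2*n-2)   # compare simulated values with Chisq(2n-2)
c(mean(y), 2*n-2)                  # sample mean vs. theoretical mean 2(n-1)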
– BruceET
  • Let $J$ be the index $j\in\{1,\ldots,n\}$ for which $X_j = X_{(1)}.$ Then $J$ is uniformly distributed in $\{1,\ldots,n\}$. Then we have $$ \sum_{i=1}^n \left(X_{(i)}- X_{(1)} \right) = \sum_{i=1}^n (X_i - X_J) \quad (\text{Note that one of the $n$ terms in this last sum is $0$.}) $$ And $$ \Pr\left( \sum_{i=1}^n (X_i-X_J) \in A \right) = \operatorname E\left[\Pr\left( \sum_{i=1}^n (X_i-X_J) \in A \;\middle|\; J \right)\right] $$ and this last conditional probability does not depend on the value of $J.$ Since it does not depend on $J,$ this expected value is equal to $\qquad$ – Michael Hardy Nov 29 '17 at 06:25
  • its conditional probability given that $J=1,$ i.e. to $$ \Pr\left( \sum_{i=2}^n (X_i - X_1) \in A \mid J=1 \right). $$ By memorylessness of the exponential distribution, the distribution of $X_i-X_1$ given that $X_1<X_i$ is the same exponential distribution as that of $X_i.$ – Michael Hardy Nov 29 '17 at 06:25
  • @michaelhardy, yes this is the part I was stuck on too (see my comments above). Seems "obviously" true since all you know is that they're all greater than some value... knowing one doesn't give info about the others. Yet couldn't see how to make this rigorous. – spaceisdarkgreen Nov 29 '17 at 16:53

First try to show that $$2\left(X_i - X_{(1)}\right) \sim \begin{cases} \delta_0 & \text{if } X_i = X_{(1)}, \\ \chi^2_2 & \text{otherwise}. \end{cases}$$ Then think about how to show that $X_i-X_{(1)},\, X_j-X_{(1)}$ are independent. Then use standard properties of the chi-square distribution.
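A small simulation sketch of that first claim (assuming $\theta=0$ and checking only the fixed index $i=1$; the constants are arbitrary illustrative choices):

m = 10^5;  n = 10
MAT = matrix(rexp(m*n), nrow=m)      # m samples of size n from Exp(1)
d = MAT[, 1] - apply(MAT, 1, min)    # X_1 - X_(1) in each sample
mean(d == 0)                         # point mass at 0, should be about 1/n
w = 2*d[d > 0]                       # 2(X_1 - X_(1)) given X_1 is not the minimum
c(mean(w), var(w))                   # compare with chi^2_2: mean 2, variance 4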