
Suppose I have $n$ sets of empirical data, each with Gaussian noise of unit variance $\sigma^2=1$, and each containing $\nu$ points. I fit some model to each dataset, and find that the sums of the squared errors are distributed according to a $\chi^2$ distribution. Although every dataset has an expected total squared error of $\nu\sigma^2$, by chance some will have a lower total squared error. I would like to know the expected value of the lowest total squared error, but I am unable to work out the integrals for $n>2$.

The $\chi^2$ distribution with $\nu$ degrees of freedom is $$\operatorname{PDF}(x)=\frac{1}{2^{\nu/2}\Gamma(\nu/2)}x^{\nu/2-1}\exp(-x/2)$$ The expected value of the minimum of two samples from a $\chi^2$ distribution is

\begin{align} & \int_0^\infty \int_0^\infty \min(x,y) \operatorname{PDF}(x)\operatorname{PDF}(y) \,dx \,dy \\[10pt] = {} & 2 \int_0^\infty \int_0^y x \operatorname{PDF}(x)\operatorname{PDF}(y) \,dx \,dy=\nu -\frac{2 \Gamma \left(\frac{\nu +1}{2}\right)}{\sqrt{\pi} \Gamma \left(\frac{\nu }{2}\right)} \end{align} I was unable to find a closed form for the integral for three samples.
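The two-observation formula above can be sanity-checked by simulation. The following is only a sketch using the Python standard library; the function names `closed_form_min2` and `simulated_min_mean` are made up for illustration, and it relies on the fact that a $\chi^2_\nu$ variable is a Gamma variable with shape $\nu/2$ and scale $2$.

```python
import math
import random


def closed_form_min2(nu):
    """Expected minimum of two chi-square(nu) draws, from the formula above."""
    return nu - 2.0 * math.gamma((nu + 1) / 2.0) / (
        math.sqrt(math.pi) * math.gamma(nu / 2.0)
    )


def simulated_min_mean(n, nu, trials=200_000, seed=0):
    """Monte Carlo estimate of E[min(X_1, ..., X_n)] for X_i ~ chi-square(nu).

    Each chi-square(nu) draw is generated as Gamma(shape=nu/2, scale=2).
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    total = 0.0
    for _ in range(trials):
        total += min(rng.gammavariate(nu / 2.0, 2.0) for _ in range(n))
    return total / trials


if __name__ == "__main__":
    nu = 3
    print("closed form, n=2:", closed_form_min2(nu))
    print("simulation,  n=2:", simulated_min_mean(2, nu))
    print("simulation,  n=3:", simulated_min_mean(3, nu))
```

For $n=2$ the two numbers should agree to within Monte Carlo error; for $n=3$ the simulation gives only a numerical estimate, not a closed form.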

Is there a closed-form formula for $$\int_0^\infty\cdots\int_0^\infty \min(x_1,\ldots,x_n)\operatorname{PDF}(x_1)\cdots \operatorname{PDF}(x_n) \,dx_1\cdots dx_n\text{?}$$

Wouter
  • You are using the word "sample" incorrectly. Your "two samples" are in fact one sample of size $2$; these are two observations in a sample. $\qquad$ – Michael Hardy Jul 02 '18 at 17:14
  • \begin{align} & \Pr(\min\{X_1,\ldots,X_n\} \le x) = 1-\Pr( \min\{X_1,\ldots,X_n\} > x) \\[8pt] = {} & 1-\Pr(X_1> x\ \&\ \cdots\ \&\ X_n>x) = 1-\Pr(X_1>x)\cdots\Pr(X_n>x) \\[8pt] = {} & 1-\big( \Pr(X_1>x)\big)^n = 1 - \left( \int_x^\infty \operatorname{PDF}(u) \, du \right)^n. \end{align} Differentiating both sides gives the density function for the minimum. $\qquad$ – Michael Hardy Jul 02 '18 at 17:28

1 Answer


I can only offer a partial solution. Since $$P(\min_i x_i \le x)=1-(1-F(x))^n$$ with $F$ the $\chi_\nu^2$ CDF, $\min_i x_i$ has pdf $$n(1-F(x))^{n-1}f(x),$$ where $f=F'$ is the $\chi_\nu^2$ pdf, so the mean you seek is $$\mu:=\int_0^\infty n(1-F(x))^{n-1}x f(x)\,dx.$$ Integrating by parts with $u=x,\,v=-(1-F(x))^n$ gives $$\mu=\int_0^\infty (1-F(x))^n\,dx$$ (the proof that the boundary term vanishes is left as an exercise). But for odd $\nu$ we don't even have an elementary expression for $F$ (it involves the error function), so I doubt what you're asking for is possible.
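Even without a closed form, $\mu=\int_0^\infty (1-F(x))^n\,dx$ is easy to evaluate numerically. Below is a minimal sketch using only the Python standard library; the names `chi2_sf` and `expected_min` are hypothetical helpers, not part of any library. It assumes integer $\nu\ge 1$, builds $1-F$ from the incomplete-gamma recurrence $Q(a+1,z)=Q(a,z)+z^a e^{-z}/\Gamma(a+1)$ starting from $\nu=1$ (an `erfc`) or $\nu=2$ (an exponential), and applies Simpson's rule after the substitution $x=t^2$, which keeps the integrand smooth at the origin for odd $\nu$. The upper integration limit is an ad hoc truncation chosen far out in the tail.

```python
import math


def chi2_sf(x, nu):
    """Survival function 1 - F(x) of chi-square with integer nu >= 1."""
    if x <= 0.0:
        return 1.0
    z = x / 2.0
    if nu % 2 == 0:
        k, sf = 2, math.exp(-z)              # nu = 2: exponential tail
    else:
        k, sf = 1, math.erfc(math.sqrt(z))   # nu = 1: error-function tail
    while k < nu:
        # Q(a+1, z) = Q(a, z) + z^a e^{-z} / Gamma(a+1), with a = k/2
        a = k / 2.0
        sf += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        k += 2
    return sf


def expected_min(n, nu, steps=20000):
    """Approximate mu = int_0^inf (1 - F(x))^n dx with Simpson's rule.

    Uses x = t^2, so the integrand becomes 2 t (1 - F(t^2))^n.
    `steps` must be even; the upper limit is a heuristic cutoff.
    """
    t_max = math.sqrt(nu + 20.0 * math.sqrt(2.0 * nu) + 60.0)
    h = t_max / steps

    def g(t):
        return 2.0 * t * chi2_sf(t * t, nu) ** n

    total = g(0.0) + g(t_max)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, expected_min(n, 3))
```

As a check, `expected_min(1, nu)` should return approximately $\nu$, `expected_min(2, 3)` should agree with $3-4/\pi$ from the two-observation formula in the question, and for $\nu=2$ (an exponential distribution with mean $2$) the result should be close to $2/n$.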

J.G.