I am trying to learn how to derive the formula for the "distribution of the sample variance" (specifically, the variance of the sample variance) from first principles, regardless of the underlying probability distribution of $X$.
Part 1: In general, for some random variable $X$, we can write the variance of $X$ as:
$$Var(X) = E(X^2) - [E(X)]^2$$
where:
$$E(X) = \int x f(x) dx$$
Part 2: In general, we can write the formula for the "sample variance" for any random variable $X$ as:
$$S^2_x = \frac{\sum (X_i - \bar{x})^2}{n-1}$$
This means that we now need to determine:
$$Var(S^2_x) = E[(S^2_x)^2] - [E(S^2_x)]^2$$
Part 3: I started by evaluating $E(S^2_x)$, which is needed for the second term.
First, I tried to re-arrange this expression prior to taking the Expected Value:
$$S^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{1}{n-1} \left( \sum x_i^2 + n\bar{x}^2 - 2\bar{x}\sum x_i \right) = \frac{1}{n-1} \left( \sum x_i^2 + n\bar{x}^2 - 2n\bar{x}^2 \right) = \frac{1}{n-1} \left( \sum x_i^2 - n\bar{x}^2 \right)$$
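As a quick sanity check on the rearrangement above, here is a minimal Python sketch that compares the definition of $S^2$ with the shortcut form $\frac{1}{n-1}\left(\sum x_i^2 - n\bar{x}^2\right)$ on one random sample. The function names and the choice of a normal sample with mean 10 and standard deviation 2 are purely illustrative assumptions, not part of the derivation.

```python
import random
import statistics


def sample_variance_direct(xs):
    """Sample variance from the definition: sum of (x_i - xbar)^2 over (n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


def sample_variance_shortcut(xs):
    """Sample variance from the rearranged form: (sum of x_i^2 - n * xbar^2) / (n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return (sum(x * x for x in xs) - n * xbar ** 2) / (n - 1)


# Illustrative data only: 8 draws from a normal distribution (an assumption).
random.seed(0)
xs = [random.gauss(10.0, 2.0) for _ in range(8)]

direct = sample_variance_direct(xs)
shortcut = sample_variance_shortcut(xs)
reference = statistics.variance(xs)  # stdlib sample variance, divides by n - 1

print(direct, shortcut, reference)
```

The three printed values should agree up to floating-point rounding, which is consistent with the algebra in the line above.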
Then, if we remember the following relationships:
$$\begin{align*} \operatorname{Var}(X_i) &= \mathbb{E}(X_i^2) - (\mathbb{E}(X_i))^2 \\ \sigma^2 &= \mathbb{E}(X_i^2) - \mu^2 \\ \mathbb{E}(X_i^2) &= \sigma^2 + \mu^2 \end{align*}$$
$$\begin{align*} \operatorname{Var}(\bar{x}) &= \mathbb{E}(\bar{x}^2) - (\mathbb{E}(\bar{x}))^2 \\ \frac{\sigma^2}{n} &= \mathbb{E}(\bar{x}^2) - \mu^2 \\ \mathbb{E}(\bar{x}^2) &= \frac{\sigma^2}{n} + \mu^2 \end{align*}$$
We can resume taking the Expected Value of $S^2$:
$$\begin{align*} \mathbb{E}(S^2) &= \frac{1}{n-1} \left( \mathbb{E}\left(\sum X_i^2\right) - n\mathbb{E}(\bar{x}^2) \right) \\ &= \frac{1}{n-1} \left( n(\sigma^2 + \mu^2) - n \left(\frac{\sigma^2}{n} + \mu^2\right) \right) \\ &= \frac{1}{n-1} (\sigma^2(n-1)) \\ &= \sigma^2 \end{align*}$$
Note that this is a well-known fact in Probability Theory: $S^2$ is an unbiased estimator of $\sigma^2$, i.e. $E(S^2) = \sigma^2$ (assuming the $X_i$ are i.i.d. with finite variance).
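To convince myself that $E(S^2) = \sigma^2$ does not depend on normality, here is a rough Monte Carlo sketch using an exponential distribution with rate 1 (so $\sigma^2 = 1$). The distribution, the sample size $n = 5$, the number of repetitions, and the helper names are all assumptions chosen for illustration.

```python
import random


def sample_variance(xs):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


def mean_of_sample_variances(n, reps, draw, seed=0):
    """Average S^2 over `reps` independent samples of size `n`.

    `draw` is a function taking a random.Random instance and returning one draw.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs = [draw(rng) for _ in range(n)]
        total += sample_variance(xs)
    return total / reps


# Exponential(rate=1) has variance 1, so the average of S^2 should be near 1.
est = mean_of_sample_variances(n=5, reps=20000, draw=lambda rng: rng.expovariate(1.0))
print(est)
```

The printed average should land close to 1, up to simulation noise. This only checks the first moment of $S^2$, not its variance.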
Part 4: Now, I need to evaluate $E[(S^2_x)^2]$ - this is where I get stuck. I know I can expand $(S^2_x)^2$ as:
$$\begin{align*} (S^2_x)^2 &= \left(\frac{\sum (X_i - \bar{x})^2}{n-1}\right)^2 \\ &= \frac{1}{(n-1)^2} \left( \sum X_i^2 + n\bar{x}^2 - 2\bar{x}\sum X_i \right)^2\end{align*}$$
However, I am not sure how to take the Expected Values of the terms in the above expression.
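To have a numerical target for whatever the derivation produces, here is a simulation sketch that estimates $E(S^2)$, $E[(S^2)^2]$ and $Var(S^2)$ for i.i.d. Uniform(0, 1) samples. For comparison it also evaluates the commonly quoted closed form $\frac{1}{n}\left(\mu_4 - \frac{n-3}{n-1}\sigma^4\right)$, where $\mu_4$ is the fourth central moment. I am assuming that formula applies here (i.i.d. draws, finite fourth moment); the uniform distribution, $n = 5$, and the function names are illustrative choices only.

```python
import random


def sample_variance(xs):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


def simulate_s2_moments(n, reps, seed=0):
    """Monte Carlo estimates of E(S^2), E[(S^2)^2] and Var(S^2) for Uniform(0, 1)."""
    rng = random.Random(seed)
    s2_values = [sample_variance([rng.random() for _ in range(n)]) for _ in range(reps)]
    mean_s2 = sum(s2_values) / reps
    mean_s2_sq = sum(v * v for v in s2_values) / reps
    # Var(S^2) = E[(S^2)^2] - [E(S^2)]^2, estimated from the simulated values.
    return mean_s2, mean_s2_sq, mean_s2_sq - mean_s2 ** 2


n = 5
mean_s2, mean_s2_sq, var_s2 = simulate_s2_moments(n, reps=50000)

# Known moments of Uniform(0, 1): variance 1/12, fourth central moment 1/80.
sigma2 = 1.0 / 12.0
mu4 = 1.0 / 80.0
# Closed form used for comparison (assumed applicable: i.i.d., finite fourth moment).
closed_form = (mu4 - (n - 3) / (n - 1) * sigma2 ** 2) / n

print(mean_s2, mean_s2_sq, var_s2, closed_form)
```

If the simulated `var_s2` and `closed_form` agree to within simulation noise, that suggests the final expression from the expansion above should depend on the distribution only through $\sigma^2$ and $\mu_4$.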
Can someone please help me continue this derivation?
Thanks!
References:
- I found the question "Variance of sample variance?", but the answers there do not show how to proceed from the approach I am currently using.