
I was reading this question (Variance of sample variance?) and came across this answer (https://math.stackexchange.com/a/72978/791334) on how to calculate the Variance of the Sample Variance:

Suppose the samples are taken from a normal distribution. Then using the fact that $\frac{(n-1)S^2}{\sigma^2}$ is a chi-squared random variable with $(n-1)$ degrees of freedom, we get $$\begin{align*} \text{Var}~\frac{(n-1)S^2}{\sigma^2} & = \text{Var}~\chi^{2}_{n-1} \\ \frac{(n-1)^2}{\sigma^4}\text{Var}~S^2 & = 2(n-1) \\ \text{Var}~S^2 & = \frac{2(n-1)\sigma^4}{(n-1)^2}\\ & = \frac{2\sigma^4}{(n-1)}, \end{align*}$$

where we have used the fact that $\text{Var}~\chi^{2}_{n-1}=2(n-1)$.

My Question: The formula above, $\frac{2\sigma^4}{(n-1)}$, is written in terms of the population $\sigma$. However, in a real-world case, we only have access to the sample $s$.

Thus, I was wondering if it is possible to write the above formula while replacing $\sigma$ with $\hat{s}$ :

$$\begin{align*} \text{Var}~S^2 & = \frac{2(n-1)\hat{s}^4}{(n-1)^2}\\ & = \frac{2\hat{s}^4}{(n-1)}, \end{align*}$$

where $\hat{s}^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2$.

Is this correct?

Thanks!

stats_noob

1 Answer


Context here is key. In a general setting, $\text{Var}~S^2$ is a theoretical quantity that follows from the chi-squared distribution of $\frac{(n-1)S^2}{\sigma^2}$, whereas $\hat s$ is a statistic computed from real-world data, as you mention, and so is subject to the usual sampling error (which is to be expected for a random process).

Of course, if you're trying to estimate this value, then the replacement may be more or less useful depending on your needs and on how you'll be using the data. It may well be sufficient, but the resulting estimate will not in general equal the true value of $\text{Var}~S^2$.
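
To make the size of the discrepancy concrete, here is a short sketch that assumes the samples are normal (the same assumption as the derivation quoted in the question). It uses only $\mathbb{E}[S^2]=\sigma^2$ and $\text{Var}~S^2=\frac{2\sigma^4}{(n-1)}$:

$$\begin{align*} \mathbb{E}\left[S^4\right] & = \text{Var}~S^2 + \left(\mathbb{E}\left[S^2\right]\right)^2 = \frac{2\sigma^4}{(n-1)} + \sigma^4 = \frac{(n+1)\sigma^4}{(n-1)} \\ \mathbb{E}\left[\frac{2S^4}{(n-1)}\right] & = \frac{(n+1)}{(n-1)}\cdot\frac{2\sigma^4}{(n-1)} \end{align*}$$

So, under normality, the plug-in quantity $\frac{2\hat{s}^4}{(n-1)}$ overestimates $\text{Var}~S^2$ on average by the factor $\frac{(n+1)}{(n-1)}$, and $\frac{2\hat{s}^4}{(n+1)}$ would be unbiased. For $n=100$ the factor is $\frac{101}{99}\approx 1.02$, so the difference is small. Without the normality assumption, $\text{Var}~S^2$ also depends on the fourth central moment, and neither expression need apply.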

George
  • @George: Thank you for your answer! I was just wondering if it is "mathematically valid" to replace $\sigma$ with $\hat{s}$? Or if any other modifications had to be made (e.g. correction factor) prior to this? – stats_noob Jun 03 '23 at 20:53
  • Ah - do you mean correction factors specifically, like a potential $\dfrac{n}{n-1}$ or similar, to get better convergence? I think any correction factors will become negligible as the number of samples increases, but again I think more context is needed to know precisely. – George Jun 03 '23 at 21:08
  • @George: Thank you for your reply! Suppose I observe the heights of 100 students out of a population of 1000 students. I know how to estimate the "sample mean", the "sample variance" and the "variance of the sample mean", but I don't know how to estimate the "variance of the sample variance". I am interested in learning how to estimate the "variance of the sample variance". – stats_noob Jun 03 '23 at 21:18
  • Does your answer correspond to this information I just provided? – stats_noob Jun 03 '23 at 21:19
  • Hi, @stats_noob. I believe George is correct. The derivations assume knowledge of the true variance and thus allow for the various algebraic expansions and expectations. In your case, the true variance is unknown. You have an estimate of the true variance: the sample variance. So your best estimate of the variance of the sample variance uses your estimate of the true population variance in place of the true value. Empirical statistics is pretty much all estimates. Unless you observe EVERYTHING (like a six-sided die) you can never capture the ultimate truth. – Avraham Jun 04 '23 at 21:36
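
The plug-in idea discussed in these comments can be sketched in Python. This is only an illustration under stated assumptions: the data are treated as i.i.d. normal draws, so the finite-population aspect of sampling 100 students out of 1000 without replacement is ignored, and the function names and simulation parameters (mean 170, standard deviation 10) are hypothetical and not taken from the thread. The simulation compares the plug-in value with the theoretical $\frac{2\sigma^4}{(n-1)}$ in a setting where $\sigma$ is known.

```python
import numpy as np


def plug_in_var_of_sample_variance(x):
    """Plug-in estimate 2 * s^4 / (n - 1) of Var(S^2).

    Assumes the observations are i.i.d. normal; s^2 is the usual
    sample variance with the (n - 1) denominator.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    s2 = x.var(ddof=1)
    return 2.0 * s2**2 / (n - 1)


def simulate_check(n=100, sigma=10.0, reps=20000, seed=0):
    """Compare theory, simulation, and the plug-in estimate.

    All parameters are illustrative assumptions, not data from the thread.
    Returns (empirical Var(S^2), theoretical Var(S^2), mean plug-in estimate).
    """
    rng = np.random.default_rng(seed)
    # Each row is one simulated sample of size n.
    samples = rng.normal(loc=170.0, scale=sigma, size=(reps, n))
    s2 = samples.var(axis=1, ddof=1)

    empirical = s2.var(ddof=1)                    # spread of S^2 across samples
    theoretical = 2.0 * sigma**4 / (n - 1)        # formula from the question
    plug_in_mean = np.mean(2.0 * s2**2 / (n - 1)) # average plug-in estimate
    return empirical, theoretical, plug_in_mean


if __name__ == "__main__":
    empirical, theoretical, plug_in_mean = simulate_check()
    print("empirical Var(S^2):   ", empirical)
    print("theoretical Var(S^2): ", theoretical)
    print("mean plug-in estimate:", plug_in_mean)
```

In practice only `plug_in_var_of_sample_variance` would be applied to the observed heights; the simulation exists solely to check how close the plug-in value tends to be when the true $\sigma$ is known. If normality is doubtful, a resampling approach such as the bootstrap is a common alternative, though that goes beyond what the thread discusses.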