
Suppose there are two unbiased estimators that we can use to estimate a parameter $\theta$. Why do we often prefer the one with smaller asymptotic variance?

The question is rather simple and perhaps obvious, but I cannot seem to convince myself fully. One thing I thought about is this: say $p(X)$ is the estimator with smaller variance; then across different data sets, $p(X)$ will stay relatively 'stable' in comparison to the other estimator. So if we use it to construct a confidence interval, the interval will be relatively short, and so it gives a better idea of where the parameter $\theta$ lies?
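A rough numerical sketch of this interval-length intuition (the standard errors below are hypothetical, purely for illustration): with a known standard error, a 95% interval around an unbiased estimator $T$ has length $2 \times 1.96 \times \mathrm{sd}(T)$, so the lower-variance estimator yields the shorter interval.

```r
# Made-up standard errors for two unbiased estimators of the same parameter:
# a 95% interval has length 2 * 1.96 * sd(T), so smaller sd(T) means a shorter interval.
se1 = 0.10       # hypothetical sd of estimator 1
se2 = 0.15       # hypothetical sd of estimator 2
2 * 1.96 * se1   # 0.392 -- the shorter interval
2 * 1.96 * se2   # 0.588
```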

Are there better explanations to this? Many thanks in advance!

StubbornAtom
  • Variance is a measure that expresses how much samples deviate from the expected value. If your estimator is unbiased then the expected value is equal to the true value that you want to estimate, so in that case variance tells you how much you deviate from the true value. – Jens Renders Jun 09 '20 at 21:57

1 Answer


Here is a practical example. Two unbiased estimators for the mean $\mu$ of a normal population are the sample mean $A$ and the sample median $H.$ (See here for unbiasedness of the sample median of normal data.) That is, $E(A) = E(H) = \mu.$

However, for any one particular sample size $n \ge 2$ one has $Var(A) < Var(H),$ so the sample mean is the preferable estimator.

In particular, if we are trying to estimate $\mu$ with $n = 10$ observations from a normal population with $\sigma=1,$ then it is easy to see that $Var(A_{10}) = 0.1.$ By simulation (and other methods) one can find that $Var(H_{10}) \approx 0.138.$

Therefore, if we were to insist on using the median rather than the mean we would have to use more than ten observations to get the same degree of precision of estimation we could get from the mean.
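This precision gap can be quantified. For normal data the asymptotic variance of the sample median is $\pi\sigma^2/(2n)$, so the median's efficiency relative to the mean is $2/\pi \approx 0.64$, and matching $Var(A_{10}) = 0.1$ requires roughly $10\,\pi/2 \approx 16$ observations. A quick simulation sketch (the asymptotic formula is only approximate at small $n$):

```r
# Asymptotically Var(median) ~ pi*sigma^2/(2n), so to match Var(A_10) = 0.1
# we need about n = 10*pi/2, i.e. roughly 16 observations.
set.seed(2020)
v16 = var(replicate(10^5, median(rnorm(16))))
v16   # close to 0.1; the finite-sample value differs a bit from the asymptotic one
```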

set.seed(2020)
h = replicate(10^6, median(rnorm(10)))
mean(h);  var(h)
[1] 0.000159509      # aprx E(H) = 0
[1] 0.1384345        # aprx Var(H) > 0.1

Here is a histogram of sample medians of a million samples of size $n=10.$ The solid red curve shows the density function of the normal distribution of means of samples of size $n=10,$ which is $\mathsf{Norm}(\mu = 0, \sigma = 1/\sqrt{10}).$ [There is also a Central Limit Theorem for sample medians that ensures the histogram is very nearly normal, but with a larger variance.]

hist(h, prob=T, br=50, col="skyblue2", 
     main="n=10: Histogram of Sample Medians")
curve(dnorm(x, 0, 1/sqrt(10)), add=T, col="red", lwd=2)

[Figure: histogram of the $10^6$ sample medians with the $\mathsf{Norm}(0, 1/\sqrt{10})$ density curve overlaid.]

BruceET