
The biased maximum likelihood estimator of the variance of a Normal distribution is:

$\hat{\sigma}^2_{MLE} = \frac{1}{N}\sum_{i=1}^{N}\left({x}_{i} - \hat{\mu }\right)^{2}$

And the unbiased estimator is:

$\hat{\sigma}^2_{unbiased} = \frac{1}{N-1}\sum_{i=1}^{N}\left({x}_{i} - \hat{\mu }\right)^{2}$

So why does the former divide by $N$ while the latter divides by $N-1$?
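For concreteness, the two estimators differ only in the divisor. A minimal NumPy sketch (not part of the original question; the sample values are arbitrary) computing both, and checking them against NumPy's `ddof` argument:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=10)  # a small sample from N(5, 2^2)
mu_hat = x.mean()

# Biased MLE: divide the sum of squared deviations by N.
sigma2_mle = np.sum((x - mu_hat) ** 2) / len(x)
# Unbiased estimator: divide by N - 1.
sigma2_unbiased = np.sum((x - mu_hat) ** 2) / (len(x) - 1)

# NumPy exposes both choices through the ddof ("delta degrees of freedom") argument.
assert np.isclose(sigma2_mle, np.var(x, ddof=0))
assert np.isclose(sigma2_unbiased, np.var(x, ddof=1))
```

On any given sample the MLE is always the smaller of the two, since it divides the same sum by a larger number.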

Wilbeibi

1 Answer


If the elements of the sample are statistically independent, then ($\mu$ denotes the population mean and $\sigma^2$ the population variance):

\begin{align}
\mathbb{E}[\widehat{\sigma}_{\text{unbiased}}^2]
& = \mathbb E\left[ \frac 1{N-1} \sum_{i=1}^N \left(x_i - \frac 1N \sum_{j=1}^N x_j \right)^2 \right] \\
& = \frac 1{N-1} \sum_{i=1}^N \mathbb E\left[ x_i^2 - \frac 2N x_i \sum_{j=1}^N x_j + \frac{1}{N^2} \sum_{j=1}^N x_j \sum_{k=1}^N x_k \right] \\
& = \frac 1{N-1} \sum_{i=1}^N \left[ \frac{N-2}{N} \mathbb E[x_i^2] - \frac 2N \sum_{j \neq i} \mathbb E[x_i x_j] + \frac{1}{N^2} \sum_{j=1}^N \sum_{k \neq j} \mathbb E[x_j x_k] + \frac{1}{N^2} \sum_{j=1}^N \mathbb E[x_j^2] \right] \\
& = \frac 1{N-1} \sum_{i=1}^N \left[ \frac{N-2}{N} (\sigma^2+\mu^2) - \frac 2N (N-1) \mu^2 + \frac{1}{N^2} N (N-1) \mu^2 + \frac {1}{N} (\sigma^2+\mu^2) \right] \\
& = \sigma^2.
\end{align}

Correspondingly, \begin{align} \mathbb{E}[\widehat{\sigma}_{\text{MLE}}^2]=\mathbb{E}\left[\frac{N-1}{N}\widehat{\sigma}_{\text{unbiased}}^2\right]=\frac{N-1}{N}\sigma^2<\sigma^2. \end{align} Therefore, the maximum likelihood estimator of the variance is biased downward. Source and more info: Wikipedia.
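The identity $\mathbb{E}[\widehat{\sigma}_{\text{MLE}}^2] = \frac{N-1}{N}\sigma^2$ is easy to verify numerically. A quick Monte Carlo sanity check (my own sketch, not part of the original answer; the sample size and trial count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
N, trials, sigma2 = 5, 200_000, 4.0

# Draw many independent samples of size N from N(0, sigma2).
samples = rng.normal(0.0, np.sqrt(sigma2), size=(trials, N))

mle = samples.var(axis=1, ddof=0)       # divide by N
unbiased = samples.var(axis=1, ddof=1)  # divide by N - 1

print(mle.mean())       # ≈ (N-1)/N * sigma2 = 3.2
print(unbiased.mean())  # ≈ sigma2 = 4.0
```

With $N = 5$, the MLE's average comes out near $\frac{4}{5}\sigma^2$, matching the downward bias factor $\frac{N-1}{N}$ derived above, while the unbiased estimator's average stays near $\sigma^2$.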

triple_sec
  • FYI Your answer is being used as reference in a Coursera course https://www.coursera.org/learn/competitive-data-science Cheers :) – Hack-R Jun 03 '18 at 01:26
  • @Hack-R Thanks for the heads-up, good to know. I took the opportunity to fix a small typo. :-) – triple_sec Jun 04 '18 at 04:30