A "why" question is somewhat ill-defined, but I will try to answer anyway.
Let's assume we have a random variable taking values in $\mathbb{R}$. The basic idea is that what we call the variance should somehow measure how much our random variable varies. One way to do this is to fix some point in $\mathbb{R}$, call it $m$, and see what our expected difference is from this value,
$$\mathbb{E}(x-m)$$
but this is not a good idea, because deviations below $m$ and above $m$ enter with opposite signs and cancel, so this expectation can be small even when the variable varies a lot. So, we could try to square things,
$$\mathbb{E}((x-m)^2)$$
in which case this quantity is minimized by taking $m$ to be the mean of $x$: writing $\mu = \mathbb{E}(x)$, the cross term in the expansion vanishes and
$$\mathbb{E}((x-m)^2) = \mathbb{E}((x-\mu)^2) + (\mu - m)^2,$$
which is smallest exactly when $m = \mu$. You might wonder why not take
$$\mathbb{E}(|x-m|)$$
and it's possible and reasonable to do so. In this case the constant $m$ which minimizes the quantity above is the median, no longer the mean. If we already know we are interested in the mean, for whatever reason, then it makes the most sense to define the variance in terms of the squared error. This seems to be what the explanation you saw is suggesting: it assumes we are already interested in the mean, and so justifies the choice of squared error as the measure of variation that the mean minimizes.
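If it helps, here is a quick numerical sanity check of both claims (the sample and the grid of candidate $m$ values are illustrative choices, not anything from the question): for a skewed sample, scan candidate values of $m$ and see which one minimizes the average squared error and which one minimizes the average absolute error.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=2_000)   # skewed sample, so mean != median

m_grid = np.linspace(0.0, 5.0, 5_001)        # candidate values of m
mse = np.array([np.mean((x - m) ** 2) for m in m_grid])   # E[(x-m)^2]
mae = np.array([np.mean(np.abs(x - m)) for m in m_grid])  # E[|x-m|]

m_sq = m_grid[mse.argmin()]   # minimizer of the squared-error criterion
m_abs = m_grid[mae.argmin()]  # minimizer of the absolute-error criterion

print(m_sq, x.mean())         # m_sq matches the sample mean (up to grid resolution)
print(m_abs, np.median(x))    # m_abs matches the sample median (up to grid resolution)
```

Up to the grid spacing, the squared-error minimizer lands on the sample mean and the absolute-error minimizer lands on the sample median, matching the two facts above.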