
In the case of the normal distribution, I know that the standard deviation tells me that about 68% of my sample lies in the interval $[\mu - \sigma, \mu + \sigma]$.

Suppose I have another sample which is not normally distributed. Is there any valuable information I get from knowing the standard deviation of that sample?

2 Answers


Rigorously, the answer is "not really". One has Chebyshev's inequality, which says $P(|X-\mu| \geq k\sigma) \leq \frac{1}{k^2}$, but this bound is very weak for many common distributions, including the normal distribution, especially for large $k$. It is, however, attained for each fixed $k$ by a particular discrete distribution, so it is the best you can do in full generality.
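To make the gap concrete, here is a minimal Python sketch comparing Chebyshev's bound with the exact normal tail, along with one distribution for which the bound holds with equality. The function names are mine, and the three-point distribution on $\{-k, 0, k\}$ with $P(X=\pm k)=\frac{1}{2k^2}$ is the standard textbook example of equality, which I am assuming is the family meant above.

```python
import math
from fractions import Fraction


def chebyshev_bound(k):
    """Chebyshev's upper bound on P(|X - mu| >= k*sigma), capped at 1."""
    return min(1.0, 1.0 / k**2)


def normal_two_sided_tail(k):
    """Exact P(|X - mu| >= k*sigma) when X is normally distributed."""
    return math.erfc(k / math.sqrt(2))


def three_point_tail(k):
    """P(|X - mu| >= k*sigma) for X on {-k, 0, k} with P(X = +-k) = 1/(2k^2).

    Exact rational arithmetic is used so the boundary case |x - mu| == k*sigma
    is not lost to floating-point rounding.
    """
    k = Fraction(k)
    p_edge = 1 / (2 * k**2)
    dist = {-k: p_edge, Fraction(0): 1 - 2 * p_edge, k: p_edge}
    mean = sum(x * p for x, p in dist.items())
    variance = sum((x - mean) ** 2 * p for x, p in dist.items())
    # Compare squares to avoid taking a square root of the variance.
    return sum(p for x, p in dist.items() if (x - mean) ** 2 >= k**2 * variance)


for k in (1, 2, 3):
    print(k, chebyshev_bound(k), normal_two_sided_tail(k), three_point_tail(k))
```

For the normal distribution the true tail probability shrinks much faster than $\frac{1}{k^2}$, while the three-point distribution has tail probability exactly $\frac{1}{k^2}$, so no bound that uses only $\mu$ and $\sigma$ can be better.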

Knowing more about the distribution than just "its standard deviation exists" lets you do more than this, but what you can do then depends on the domain of application.

Ian
  • 101,645
  • Also, Chebyshev's inequality does little to justify the definition of standard deviation, since you would get similar inequalities for other possible definitions like, say, $e^{|X-\mu|^3}$. – Jack M Mar 15 '19 at 21:44

The short answer is that the answers to many questions make use of $\sigma$, even though $\sigma$ is not enough on its own because other things also matter.

Suppose $X$ has finite mean $\mu$ and finite standard deviation $\sigma>0$, so its $Z$-score $Z:=\frac{X-\mu}{\sigma}$ has mean $0$ and variance $1$. Sure, we can't technically know anything more about $Z$'s distribution in general (save for results such as the aforementioned Chebyshev's inequality), which is somewhat disappointing. But $Z$ is still of interest because quantities such as $\Bbb E Z^3,\,\Bbb E Z^4$, respectively called skewness and kurtosis, unlock further secrets about $X$'s distribution. We can't get such details from $\sigma$, but we need it to even talk about $Z$.
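As a rough illustration, the sample analogue of these quantities can be computed as below. This is only a sketch: the helper names and the two small data sets are made up for the example, and I use the population (divide by $n$) convention for $\sigma$.

```python
import math


def z_scores(data):
    """Standardize data using its own mean and population standard deviation."""
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / n)
    if sigma == 0:
        raise ValueError("z-scores are undefined when sigma is 0")
    return [(x - mu) / sigma for x in data]


def standardized_moment(data, p):
    """Sample version of E[Z^p]: p = 3 gives skewness, p = 4 gives kurtosis."""
    z = z_scores(data)
    return sum(v ** p for v in z) / len(z)


# Hypothetical data sets, chosen only to show the contrast.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
right_skewed = [0.0, 0.0, 0.0, 1.0, 9.0]

for name, data in (("symmetric", symmetric), ("right_skewed", right_skewed)):
    print(name, standardized_moment(data, 3), standardized_moment(data, 4))
```

By construction the first two standardized moments are always $0$ and $1$, so they carry no information; the third and fourth are where the shape of the distribution starts to show, and neither can be defined without dividing by $\sigma$.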

Further, as I explained here, variances and more generally covariances form an inner product on variables (give or take some footnotes and asterisks I run through properly there). This observation translates correlation into a geometric intuition familiar to anyone who's ever expressed the cosine of an angle in terms of dot products. Now, correlation doesn't prove causation, but it's an important quantity to consider when trying to learn what might be going on between variables.
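For finite samples the geometric picture can be sketched directly: center each data vector, and the covariance is a scaled dot product while the correlation is the cosine of the angle between the centered vectors. The function names and data below are illustrative assumptions, not taken from the linked explanation.

```python
import math


def center(xs):
    """Subtract the mean so the vector represents deviations from it."""
    m = sum(xs) / len(xs)
    return [x - m for x in xs]


def dot(u, v):
    """Ordinary Euclidean dot product."""
    return sum(a * b for a, b in zip(u, v))


def covariance(xs, ys):
    """Population covariance: dot product of the centered vectors over n."""
    return dot(center(xs), center(ys)) / len(xs)


def correlation(xs, ys):
    """Cosine of the angle between the centered vectors."""
    u, v = center(xs), center(ys)
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_linear = [2 * a + 1 for a in x]      # same direction as x after centering
y_square = [(a - 3.0) ** 2 for a in x]  # orthogonal to x after centering

print(correlation(x, y_linear), correlation(x, y_square))
```

The second pair is worth a look: `y_square` is a deterministic function of `x`, yet the centered vectors are orthogonal, so the correlation is zero. Zero correlation means "perpendicular" in this geometry, not "unrelated".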

Finally, let's not forget that knowing the means and variances of some variables can, in the right circumstances, get us something for which a Normal approximation is valid and useful, such as in the central limit theorem for sample means, or the delta method's analysis of estimators.
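A small simulation shows the central limit theorem case. This is a sketch under assumptions of my own choosing: the underlying variable is Exponential(1), which has $\mu=\sigma=1$ and is clearly not normal, and the function name, sample sizes and seed are arbitrary.

```python
import math
import random


def fraction_within_one_se(sample_size, trials, seed=0):
    """Fraction of simulated sample means within sigma/sqrt(n) of mu."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and SD of the Exponential(1) distribution
    standard_error = sigma / math.sqrt(sample_size)
    hits = 0
    for _ in range(trials):
        draws = [rng.expovariate(1.0) for _ in range(sample_size)]
        sample_mean = sum(draws) / sample_size
        if abs(sample_mean - mu) <= standard_error:
            hits += 1
    return hits / trials


single = fraction_within_one_se(sample_size=1, trials=5000)
averaged = fraction_within_one_se(sample_size=100, trials=5000)
print(single, averaged)
```

For a single draw the one-sigma interval covers $1-e^{-2}\approx 86\%$ of the probability, far from the normal figure. For means of 100 draws the coverage should come out close to the normal 68%, even though all we used about the underlying variable was $\mu$ and $\sigma$.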

J.G.
  • 115,835