Let $X_{i}\sim N(\mu_{i} ,\sigma^{2}_{i}), 1 \leq i \leq n$, denote $n$ normally distributed independent random variables. I want to show that $$\sum_{i=1}^n X_i \sim N\left(\sum_{i=1}^n\mu_{i},\sum_{i=1}^n\sigma_{i}^{2}\right),$$ without using convolution integrals or characteristic functions.
-
Note that your notation is confused: first you use the variance $\sigma_i^2$ and then the standard deviation $\sigma_i$. Also, your claim is wrong if the random variables are not independent. Why show it without characteristic functions? – Karl Jan 01 '15 at 13:20
-
I find his notation strangely artificial. Seriously, why not simply write $\sum_i X_i$? – Raskolnikov Jan 01 '15 at 13:25
-
I corrected those errors. I want to know if there are other methods of proof. – nabil Jan 01 '15 at 13:25
-
Other methods than characteristic functions? Show it by direct calculation, using the convolution formula, and maybe induction. – kjetil b halvorsen Jan 01 '15 at 14:06
2 Answers
You could use the fact that normal distributions are determined by their moments, meaning that if $X\sim N(\mu,\sigma^2)$ and $\mathbb E[X^k]=\mathbb E[Y^k]$ for every $k\in\mathbb N$, then $Y\sim N(\mu,\sigma^2)$.
Looking at the simple example $X_1+X_2$ with $X_1\sim N(\mu_1,\sigma_1^2)$ and $X_2\sim N(\mu_2,\sigma_2^2)$, for every $k\in\mathbb N$ one can show (using the binomial theorem in the second step) \begin{align} \mathbb E\Big[\big((X_1+X_2)-(\mu_1+\mu_2)\big)^k\Big]&=\mathbb E\Big[\big((X_1-\mu_1)+(X_2-\mu_2)\big)^k\Big]\\ &=\mathbb E\left[\sum_{t=0}^k{k\choose t}(X_1-\mu_1)^t(X_2-\mu_2)^{k-t}\right]\\ &\cdots\text{ (additional work)}\\ &=\begin{cases} 0&\text{if $k$ is odd; and}\\ (\sigma_1^2+\sigma_2^2)^{k/2}(k-1)!!&\text{if $k$ is even.} \end{cases} \end{align} (Here "!!" denotes the double factorial.) These are exactly the central moments of $N(0,\sigma_1^2+\sigma_2^2)$, so $(X_1+X_2)-(\mu_1+\mu_2)\sim N(0,\sigma_1^2+\sigma_2^2)$, i.e., $X_1+X_2\sim N(\mu_1+\mu_2,\sigma_1^2+\sigma_2^2)$.
The rest follows from induction.
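The elided "additional work" can be sanity-checked with a short script: it computes the left-hand side via the binomial expansion, using independence to factor each expectation into a product of single-variable central moments, and compares the result with the closed form above. The variance values and the helper names (`dfact`, `central_moment`) are illustrative choices of mine, not from the thread.

```python
from math import comb

def dfact(n):
    """Double factorial n!!, with the convention (-1)!! = 1."""
    return 1 if n <= 0 else n * dfact(n - 2)

def central_moment(sigma2, k):
    """k-th central moment of N(mu, sigma2): 0 if k odd, sigma^k (k-1)!! if k even."""
    return 0 if k % 2 else sigma2 ** (k // 2) * dfact(k - 1)

s1, s2 = 2.0, 3.0  # variances sigma_1^2 and sigma_2^2 (arbitrary example values)
for k in range(11):
    # E[((X1 - mu1) + (X2 - mu2))^k] via the binomial theorem; independence
    # lets us split E[(X1-mu1)^t (X2-mu2)^(k-t)] into a product of moments.
    lhs = sum(comb(k, t) * central_moment(s1, t) * central_moment(s2, k - t)
              for t in range(k + 1))
    # Claimed closed form: the k-th central moment of N(0, sigma_1^2 + sigma_2^2)
    rhs = central_moment(s1 + s2, k)
    assert abs(lhs - rhs) < 1e-9, (k, lhs, rhs)
```

The script agrees for every $k\le 10$, which is a useful check that the binomial coefficient belongs inside the sum and that the even-moment formula involves $(\sigma_1^2+\sigma_2^2)^{k/2}$, not $(\sigma_1+\sigma_2)^k$.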
EDIT: as @ki3i correctly notes in his comment, the "standard" proof of the fact that normal distributions are determined by their moments uses the moment generating function. In fact, I can't think of a way to prove this without using the moment generating function at this moment.
If you also disallow using the moment generating function in disguise, which wouldn't be surprising if you disallow the use of characteristic functions, you could compute the CDF of $X_1+X_2$ by $$F_{X_1+X_2}(z)=\iint_{\{x+y\leq z\}}f_{X_1}(x)f_{X_2}(y)\,dx\,dy,$$ where $f_{X_1}$ and $f_{X_2}$ are the density functions of $X_1$ and $X_2$ respectively.
The solution to this, which can be a little tedious and involves a bit of trickery, is written out in detail in the Wikipedia article Sum of normally distributed random variables (see the section "Geometric proof") as well as in this mathematics magazine article (see "the rotation proof" section).
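The double-integral expression for the CDF can also be checked numerically against the claimed closed form. A minimal sketch, assuming `scipy` is available; the parameter values and the evaluation point `z` are arbitrary choices of mine:

```python
import numpy as np
from scipy import integrate, stats

mu1, mu2 = 1.0, -0.5   # example means (arbitrary)
s1, s2 = 1.5, 2.0      # example standard deviations (arbitrary)
z = 2.0                # evaluate the CDF of X1 + X2 at this point

# F_{X1+X2}(z) as a double integral of the product density over {x + y <= z}
val, err = integrate.dblquad(
    lambda y, x: stats.norm.pdf(x, mu1, s1) * stats.norm.pdf(y, mu2, s2),
    -np.inf, np.inf,                      # outer variable x
    lambda x: -np.inf, lambda x: z - x)   # inner variable y, up to y = z - x

# Claimed closed form: X1 + X2 ~ N(mu1 + mu2, s1^2 + s2^2)
expected = stats.norm.cdf(z, mu1 + mu2, np.sqrt(s1**2 + s2**2))
assert abs(val - expected) < 1e-5, (val, expected)
```

This of course evaluates the convolution-style integral rather than avoiding it, so it serves only as a consistency check of the formula above, not as an alternative proof.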
-
@PierreYvesGaudreauLai , This looks nice. But, why isn't this an application of "characteristic functions/Moment Generating functions" in disguise? – ki3i Jan 01 '15 at 15:32
-
@PierreYvesGaudreauLai , Thanks for addressing my concern and the helpful references. – ki3i Jan 01 '15 at 19:12
The means part is straightforward to show using the linearity of expectation; no convolutions or characteristic functions are needed. So, for ease in calculation, we can take the $X_i$ to be zero-mean normal random variables and just show that $\sum_i X_i \sim N\left(0,\sum_i \sigma_i^2\right)$.
Suppose $X$ and $Y$ are independent standard normal random variables. Then, as described in this answer of mine, $\alpha X + \beta Y$ is a zero-mean normal random variable with variance $\alpha^2+\beta^2$. This proof is based purely on a rotation of axes; no convolutions or characteristic functions are involved. But $\alpha X$ and $\beta Y$ are zero-mean normal random variables with variances $\alpha^2$ and $\beta^2$ respectively, so taking $\alpha=\sigma_1$ and $\beta=\sigma_2$ gives $X_1+X_2 \sim N(0,\sigma_1^2+\sigma_2^2)$, and as Pierre said, the general result $\sum_i X_i \sim N\left(0,\sum_i \sigma_i^2\right)$ follows by induction. Look, Ma! No convolutions and no characteristic functions.
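As a numerical complement to the rotation argument, one can Monte Carlo the claim directly: draw independent normals, sum them, and compare the empirical CDF of the sum with the claimed $N\left(\sum_i\mu_i,\sum_i\sigma_i^2\right)$ CDF. The sample size and the parameter values below are arbitrary choices of mine.

```python
import bisect
import math
import random

random.seed(0)
mus = [1.0, -2.0, 0.5]    # example means (arbitrary choices)
sigmas = [1.0, 2.0, 0.5]  # example standard deviations (arbitrary choices)
n_samples = 200_000

# Draw n_samples independent realizations of X_1 + X_2 + X_3, sorted for CDF lookups
sums = sorted(sum(random.gauss(m, s) for m, s in zip(mus, sigmas))
              for _ in range(n_samples))

mu = sum(mus)                                  # claimed mean of the sum
sigma = math.sqrt(sum(s * s for s in sigmas))  # claimed std dev of the sum

def normal_cdf(x):
    """CDF of the claimed N(mu, sigma^2) distribution."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# The empirical CDF of the samples should track the claimed CDF closely
for q in (-3.0, -1.0, 0.0, 1.0, 3.0):
    x = mu + q * sigma
    emp = bisect.bisect_right(sums, x) / n_samples
    assert abs(emp - normal_cdf(x)) < 0.01, (x, emp, normal_cdf(x))
```

With $2\times 10^5$ samples the empirical CDF is accurate to a few parts in a thousand, so the $0.01$ tolerance gives a comfortable margin.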

-
:-) Very nice use of symmetry here: especially the choice of coordinate through the origin, parallel to the plane of interest! I find this quite enlightening. – ki3i Jan 01 '15 at 19:14