In an example about U-statistics, the kernel is $h(x_1,x_2)=\frac 12(x_1-x_2)^2$, and then $$U_n=\frac{2}{n(n-1)}\sum_{i<j}\frac{(X_i-X_j)^2}{2}=\frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})^2.$$ I don't know how to prove the second equality completely.
Please show us what you have tried so far. – Stockfish Dec 18 '18 at 11:58
I think the formula $\sum_{i<j}(x_i^2+x_j^2)=(n-1)\sum_{i=1}^{n}x_i^2$ will help, but I don't know how to prove it. – chole Dec 18 '18 at 12:14
3 Answers
We know that (I found it here) \begin{equation} \left( \sum_{n=1}^N a_n \right)^2 = \sum_{n=1}^N a_n^2 + 2 \sum_{j=1}^{N}\sum_{i=1}^{j-1} a_i a_j \end{equation} So using the above identity
\begin{align} \sum_{i=1}^{n}(X_i-\bar{X})^2 &= \sum_{i=1}^{n}\Big(X_i-\frac{1}{n}\sum_{j=1}^nX_j\Big)^2\\ &= \sum_{i=1}^{n}\Big(X_i^2-\frac{2}{n}X_i\sum_{j=1}^nX_j + \frac{1}{n^2}\Big(\sum_{j=1}^nX_j\Big)^2 \Big)\\ &= \sum_{i=1}^{n}\Big(X_i^2-\frac{2}{n}X_i\sum_{j=1}^nX_j + \frac{1}{n^2}\Big(\sum_{j=1}^nX_j^2 + 2\sum_{j=1}^n\sum_{k=1}^{j-1}X_jX_k\Big) \Big) \end{align}
The last term is independent of $i$, so summing it over $i$ simply multiplies it by $n$:
\begin{align} \sum_{i=1}^{n}(X_i-\bar{X})^2 &= \sum_{i=1}^{n}\Big(X_i^2-\frac{2}{n}X_i\sum_{j=1}^nX_j\Big) + \frac{n}{n^2}\Big(\sum_{j=1}^nX_j^2 + 2\sum_{j=1}^n\sum_{k=1}^{j-1}X_jX_k\Big) \end{align}
that is,
\begin{align} \sum_{i=1}^{n}(X_i-\bar{X})^2 &= \sum_{i=1}^{n}\Big(X_i^2-\frac{2}{n}X_i\sum_{j=1}^nX_j\Big) + \frac{1}{n}\Big(\sum_{j=1}^nX_j^2 + 2\sum_{j=1}^n\sum_{k=1}^{j-1}X_jX_k\Big) \end{align}
Collecting the squared terms, this can be written as
\begin{align} \sum_{i=1}^{n}(X_i-\bar{X})^2 &= \Big(1 + \frac{1}{n}\Big) \sum_{i=1}^{n}X_i^2-\frac{2}{n}\sum_{i=1}^{n}X_i\sum_{j=1}^nX_j + \frac{1}{n}\Big( 2\sum_{j=1}^n\sum_{k=1}^{j-1}X_jX_k\Big) \end{align}
or, in more compact index notation,
\begin{align} \sum_{i=1}^{n}(X_i-\bar{X})^2 &= \Big(1 + \frac{1}{n}\Big) \sum_{i=1}^{n}X_i^2-\frac{2}{n}\sum_{i,j}X_iX_j + \frac{2}{n}\sum_{i<j}X_iX_j \end{align}
The last two sums contain the same products but run over different index sets: $\sum_{i,j}X_iX_j$ runs over all $i = 1, \ldots, n$ and $j = 1, \ldots, n$, whereas $\sum_{i<j}X_iX_j$ runs only over the strictly upper triangular part.
This means that their difference runs over the lower triangular part, diagonal included:
\begin{align} \sum_{i=1}^{n}(X_i-\bar{X})^2 &= \Big(1 + \frac{1}{n}\Big) \sum_{i=1}^{n}X_i^2 - \frac{2}{n}\sum_{i\geq j}X_iX_j \end{align}
Factor out $\frac{1}{n}$ on the right-hand side, divide both sides by $n-1$, and then multiply and divide the right-hand side by $2$:
\begin{align} \frac{1}{n-1} \sum_{i=1}^{n}(X_i-\bar{X})^2 &= \frac{2}{n(n-1)} \Big( \frac{(n + 1) \sum_{i=1}^{n}X_i^2 - 2\sum_{i\geq j}X_iX_j}{2} \Big) \end{align}
The sum over $i \geq j$ splits into two sums,
\begin{align} \frac{1}{n-1} \sum_{i=1}^{n}(X_i-\bar{X})^2 &= \frac{2}{n(n-1)} \Big( \frac{(n + 1) \sum_{i=1}^{n}X_i^2 - 2\sum_{i = j}X_iX_j - 2\sum_{i > j}X_iX_j}{2} \Big) \end{align}
and the sum over $i = j$ is just a single sum of squares, hence
\begin{align} \frac{1}{n-1} \sum_{i=1}^{n}(X_i-\bar{X})^2 &= \frac{2}{n(n-1)} \Big( \frac{(n + 1) \sum_{i=1}^{n}X_i^2 - 2\sum_{i=1}^n X_i^2 - 2\sum_{i > j}X_iX_j}{2} \Big) \end{align}
which gives
\begin{align} \frac{1}{n-1} \sum_{i=1}^{n}(X_i-\bar{X})^2 &= \frac{2}{n(n-1)} \Big( \frac{(n -1) \sum_{i=1}^{n}X_i^2- 2\sum_{i > j}X_iX_j}{2} \Big) \end{align}
The numerator above is nothing other than $\sum_{i<j} (X_i - X_j)^2 = \sum_{i<j} X_i^2 - 2 \sum_{i<j} X_iX_j + \sum_{i<j} X_j^2$. The cross terms match because $\sum_{i>j}X_iX_j = \sum_{i<j}X_iX_j$ after swapping the labels $i$ and $j$. It is less obvious that there are $n-1$ terms of the form $X_i^2$: each index $k$ belongs to exactly $n-1$ pairs $i<j$ (as the larger index in $k-1$ of them and as the smaller index in $n-k$ of them), so $\sum_{i<j}(X_i^2+X_j^2) = (n-1)\sum_{i=1}^n X_i^2$. This concludes
\begin{align} \frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})^2 = \frac{2}{n(n-1)}\sum_{i<j}\frac{(X_i-X_j)^2}{2}. \end{align}
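As a sanity check on the counting of squared terms, here is the case $n=3$ written out in full. This is only an illustration; the sample values $1, 2, 4$ are chosen arbitrarily and do not come from the original example.
\begin{align} \sum_{i<j}(X_i-X_j)^2 &= (X_1-X_2)^2+(X_1-X_3)^2+(X_2-X_3)^2\\ &= 2\,(X_1^2+X_2^2+X_3^2) - 2\,(X_1X_2+X_1X_3+X_2X_3) \end{align}
Each square appears $n-1=2$ times, as claimed. With $X=(1,2,4)$ the left-hand side is $1+9+4=14$ and the right-hand side is $2\cdot 21-2\cdot 14=14$. Then $U_3=\frac{2}{3\cdot 2}\cdot\frac{14}{2}=\frac{7}{3}$, which agrees with $\frac{1}{2}\sum_{i=1}^{3}(X_i-\bar{X})^2=\frac{1}{2}\cdot\frac{14}{3}=\frac{7}{3}$ for $\bar{X}=\frac{7}{3}$.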

A one-line proof summary: $$\sum_{i<j}(X_i-X_j)^2=\frac{1}{2}\sum_{ij}(X_i-X_j)^2=n\sum_iX_i^2-\sum_{ij}X_iX_j=n\sum_i X_i(X_i-\overline{X})=n\sum_i(X_i-\overline{X})^2.$$ The first $=$ uses the fact that $(X_i-X_j)^2$ is $i\leftrightarrow j$-symmetric and $0$ if $i=j$. The second $=$ expands the square and separates squares from cross terms. The third $=$ is a rearrangement using $\sum_{ij}X_iX_j=n\overline{X}\sum_iX_i$. The last $=$ uses $$X_i(X_i-\overline{X})-(X_i-\overline{X})^2=\overline{X}(X_i-\overline{X}),$$ which becomes $0$ under $\sum_i$. Multiplying both ends by $\frac{1}{n(n-1)}$ gives $U_n=\frac{1}{n-1}\sum_i(X_i-\overline{X})^2$.
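For readers who want to check the identity numerically before trusting the algebra, a minimal sketch in Python follows. The function names `u_statistic` and `sample_variance` are made up for this illustration, and the random sample is arbitrary; the script only compares the pairwise form with the usual $\frac{1}{n-1}$ form.

```python
import random
from itertools import combinations


def u_statistic(xs):
    """Average the kernel h(x1, x2) = (x1 - x2)^2 / 2 over all pairs i < j."""
    n = len(xs)
    total = sum((a - b) ** 2 / 2 for a, b in combinations(xs, 2))
    return 2 * total / (n * (n - 1))


def sample_variance(xs):
    """Unbiased sample variance: sum of squared deviations divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


random.seed(0)  # fixed seed so the run is reproducible
xs = [random.gauss(0, 1) for _ in range(10)]

# The two numbers should agree up to floating-point rounding.
print(u_statistic(xs), sample_variance(xs))
```

A numerical match is of course not a proof, but it is a quick way to catch a wrong constant such as $n$ versus $n-1$.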

Hint 1 : $\sum_{i<j}{(X_i-X_j)^2} = \frac{1}{2}\sum_{i}\sum_{j}(X_i-X_j)^2$
Hint 2 : Add and subtract $\bar{X}$ inside $(X_i-X_j)^2$ to simplify the sum of squares.
You will arrive at your result.
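One way the two hints might be carried out is sketched here; it is a possible route, not necessarily the one the answerer had in mind. Writing $X_i-X_j=(X_i-\bar{X})-(X_j-\bar{X})$ and expanding,
\begin{align} \sum_{i}\sum_{j}(X_i-X_j)^2 &= \sum_{i}\sum_{j}(X_i-\bar{X})^2 - 2\sum_{i}(X_i-\bar{X})\sum_{j}(X_j-\bar{X}) + \sum_{i}\sum_{j}(X_j-\bar{X})^2\\ &= 2n\sum_{i}(X_i-\bar{X})^2, \end{align}
because $\sum_{i}(X_i-\bar{X})=0$ kills the middle term. By Hint 1, $\sum_{i<j}(X_i-X_j)^2=n\sum_{i}(X_i-\bar{X})^2$, and therefore $U_n=\frac{1}{n(n-1)}\sum_{i<j}(X_i-X_j)^2=\frac{1}{n-1}\sum_{i}(X_i-\bar{X})^2$.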
