We've noticed a pattern in the output of our program that we don't know how to prove.

Take two 'random' vectors $x$ and $y$ of large dimension $d$. Each vector is created by generating $d$ random numbers between $-1/2$ and $1/2$ and then normalizing the resulting vector.

We've noticed that $\| x + y \| \approx \sqrt2 $. Can we prove this easily?

Daan Seuntjens
  • What does "normalizing the vector" mean? – lulu Apr 11 '22 at 15:55
  • @lulu Usually, dividing by the norm to get a norm-$1$ vector. – Jean-Claude Arbaut Apr 11 '22 at 16:12
  • You should also notice that $x\cdot y\approx 0$, hence $|x+y|^2=|x|^2+|y|^2+2x\cdot y\approx2$. Have a look at https://mathoverflow.net/questions/248466/why-are-two-random-vectors-in-mathbb-rn-approximately-orthogonal-for-large and https://math.stackexchange.com/questions/995623/why-are-randomly-drawn-vectors-nearly-perpendicular-in-high-dimensions. – Jean-Claude Arbaut Apr 11 '22 at 16:21
  • Note that the distribution you have chosen is a non-uniform distribution over a unit sphere. It will have $2^d$ "peaks" corresponding to the vertices of the $d$-cube. I think this is a significant effect in lower dimensions, but I don't know how much it is in higher dimensions. – David K Apr 11 '22 at 16:38
  • We often see something similar in telecommunication applications. For the components before normalization, we easily see that $E(x_i^2)=1/12=E(y_i^2)$. With $d$ very large (in some applications $d$ will easily be $\approx 10^4$), the relative standard deviation of $\|x\|$ is almost negligible, and the "normalization" amounts to dividing by the "constant" $\sqrt{d/12}$. On the other hand, the expected value of the inner product is zero, and the calculation @Jean-ClaudeArbaut gave leads to your conclusion. – Jyrki Lahtonen Apr 11 '22 at 17:17
  • (cont'd) True, in the sims I used to run, the components were Gaussian rather than uniform, but the law of large numbers is really on your side. In fact, I would make quick and dirty test runs of my random Gaussian generator simply by summing the squares of a few thousand random numbers and checking that the norm is as expected. – Jyrki Lahtonen Apr 11 '22 at 17:20
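
The observation in the comments can be checked numerically. The following is only a rough sketch, assuming the "random numbers" in the question are drawn uniformly from $[-1/2, 1/2]$; the helper names `random_unit_vector` and `norm_of_sum` and the seed are made up for illustration and are not taken from the asker's program. It prints $x\cdot y$ and $\|x+y\|$ for a few dimensions so they can be compared with $0$ and $\sqrt2$.

```python
import math
import random


def random_unit_vector(d, rng):
    """Draw d components uniformly from [-1/2, 1/2] and scale to norm 1."""
    v = [rng.uniform(-0.5, 0.5) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def norm_of_sum(d, rng):
    """Return (x . y, ||x + y||) for two independent normalized vectors."""
    x = random_unit_vector(d, rng)
    y = random_unit_vector(d, rng)
    dot = sum(a * b for a, b in zip(x, y))
    length = math.sqrt(sum((a + b) ** 2 for a, b in zip(x, y)))
    return dot, length


# Fixed seed, chosen arbitrarily so that repeated runs give the same numbers.
rng = random.Random(0)
for d in (10, 100, 10_000):
    dot, length = norm_of_sum(d, rng)
    print(f"d={d:>6}  x.y={dot:+.4f}  |x+y|={length:.4f}  sqrt(2)={math.sqrt(2):.4f}")
```

Since $\|x+y\|^2 = 2 + 2\,x\cdot y$ holds exactly for unit vectors, the printed norm differs from $\sqrt2$ only as much as $x\cdot y$ differs from $0$, which is the point of the calculation given in the comments.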

0 Answers