Mean distance between 2 points in a ball

Question

I have found an answer on this site to the question of determining the mean straight-line distance between 2 randomly chosen points in a disc of radius r. (See Average distance between two points in a circular disk) I'm now trying to find an answer to the same question except involving a ball of radius r rather than a disc. Any guidance on this question would be appreciated.

Most of the time (I am pretty sure in this case too) these proofs generalize straightforwardly to higher dimensions. Can you point to your solution for the disk? Do you understand it, and did you already try to generalize it on your own? — Listing, Jul 07 '12 at 17:56
It should be much easier on a sphere: Due to symmetry, you may as well keep one point fixed, say the south pole of the sphere. Rescaling so the radius is 1 (just multiply the answer by $r$ afterwards) and assuming the sphere is centered at the origin, the distance from a given point on the sphere to the south pole will be $\sqrt{x^2+y^2+(z+1)^2}=\sqrt{2+2z}$. This should be straightforward to integrate over the sphere. (I just now noted the word “within” in the question title. Are you looking at points in the ball rather than the sphere?) — Harald Hanche-Olsen, Jul 07 '12 at 17:57
@HaraldHanche-Olsen. I've found solutions to similar questions involving points on a circle and points on the surface of a sphere. But now my question involves points within a sphere. Yes, sorry, the ball might be the better word. — Brian, Jul 07 '12 at 18:01
@Listing, if the random simplex volume problem is an indicator, there's no easy way to generalize to higher dimensions. For example already the expression for the area of a random triangle (ie, with vertices independently and uniformly distributed in the unit square) is significantly more complex than the length of a random line segment in [0,1], to say nothing about the volume of a random tetrahedron. — alancalvitti, Jul 07 '12 at 18:18
Using the two points and the center of the ball, you could form a plane containing a disc with the two points in question, which reduces this to the same problem. Though there's the issue if the 3 points are collinear... kind of an edge case to iron out. EDIT: actually, just choose any 3rd point in the ball such that it's not collinear with the 2 random points of interest; from that you can get your plane, so you don't have to use the center... — Russ, Jul 07 '12 at 18:22
@Listing. The paper I was ultimately referred to was at link but the proof was given in yet another reference I haven't yet been able to track down. — Brian, Jul 07 '12 at 19:20
You can actually solve this ball picking problem for any n-dimensional hypersphere. Due to symmetry, no matter how many dimensions you use, you can reduce the problem to a double integral, and using some clever substitutions and the Beta function, you’ll find that the mean distance between all sets of two points in an n-dimensional hypersphere is precisely: $\frac{2^{n+1}\,n\,\left(\frac{n}{2}\right)!^{2}}{\sqrt{\pi}\,\left(n+\frac{1}{2}\right)!\,(n+1)}\,r$ — Math Machine, Dec 27 '19 at 17:51
@Eric (Part 1) Unfortunately, no. This is personal research. And, I gotta be honest, it's really hard writing proofs on Stack Exchange. This problem is known as the "Ball Line Picking Problem", though, in case you wanna do any more research. If you wanna try it out for yourself, I recommend having point 1 be a part of an n-dimensional spherical coordinate system, and having point 2 use the same coordinates, but with point 1 at the origin. That way, you don't have to use any √s to solve for the distance (though you will need √s to solve for the bounds). — Math Machine, Oct 27 '21 at 18:01
(Part 2) Here are details on how to construct multidimensional spherical coordinates https://math.stackexchange.com/questions/56582/analogue-of-spherical-coordinates-in-n-dimensions. The Jacobian would be $R^{n-1}\sin^{n-2}(\Theta_1)\sin^{n-3}(\Theta_2)\cdots\sin^{2}(\Theta_{n-3})\sin(\Theta_{n-2})$. Also, if you're super curious, the k-th non-central moment of the probability distribution for distances between 2 points in an n-ball would be $\frac{2^{n+k}\left(\frac{n}{2}\right)!\left(\frac{n+k-1}{2}\right)!\,n\,r^{k}}{\sqrt{\pi}\,(n+k)\left(n+\frac{k}{2}\right)!}$ (the radius enters as $r^k$, since the k-th moment of a length scales that way; for $k=1$ this is the mean distance formula above). That's about all I can give for now. But if you really wanna see the proof, just tell me, and I'll post it here. Peace! — Math Machine, Oct 27 '21 at 18:11
@MathMachine Thanks for the detail. I'm wondering about this: for the n-dimensional line picking problem, is the n-ball the most "economical" shape, in the sense that line picking in any other shape with the same volume will yield a mean distance greater than the ball's? — Eric, Oct 28 '21 at 08:22
@Eric (Part 1) Um….Maybe? I think it is. Here’s a reasoning for why it’s probably the optimal shape, but I should preface by saying what I’m about to say is, by no means whatsoever, mathematically rigorous. Let’s play a game. You’re given a lump of clay. You’re not allowed to change the shape of the lump, but you are allowed to add more clay to it. Every 10 seconds, your friend gives you an infinitesimally small amount of clay, and asks you to place it on the lump. However, he doesn’t want you to place it just anywhere. — Math Machine, Oct 28 '21 at 23:20
(Part 2) He very specifically asks that you place it in the one spot such that the new, slightly larger lump will have the lowest possible line-picking average after placement. You tell your friend that that’s impossible. Every time you add another piece, the line-picking average will increase slightly. He tells you that’s okay, just as long as the line-picking average is the lowest it could possibly be after adding that tiny piece. You continue doing this, every 10 seconds placing another infinitesimally small piece of clay in a very precise location. — Math Machine, Oct 28 '21 at 23:21
(Part 3) Now, intuition suggests that, if the lump was initially of an irregular shape, the best place to put each tiny piece would be whichever point on the surface is closest to the center of mass. Each time you do this, you’ll find that the lump is slowly getting more and more spherical. So, while this is by no means a rigorous proof, it at least shows that, if you take a convergent approach to the best case for line-picking, you’d eventually reach a sphere. And, if you think about it, this problem works the same way in 3D as it does in flatland. — Math Machine, Oct 28 '21 at 23:21
(Part 4) And, while you can’t imagine it, it follows logically the same thing should happen in 4D, and 5D, and so on. Hope that helps. Again, NOT a rigorous answer. — Math Machine, Oct 28 '21 at 23:21
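For anyone who wants to play with the closed form quoted in the comments above, here is a minimal Python sketch. It is not Math Machine's derivation, just a numerical evaluation of the stated formula. The function names are made up for illustration, $x!$ is read as $\Gamma(x+1)$, and the radius is assumed to enter the k-th moment as $r^k$ (which is what scaling requires). Logarithms of Gamma are used so that large $n$ does not overflow.

```python
import math


def distance_moment(n: int, k: float = 1.0, r: float = 1.0) -> float:
    """k-th raw moment of the distance between two uniform points in an
    n-ball of radius r, using the closed form quoted in the comments.

    Assumptions: x! means Gamma(x + 1), and the radius enters as r**k.
    """
    log_value = (
        (n + k) * math.log(2.0)
        + math.log(n)
        + math.lgamma(n / 2 + 1)            # (n/2)!
        + math.lgamma((n + k - 1) / 2 + 1)  # ((n+k-1)/2)!
        + k * math.log(r)
        - 0.5 * math.log(math.pi)
        - math.log(n + k)
        - math.lgamma(n + k / 2 + 1)        # (n + k/2)!
    )
    return math.exp(log_value)


def mean_distance(n: int, r: float = 1.0) -> float:
    """Mean distance, i.e. the k = 1 case of distance_moment."""
    return distance_moment(n, 1.0, r)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 10, 100, 1000):
        print(n, mean_distance(n))
```

For $n=3$ this reproduces $36/35$, and for $n=2$ it reproduces the disc value $128/(45\pi)$. Numerically the values keep increasing with $n$ and seem to level off near $\sqrt{2}$, which fits the heuristic that two random points in a high-dimensional ball lie close to the boundary and are nearly orthogonal; that is an observation from the numbers, not a proof.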

Christian Blatter · Accepted Answer · 2012-07-08T18:42:35.143

(a) Two random points on the unit sphere $S^2$:

Assume the first point at the north pole ($\theta=0$) of $S^2$. Then the distance to a point at polar angle $\theta$ (measured from the north pole) is $2\sin{\theta\over2}$. Therefore the mean distance between the north pole and the second point is given by $${1\over 4\pi}\int_0^\pi 2\sin{\theta\over2}\cdot 2\pi\sin\theta\ d\theta={4\over3}\ .$$
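In case the value of the integral is not obvious, one way to evaluate it (a worked step added for illustration, using $\sin\theta=2\sin{\theta\over2}\cos{\theta\over2}$) is

$${1\over 4\pi}\int_0^\pi 2\sin{\theta\over2}\cdot 2\pi\sin\theta\ d\theta=\int_0^\pi 2\sin^2{\theta\over2}\cos{\theta\over2}\ d\theta=\left[{4\over3}\sin^3{\theta\over2}\right]_0^\pi={4\over3}\ .$$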

(b) Two random points in the unit ball of ${\mathbb R}^3\ $:

Let ${\bf X}$ and ${\bf Y}$ be the two random points. Then $R:=|{\bf X}|$, $\ S:=|{\bf Y}|$, and $\Theta:=\angle({\bf X},{\bf Y})$ are independent random variables with densities $$f_R(r)=3r^2\quad (0\leq r\leq 1)\ ,\qquad f_S(s)=3s^2\quad(0\leq s\leq 1)\ ,$$ and $$f_\Theta(\theta)={1\over2}\sin\theta\quad(0\leq\theta\leq\pi)\ .$$ (Concerning $f_R$ and $f_S$ note that the volume included between $r$ and $r+dr$ is proportional to $r^2$. For $f_\Theta$ you may assume ${\bf X}$ pointing due north. The abstract surface area between $\theta$ and $\theta+d\theta$ is then proportional to $\sin\theta$, as in (a).)

It follows that the mean distance $\delta$ between ${\bf X}$ and ${\bf Y}$ is given by $$\delta=\int_0^1\int_0^1\int_0^\pi \sqrt{r^2+s^2-2rs\cos\theta}\ f_R(r) f_S(s)f_\Theta(\theta) d\theta\ ds\ dr\ .$$ The innermost integral computes to $$\eqalign{{1\over2}\int_0^\pi \sqrt{r^2+s^2-2rs\cos\theta}\ \sin\theta\ d\theta&={1\over 6rs}\bigl(r^2+s^2-2r s\cos\theta\bigr)^{3/2}\Biggr|_0^\pi \cr &={1\over 6rs}\bigl((r+s)^3-|r-s|^3\bigr)\ . \cr}$$ In the sequel we assume $s\leq r$ and compensate this by a factor of $2$. We are then left with $$\delta=\int_0^1\int_0^r 3r s(6r^2 s +2s^3)\ ds\ dr={36\over35}\ .$$ It should not be too difficult to set a similar computation up that is valid for a ball in any ${\mathbb R}^n$, $\ n\geq 2$.
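As a sanity check on the value $36/35\approx 1.0286$, here is a rough Monte Carlo sketch in Python. It samples $R$, $S$ and $\Theta$ from the densities given above by inverse transform ($R=U^{1/3}$ for density $3r^2$, and $\cos\Theta$ uniform on $[-1,1]$ for density ${1\over2}\sin\theta$) and averages the law-of-cosines distance. The function names, sample size and seed are arbitrary choices for illustration.

```python
import math
import random


def sample_distance(rng: random.Random) -> float:
    """Draw one distance |X - Y| for X, Y uniform in the unit ball of R^3."""
    r = rng.random() ** (1.0 / 3.0)     # density 3 r^2 on [0, 1]
    s = rng.random() ** (1.0 / 3.0)     # density 3 s^2 on [0, 1]
    cos_theta = rng.uniform(-1.0, 1.0)  # theta has density (1/2) sin(theta)
    # Law of cosines; max() guards against tiny negative rounding errors.
    return math.sqrt(max(0.0, r * r + s * s - 2.0 * r * s * cos_theta))


def estimate_mean_distance(samples: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the mean distance in the unit ball."""
    rng = random.Random(seed)
    return sum(sample_distance(rng) for _ in range(samples)) / samples


if __name__ == "__main__":
    print("estimate:", estimate_mean_distance())
    print("36/35   :", 36 / 35)
```

With a few hundred thousand samples the estimate should agree with $36/35$ to roughly two or three decimal places; it is only a statistical check, not a replacement for the integral.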

Yes, @ChristianBlatter, that was the value I got. Your solution is much more concise than mine, so thank you for providing it. For the similar problem of two points on a circle of radius 1 I found that the average distance is 4/pi, which makes me wonder if this upward trend continues as we continue to go up in dimensions, and whether or not there is some limit as n -> infinity. — Brian, Jul 07 '12 at 22:29
Did you find any more info on whether the upward trend indeed continues as n -> infinity? — pir, Jan 29 '24 at 10:08

Ole Lund · Answer 2 · score 7 · answered Sep 24 '13 at 14:45 · edited Sep 24 '13 at 15:05 by user66733

The average distance can be calculated as a triple integral in polar coordinates:

[image: formula for the triple integral]

We ran into it some years back when doing research on proteins (http://www.ncbi.nlm.nih.gov/pubmed/9514112).

Machinato · Answer 3 · 2020-08-28T17:01:07.103

(a) UNIT SPHERE

Let us recall Laplace's multipole expansion in Legendre polynomials: $$\frac{1}{|\vec{r}'-\vec{r}|}=\frac{1}{r}\sum_{n=0}^\infty \left(\frac{r'}{r}\right)^n P_n(Y)\qquad ;r\geq r',$$

where I have denoted $Y:=\hat{r}\bullet\hat{r}'$. If we assume that $r=r'=1$, then $$\frac{1}{|\hat{r}'-\hat{r}|}=\sum_{n=0}^\infty P_n(Y)$$

The average distance between two uniformly distributed points on the unit sphere we denote as $\bar{L}$. Since $$|\vec{r}'-\vec{r}|=(\vec{r}'-\vec{r})^2/|\vec{r}'-\vec{r}|=(r'^2+r^2-2rr'Y)/|\vec{r}'-\vec{r}|,$$

we have for $r=r'=1$: $$|\hat{r}'-\hat{r}|=2(1-Y)/|\hat{r}'-\hat{r}|$$

and via Laplace's expansion therefore

$$\bar{L} = \mathbb{E}\left[|\hat{r}'-\hat{r}|\right]=\frac{1}{(4\pi)^2}\oint \oint |\hat{r}'-\hat{r}| d\Omega' d\Omega=\frac{1}{8\pi^2}\sum_{n=0}^\infty\oint \oint \left(1-Y\right) P_n(Y) d\Omega' d\Omega.$$

Due to orthogonality, we know as well that $$\oint P_n(Y)P_m(Y) d\Omega = \frac{4\pi}{2n+1} \delta_{nm}.$$

Since $1=P_0(Y)$ and $Y = P_1(Y)$, we immediately get

$$ \bar{L} = \frac{1}{8\pi^2}\sum_{n=0}^\infty\oint \oint \left(1-Y\right) P_n(Y) d\Omega' d\Omega = \frac{1}{2\pi} \oint \left(1-\frac{1}{3}\right) d\Omega = \frac{1}{2\pi}\frac{2}{3}4\pi = \frac{4}{3}$$

(b) UNIT BALL

The average distance between two uniformly distributed random points in the unit ball is given by $\bar{L} = \mathbb{E}\left[|\vec{r}'-\vec{r}|\right]$. Since the problem is symmetric under relabeling the two points, we can define a random variable $L' := |\vec{r}'-\vec{r}|\theta(r-r')$, where $\theta$ is the Heaviside step function, so that $L'$ vanishes when $r'>r$. Clearly, this keeps exactly half of the configurations, therefore

$$\bar{L} = 2\bar{L}' = 2\mathbb{E}\left[|\vec{r}'-\vec{r}|\theta(r-r')\right]=2\left(\frac{3}{4\pi}\right)^2\int \int |\vec{r}'-\vec{r}|\theta(r-r') dV' dV,$$

which can be expanded by the same trick as before. One gets $$ \bar{L} = \frac{9}{8\pi^2}\int_0^1 \int_0^r \oint \oint \left(r'^2+r^2-2rr'Y\right) \frac{1}{r} \sum_{n=0}^\infty \left(\frac{r'}{r}\right)^n P_n(Y) r^2r'^2 d\Omega' d\Omega dr' dr$$

Due to orthogonality, we get $$ \bar{L} = 18\int_0^1 \int_0^r\frac{1}{r}\left(r'^2+r^2-\frac{2}{3}rr'\frac{r'}{r}\right) r^2r'^2 dr'dr = 18\int_0^1 r^6\left(\frac{1}{5}+\frac{1}{3}-\frac{2}{3}\frac{1}{5}\right)dr = \frac{36}{35}$$
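For readers who want the radial step spelled out (a worked step added for illustration, using only the integrand above): the inner integral is

$$\int_0^r\frac{1}{r}\left(r'^2+r^2-\frac{2}{3}r'^2\right) r^2r'^2\, dr' = r\int_0^r\left(\frac{1}{3}r'^4+r^2r'^2\right)dr' = r\left(\frac{r^5}{15}+\frac{r^5}{3}\right)=\frac{2}{5}r^6,$$

so that $\bar{L} = 18\cdot\frac{2}{5}\int_0^1 r^6\, dr = \frac{36}{35}$.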

(c) BALL DISTRIBUTION FUNCTION

Denote the probability density of $L$ by $g(\lambda)$. For this function one has $$g(\lambda) = \left(\frac{3}{4\pi}\right)^2\int\int \delta\left(|\vec{r}'-\vec{r}|-\lambda\right)dV'dV.$$

We expand $\delta(L-\lambda)$ as a sum of Legendre polynomials. Note that $\delta(L-\lambda) \neq 0$ only when $|r'-r|<\lambda<r'+r$, since $|\vec{r}'-\vec{r}|\in (|r'-r|,r'+r)$. Since $L = |\vec{r}'-\vec{r}|=\sqrt{r'^2 + r^2 - 2rr'Y}$ and $\delta(f(x))=\delta(x-x_0)/|f'(x_0)|$ with $f(x_0)=0$, we get

$$\delta(L-\lambda) = \frac{\sqrt{r'^2 + r^2 - 2rr' Y}}{rr'}\delta\left(Y-\frac{r'^2+r^2-\lambda^2}{2rr'}\right)=\frac{\lambda}{rr'}\delta\left(Y-\frac{r'^2+r^2-\lambda^2}{2rr'}\right).$$

Without loss of generality, we assume $r'<r$ (by symmetry) and $r-r'<\lambda<r+r'$ (outside this range the delta function vanishes), then

$$\delta(L-\lambda) = \sum_{n=0}^\infty A_n P_n(Y)$$

Integrating with $P_m(Y)$ over $d\Omega$ and using the orthogonality relation, we get

$$A_n = \frac{2n+1}{2} \int_{-1}^1 \delta(L - \lambda) P_n(Y) dY = \frac{2n+1}{2} \int_{-1}^1 \frac{\lambda}{rr'}\delta\left(Y-\frac{r'^2+r^2-\lambda^2}{2rr'}\right) P_n(Y) dY,$$

therefore for $|r-r'|<\lambda<r+r'$: $$\delta(L-\lambda)= \frac{\lambda}{2rr'} \sum_{n=0}^\infty (2n+1)P_n\left(\frac{r'^2+r^2-\lambda^2}{2rr'}\right)P_n(Y).$$

For the probability density therefore, with help of symmetry $$g(\lambda) = 2\lambda\left(\frac{3}{4\pi}\right)^2 \sum_{n=0}^\infty \underset{0<r-r'<\lambda<r+r'}{\int \int \oint \oint} \frac{2n+1}{2rr'} P_n\left(\frac{r'^2+r^2-\lambda^2}{2rr'}\right)P_n(Y) r'^2 r^2 d\Omega' d\Omega dr' dr$$

which, due to orthogonality (only $n=0$ survives), gives $$g(\lambda) = 9\lambda \int_{\lambda/2}^{1} \int_{\lambda-r}^{r}\!\! r'r\, dr' dr = \frac{9}{2}\lambda^2 \int_{\lambda/2}^{1} \left(2r^2-\lambda r\right) dr = \frac{3}{16}\lambda^2 (2-\lambda)^2(4+\lambda), \qquad 0\leq\lambda\leq 2.$$
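A quick way to check this density is to confirm that it integrates to 1 over $[0,2]$ and that its first moment equals the mean $36/35$ from part (b). The Python sketch below does this exactly with rational arithmetic, using the expanded polynomial $3\lambda^2-\frac{9}{4}\lambda^3+\frac{3}{16}\lambda^5$. The names `COEFFS`, `density`, `density_factored` and `moment` are chosen here for illustration.

```python
from fractions import Fraction

# g(lam) = (3/16) lam^2 (2 - lam)^2 (4 + lam)
#        = 3 lam^2 - (9/4) lam^3 + (3/16) lam^5   on 0 <= lam <= 2
COEFFS = {2: Fraction(3), 3: Fraction(-9, 4), 5: Fraction(3, 16)}


def density(lam: Fraction) -> Fraction:
    """Distance density in the unit ball, evaluated from the expanded form."""
    return sum(c * lam**p for p, c in COEFFS.items())


def density_factored(lam: Fraction) -> Fraction:
    """The same density in the factored form given in the answer."""
    return Fraction(3, 16) * lam**2 * (2 - lam) ** 2 * (4 + lam)


def moment(k: int) -> Fraction:
    """Exact value of the integral of lam^k * g(lam) over [0, 2]."""
    return sum(
        c * Fraction(2) ** (p + k + 1) / (p + k + 1) for p, c in COEFFS.items()
    )


if __name__ == "__main__":
    print("normalization:", moment(0))
    print("mean distance:", moment(1))
    print("second moment:", moment(2))
```

The second moment can be cross-checked independently: $\mathbb{E}\left[|\vec{r}'-\vec{r}|^2\right]=2\,\mathbb{E}[r^2]=\frac{6}{5}$ for the unit ball, since the cross term averages to zero.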

score 0 · Answer 4 · answered Jul 07 '12 at 19:44

This isn't a complete answer, but a start, a rough one at that.

Given two points $p_0$ and $p_1$, find a third point $p_2$ such that $N=(p_0-p_2)\times(p_1-p_2) \neq \varnothing$, where $\varnothing$ denotes the null vector. Define $n = \frac{N}{|N|}$, that is, normalize the vector $N$; without normalization the distance below won't be correct.

Now we have a plane where $p$ is any point on it, defined as: $$ n\cdot(p-p_2)=0 \\ n\cdot p-n\cdot p_2=0 \\ n\cdot p = n\cdot p_2\\ $$ Recall the general equation of a plane is $ax+by+cz=d$, where $d$ is the (signed) distance from the origin to the plane provided $(a,b,c)$ is a unit normal. Expanding out the above equation gives us $$a=n_x,b=n_y,c=n_z \\ d=n_xp_{2x}+n_yp_{2y}+n_zp_{2z} $$ The radius of the circle in which this plane cuts a sphere of radius $r$ centered at the origin is $R = \sqrt{r^2-d^2} $. You may want to negate $n$ if $d$ is negative, as that will allow you to have the center of the circle at $(da,db,dc)$.
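The construction so far can be sketched in Python as follows. This only computes the plane, the distance $d$, and the circle's center and radius as described above; it does not address the distribution of the points. The function name `plane_section` and the helper names are made up for illustration, and the sphere is assumed to be centered at the origin.

```python
import math


def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def plane_section(p0, p1, p2, r=1.0):
    """Center and radius of the circle cut from the sphere |p| = r
    by the plane through p0, p1, p2 (which must not be collinear)."""
    big_n = _cross(_sub(p0, p2), _sub(p1, p2))
    length = math.sqrt(_dot(big_n, big_n))
    if length == 0.0:
        raise ValueError("points are collinear; pick a different p2")
    n = tuple(c / length for c in big_n)  # unit normal
    d = _dot(n, p2)                       # signed distance origin -> plane
    if d < 0:                             # flip so the center is d * n
        n, d = tuple(-c for c in n), -d
    if d > r:
        raise ValueError("plane does not meet the sphere")
    center = tuple(d * c for c in n)
    radius = math.sqrt(r * r - d * d)
    return center, radius


if __name__ == "__main__":
    print(plane_section((0.3, 0.0, 0.5), (0.0, 0.3, 0.5), (0.0, 0.0, 0.5)))
```

Swapping $p_0$ and $p_1$ reverses $N$, and the sign flip on $d$ is what keeps the returned center the same in both cases.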

We haven't defined axes for our plane, so it's difficult to map from our 3D points onto it. This is where I'm stopping for now. (I might come back and finish this up.) I have a feeling this approach is more of a hassle than just computing $p_0-p_1$ and calculating its length, namely because we have to normalize a vector ($N$) just to get a distance and thus the radius.

Wiki pages I used for this: Plane and Plane-sphere intersection

It's not clear that restricting to this plane is going to help any. If the points are chosen uniformly in the original ball, they won't be uniformly distributed when we look only at the plane they define. (Intuitively, the problem is that the possible planes are "closer together" near the center of the ball, so an infinitesimal area of each plane there will represent less of an infinitesimal volume than a similar area closer to the periphery.) — hmakholm left over Monica, Jul 07 '12 at 21:30
Yeah when I set out to write this answer I thought you just needed to use the same method for a disc but in the ball. I should have noticed the probability distributions tag and that you are going to have a large number of points — Russ, Jul 07 '12 at 22:34
