1

I am currently self-studying Elements of Statistical Learning (2nd ed.), by Hastie, Tibshirani & Friedman. I have a question regarding Equation 2.24, which states that, for $N$ points uniformly distributed in a $p$-dimensional unit hyperball, the median distance from the origin to the closest point is

$d(p, N) = \left(1 - \left(\frac{1}{2}\right)^{1/N}\right)^{1/p}$

where $N$ is the number of points in this hyperball.

I understand the derivation of the above (a good discussion of it can be found here); my question is:

if the distribution of points in this hyperball is uniform, shouldn't the points be arranged in a deterministic way? If so, does this not necessitate that the location of the closest point to the origin is at a fixed distance away from it?

I think this way because when I picture points being "uniformly distributed" in lower dimensions, I picture a constant, common spacing between neighbors (e.g. points in a lattice structure in three dimensions, or points at the intersections of a square grid in two). I would expect this picture to carry over to $p$ dimensions, meaning there should be no randomness involved in the points' arrangement.

Edit: one possible meaning of "uniformly distributed" in this case would be: each point has an equal probability of residing at any location in the hyperball, independently of the other points. If this reading is correct, I will post it as an answer below -- I am just not used to seeing things expressed this way.

joriki
  • 238,052
Nurmister
  • 335

2 Answers

2

Indeed, the term means what you suspected in the last paragraph of your edit to the question.

See https://en.wikipedia.org/wiki/Uniform_distribution_(continuous).

joriki
  • 238,052
  • Thank you for the response, and link. I have added my edit as an answer below. Out of curiosity, would one refer to my aforementioned "deterministic arrangement" as an arrangement of the sample points such that there is a common, constant density of them throughout the volume of the hyperball? – Nurmister Aug 19 '18 at 12:56
  • @Nurmister: No, I wouldn't call that a "density", that usually refers to something continuous, not to a discrete set of points. I'd call that a regular grid, or more formally, if it has the required properties, a lattice (see https://en.wikipedia.org/wiki/Lattice_(discrete_subgroup), https://en.wikipedia.org/wiki/Bravais_lattice). You might also find these relevant: https://math.stackexchange.com/questions/64235, https://math.stackexchange.com/questions/425782, https://en.wikipedia.org/wiki/Quasi-Monte_Carlo_method, https://en.wikipedia.org/wiki/Low-discrepancy_sequence. – joriki Aug 19 '18 at 13:22
  • Ah, I think I meant a uniform grid of points which would render a constant, common sampling density through the volume of the hyperball. Thanks again. – Nurmister Aug 20 '18 at 12:37
1

As mentioned by joriki, a "uniform distribution of points" here means that each point is equally likely to be at any location in the hyperball, independently of the other points. It does not mean that the points are arranged on a regular grid, as joriki has put it (what I had been calling a constant sampling density throughout the hyperball).

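This fact is used in the derivation of the equation, as found here. In my own words: the probability that all $N$ points are at least distance $r$ away from the origin of the hyperball is $\left(\frac{\text{Volume of unit hyperball} - \text{volume of sub-hyperball with radius } r}{\text{Volume of unit hyperball}}\right)^N = (1 - r^p)^N$, since the volume of a $p$-dimensional ball of radius $r$ is $r^p$ times the volume of the unit ball. Setting this probability equal to $\frac{1}{2}$ and solving for $r$ gives the median distance $d(p, N)$.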
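To see the randomness concretely, here is a minimal simulation sketch (my own illustration, not from the book). It assumes the standard trick of drawing a uniform point in the ball by normalising a Gaussian vector and scaling it by $U^{1/p}$ with $U$ uniform on $[0, 1]$; the function names are made up for this example. Each trial draws $N$ independent points and records the distance of the closest one, so the closest distance comes out different from trial to trial, and its sample median should land near $d(p, N)$.

```python
import math
import random
import statistics


def median_closest_distance(p, n):
    """Closed form d(p, N) = (1 - (1/2)**(1/N))**(1/p) from the question."""
    return (1 - 0.5 ** (1 / n)) ** (1 / p)


def sample_point_in_unit_ball(p, rng):
    """Draw one point uniformly from the p-dimensional unit ball."""
    # A normalised Gaussian vector has a uniformly distributed direction.
    g = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(x * x for x in g))
    # Scaling by U**(1/p) makes P(radius <= r) equal to r**p.
    radius = rng.random() ** (1 / p)
    return [radius * x / norm for x in g]


def simulate_closest_distances(p, n, trials, seed=0):
    """For each trial, draw n independent points and keep the smallest norm."""
    rng = random.Random(seed)
    closest = []
    for _ in range(trials):
        distances = (
            math.sqrt(sum(x * x for x in sample_point_in_unit_ball(p, rng)))
            for _ in range(n)
        )
        closest.append(min(distances))
    return closest


if __name__ == "__main__":
    p, n, trials = 2, 20, 2000
    closest = simulate_closest_distances(p, n, trials)
    print("formula median  :", median_closest_distance(p, n))
    print("simulated median:", statistics.median(closest))
    print("min / max closest distance over trials:", min(closest), max(closest))
```

The spread between the smallest and largest recorded values is the point: with independent uniform draws the closest distance is a random variable, which is why the book quotes its median rather than a single fixed value. For the book's setting $p = 10$, $N = 500$, the closed form evaluates to roughly 0.52.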

Nurmister
  • 335