Why Is KD-Tree-based Nearest Neighbor Exponential in K?

Question

I've read in many papers on higher-dimensional nearest neighbor search that KD-Trees are exponential in K, but I can't seem to determine why.

What I'm looking for is a solid runtime-complexity analysis which explains this aspect of the problem.

Quick thought is that k is effectively the dimension of the problem and so it suffers from the "curse of dimensionality." — Michael Klein, Feb 21 '16 at 06:36

TilmannZ · Answer 1 · 2016-02-23T08:17:49.327

Nearest-neighbor search on a kD-tree tends to be exponential in $k$ (here $k$ is the number of dimensions, not the number of neighbors) because the search space increases with $2^k$. Imagine you partition the space around your search point into quadrants. For $k=1$ you just have to search two 'quadrants' (higher and lower values), for $k=2$ it's 4 quadrants, and for $k=3$ it's 8 quadrants, i.e. exponential growth of the search space. That is what the kD-tree suffers from: in the worst case it has to search on the order of $2^k$ sub-branches.
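To make the growth visible, here is a minimal sketch of a kD-tree with a counter for the nodes touched during a nearest-neighbor query. The names `build`, `nearest`, and `average_visits` are made up for this illustration, and the data is assumed to be uniformly random in the unit cube; real datasets and production implementations may behave differently.

```python
import random

def build(points, depth=0):
    """Build a kD-tree as nested tuples (point, left, right), cycling the split axis."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build(points[:mid], depth + 1),
            build(points[mid + 1:], depth + 1))

def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, query, depth=0, best=None, stats=None):
    """Return the point nearest to query; stats['visited'] counts touched nodes."""
    if node is None:
        return best
    point, left, right = node
    if stats is not None:
        stats["visited"] += 1
    if best is None or sq_dist(point, query) < sq_dist(best, query):
        best = point
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, depth + 1, best, stats)
    # The far side can only matter if the splitting plane is closer
    # to the query than the best point found so far.
    if diff ** 2 < sq_dist(best, query):
        best = nearest(far, query, depth + 1, best, stats)
    return best

def average_visits(k, n=2000, queries=50, seed=0):
    """Average number of nodes touched per query for n random points in k dimensions."""
    rng = random.Random(seed)
    pts = [tuple(rng.random() for _ in range(k)) for _ in range(n)]
    tree = build(pts)
    total = 0
    for _ in range(queries):
        q = tuple(rng.random() for _ in range(k))
        stats = {"visited": 0}
        nearest(tree, q, stats=stats)
        total += stats["visited"]
    return total / queries

if __name__ == "__main__":
    for k in (1, 2, 4, 8, 16):
        print(k, average_visits(k))
```

Running it prints the average number of visited nodes for each dimension. With a fixed number of points, the pruning test in `nearest` succeeds less and less often as $k$ grows, so the count moves from roughly the tree depth toward the total number of points.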

Other trees perform much better, for example the CoverTree. I also found that the PH-Tree works quite well; it seems to consistently take twice as long as the CoverTree for datasets between $k=8$ and $k=27$ (I didn't have datasets with higher $k$).

Note that you can use LaTeX here to typeset mathematics in a more readable way. See here for a short introduction. — Raphael, Feb 22 '16 at 13:41
