
In brute-force nearest-neighbour search (where x is the query point and n is the total number of points):

  • Calculate the distance between x and every other point: O(n)
  • Compare all these distances to find the minimum: O(n)
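
To make the two steps concrete, here is a minimal sketch of what I mean by brute force. It assumes Euclidean distance and points stored as plain tuples; the name `brute_force_nn` and the sample data are made up for illustration.

```python
import math

def brute_force_nn(x, points):
    """Return (index, distance) of the point in `points` closest to x."""
    # Step 1: distance from x to every point, n distance computations.
    distances = [math.dist(x, p) for p in points]
    # Step 2: scan the n distances for the minimum.
    best = min(range(len(points)), key=distances.__getitem__)
    return best, distances[best]

# Hypothetical sample data, only for illustration.
points = [(0.0, 0.0), (5.0, 5.0), (1.0, 1.0), (-3.0, 2.0)]
idx, d = brute_force_nn((0.9, 1.2), points)
```

Both steps touch all n points for every query, which is where the O(n) comes from.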

With LSH:

  • Calculate the hash value of each point and construct a hash map: O(n)
  • Calculate the hash value h of x, then look up the other points stored against h in the hash map: O(1)
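
The LSH steps above could be sketched roughly as follows. This assumes one particular hash family (the sign of the dot product with random hyperplanes) and a single hash table, which may not match what a real library does; `make_hash`, `build_index` and `query` are hypothetical names.

```python
import random
from collections import defaultdict

def make_hash(dim, n_bits, seed=0):
    """Build a hash function from n_bits random hyperplanes (assumed family)."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def h(p):
        # One bit per hyperplane: which side of the plane p lies on.
        return tuple(sum(w * v for w, v in zip(plane, p)) >= 0.0
                     for plane in planes)
    return h

def build_index(points, h):
    """Step 1: hash every point once and group indices by hash value."""
    table = defaultdict(list)
    for i, p in enumerate(points):
        table[h(p)].append(i)
    return table

def query(x, table, h):
    """Step 2: hash x and return the indices stored in the same bucket."""
    return table.get(h(x), [])

# Hypothetical sample data, only for illustration.
points = [(0.0, 1.0), (5.0, 5.0), (1.0, 1.0), (-3.0, 2.0)]
h = make_hash(dim=2, n_bits=4)
table = build_index(points, h)
candidates = query(points[2], table, h)
```

Note that `build_index` never looks at x, and `query` only returns the candidates that share a bucket with x.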

Overall, it seems it is still O(n). Which factors actually make LSH more efficient? I can think of the following two:

  • The first step in both approaches is done with matrix multiplications, and because of parallelisation it isn't really O(n) in practice but faster than that. So the bottleneck becomes the second step, which is constant time with LSH.
  • The first step of LSH does not depend on the query point x and hence can be precomputed and stored once. With precomputation, the approximate algorithm needs only constant time per query and hence is more efficient.

Are there any other factors that make LSH nearest-neighbour search efficient?

More importantly, is my understanding correct? Or is there something very simple that I am missing here?
