I'm trying to understand the DHP (Direct Hashing and Pruning) algorithm, and I'm stuck on how the modulus is chosen. The paper gives an example hash function on page 7: h({x, y}) = ((order of x) * 10 + (order of y)) mod 7
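To make the formula concrete, here is a minimal sketch in Python. The item ordering (A=1 through E=5) and the name `dhp_hash` are assumptions for illustration only, not taken from the paper.

```python
def dhp_hash(order_x: int, order_y: int, buckets: int = 7) -> int:
    """Hash a 2-itemset {x, y} from the orders of its items."""
    return (order_x * 10 + order_y) % buckets

# Hypothetical item ordering, assumed for this example.
order = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}

# {A, C}: (1 * 10 + 3) mod 7 = 13 mod 7 = 6
print(dhp_hash(order["A"], order["C"]))
```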

My questions are:

  1. What is the basis for defining the function this way?
  2. How is the modulus selected (7, in this example)?

flamenco

1 Answer

  1. One possible reason for defining the function this way is that it works for their example. It is an academic paper, so the authors are incentivized to present a novel solution that works on a specific dataset. In the paper, that dataset is synthetic.

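  2. When building hash tables, it is common practice to pick a prime modulus (7 in this case) to reduce the number of collisions when using the modulo operator. A prime shares no common factor with regular patterns in the keys, such as the multiplier 10 here. The fact that it is a small prime is a clue that they are building a toy system.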
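As a rough illustration of the second point, the sketch below hashes every 2-itemset over five items with the formula from the question and compares a modulus of 7 against a modulus of 10. The five-item universe and the helper name `bucket_loads` are assumptions made for this example, not something from the paper.

```python
from collections import Counter
from itertools import combinations

def bucket_loads(num_items: int, buckets: int) -> Counter:
    """Count how many 2-itemsets {x, y} (x < y) land in each bucket."""
    loads = Counter()
    for x, y in combinations(range(1, num_items + 1), 2):
        loads[(x * 10 + y) % buckets] += 1
    return loads

for modulus in (7, 10):
    loads = bucket_loads(5, modulus)
    # Report buckets actually used and the size of the fullest bucket.
    print(modulus, len(loads), max(loads.values()))
```

With a modulus of 10, the hash reduces to the order of y alone, so the ten pairs crowd into four buckets. With 7, they spread across all seven buckets with at most two pairs each. This is a single small case and not a general proof, but it shows why a modulus sharing a factor with the multiplier is a poor choice.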

Brian Spiering