2

I'm considering a hash function for a tuple of bounded integers, for example (3, 2, 5), where each element of the tuple has some maximum value N.

My plan is to treat the elements of the tuple as the digits of a number in base N, so that in the trivial case of N = 10 the tuple (3, 2, 5) hashes to 325, or to 213 if N = 8 (3·64 + 2·8 + 5).
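
To make the plan concrete, here is a minimal sketch of the base-N idea in Python. The function name `base_n_hash` is only illustrative, and the sketch assumes every element is strictly less than the base; if N is meant as an inclusive maximum, the base would need to be N + 1 for distinct tuples to map to distinct values.

```python
def base_n_hash(values, n):
    """Interpret `values` as the digits of a base-n number, most significant first."""
    h = 0
    for v in values:
        # Assumption: each element is a valid digit in base n (0 <= v < n).
        if not 0 <= v < n:
            raise ValueError(f"element {v} is not a valid digit in base {n}")
        h = h * n + v  # shift the accumulated digits left by one place, then add v
    return h


print(base_n_hash((3, 2, 5), 10))  # 325
print(base_n_hash((3, 2, 5), 8))   # 213
```

Python integers do not overflow, so this sketch does not show the wraparound that a fixed-width integer type would produce for long tuples or large N.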

(1) Is this hashing technique flawed? (2) If not, does it have a name? (3) Are there better ways to hash a tuple of integers?

Olumide
  • 251

1 Answer

3

While this will work and in some situations will give good results, there are two dangers with this approach:

  • Considering a tuple of M values where each has maximum N, your calculation of the hash value will overflow if N^M exceeds your integer type's limit, since the largest hash is on the order of N^M. If either N or M is large this is an issue, which you can rectify by using a smaller multiplier. That will, however, perform worse for smaller N and M, so you should pick your approach depending on how likely overflow is.
  • If you are using the hash codes for storage in a hash table, note that your hash code will be taken modulo the size of the table. This means that if your N shares a factor with the table size and your individual values are not evenly distributed, you are more likely to see collisions. Ensuring your multiplier is coprime with the table size is the best approach; the easy ways of doing this are to pick either a prime table size or a prime multiplier in your hash function. As most modern hash table implementations automatically choose sizes that are not typically prime, you should usually use a prime in your hash function. One approach would be to keep a list of primes and pick the smallest one >= N. I think this would solve the issue.
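
The second point can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the names `next_prime`, `tuple_hash` and `buckets_used` are made up for this example, the 64-bit mask only emulates fixed-width wraparound (Python integers do not overflow), and the table size of 16 is assumed as a typical power-of-two size.

```python
from itertools import product

MASK64 = (1 << 64) - 1  # emulate wraparound of a 64-bit unsigned integer


def next_prime(n):
    """Return the smallest prime >= n (trial division; fine for small n)."""
    candidate = max(n, 2)
    while True:
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            return candidate
        candidate += 1


def tuple_hash(values, multiplier):
    """Polynomial hash: like the base-N scheme, but with an arbitrary multiplier."""
    h = 0
    for v in values:
        h = (h * multiplier + v) & MASK64
    return h


def buckets_used(tuples, multiplier, table_size):
    """Count how many distinct table slots the given tuples land in."""
    return len({tuple_hash(t, multiplier) % table_size for t in tuples})


# Skewed input: N = 8 and the last element is always 0.
skewed = [(a, b, 0) for a, b in product(range(8), repeat=2)]

print(buckets_used(skewed, 8, 16))              # 2: multiplier shares a factor with 16
print(buckets_used(skewed, next_prime(8), 16))  # 16: multiplier 11 is coprime with 16
```

With the multiplier equal to N = 8, the 64 skewed tuples fall into only 2 of the 16 slots, because every term except 8·b vanishes modulo 16. With the prime 11 they spread over all 16 slots.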
Jules
  • 17,754