Let's say I have a hash table with 10 buckets, where each bucket holds some number of bits, but the buckets don't all have to hold the same number of bits. Is it possible to achieve this? Or does it require padding every entry to the same width?

For example:

Index 0 has 12 bits. Index 6 has 5 bits. Index 1-5, 7-9 are empty.

Does this imply I must have an array of 12 * 10 = 120 bits, just so that I can use the modulus to index into a specific bucket? Or is it possible to store the above in something much smaller, i.e. 17 bits?

Terence Chow
  • Are you familiar with pointers and the heap? – Yuval Filmus Feb 03 '19 at 17:44
  • Yeah, I guess I kind of knew the answer before I even asked, but I was wondering whether there might be some compressed hash table data structure or some trick to achieve this. In databases with variable-length strings, I believe they use a table that keeps track of each record's length, although that adds the size of a pointer plus the length to every record, so it's already suboptimal. – Terence Chow Feb 03 '19 at 18:01
  • In typical databases, the "size" of a hash table bucket is measured in pages (disk or virtual memory). But it's not uncommon in some scenarios to compress entries at the bit level. Look up inverted index compression, for example. – Pseudonym Feb 03 '19 at 22:55
  • @TerenceChow there is a trick: if you stripe a hash table into small chunks/segments, the "heap pointers" can be much smaller than 8 bytes, sometimes as small as 1 byte. – leventov Feb 06 '19 at 11:46
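
To make the trade-off discussed in the comments concrete, here is a minimal sketch of one possible layout: the bucket payloads are stored back to back in a single bit string, and a small table of bit offsets replaces the padding. The class `PackedBuckets`, its method names, and the sample values `0xABC` and `0b10101` are all hypothetical, chosen only to match the 12-bit and 5-bit buckets in the question; this is not a method proposed by anyone in the thread. The offset table is itself overhead, so the total lands between 17 and 120 bits rather than at 17.

```python
class PackedBuckets:
    """Variable-width buckets stored back to back in one Python int used as a bit string."""

    def __init__(self, entries):
        # entries: one (value, width_in_bits) pair per bucket; width 0 marks an empty bucket.
        self.offsets = [0]   # offsets[i] is the bit position where bucket i starts
        self.bits = 0        # all payload bits, bucket 0 in the lowest positions
        for value, width in entries:
            if value >> width:
                raise ValueError("value does not fit in the given width")
            self.bits |= value << self.offsets[-1]
            self.offsets.append(self.offsets[-1] + width)

    def num_buckets(self):
        return len(self.offsets) - 1

    def get(self, key):
        # The modulus picks the bucket; the offset table locates its bits.
        i = key % self.num_buckets()
        width = self.offsets[i + 1] - self.offsets[i]
        if width == 0:
            return None
        return (self.bits >> self.offsets[i]) & ((1 << width) - 1)

    def payload_bits(self):
        return self.offsets[-1]

    def offset_table_bits(self):
        # Assumption: every offset is stored with the same fixed width,
        # just wide enough to hold the largest offset.
        per_offset = max(1, self.offsets[-1].bit_length())
        return per_offset * len(self.offsets)


# The example from the question: bucket 0 holds 12 bits, bucket 6 holds 5 bits.
entries = [(0, 0)] * 10
entries[0] = (0xABC, 12)
entries[6] = (0b10101, 5)
table = PackedBuckets(entries)

print(table.get(0) == 0xABC, table.get(6) == 0b10101, table.get(3))
print(table.payload_bits(), table.offset_table_bits())
```

Under these assumptions the payload is 17 bits and the offset table is 11 offsets of 5 bits each, so 72 bits in total instead of 120. Splitting a large table into small segments, as suggested above, keeps each offset narrow in the same way.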

0 Answers