Is there a good data structure for storing overlapping sets? Suppose I have multiple sets that can overlap in various ways, and I would like to store them in a memory-efficient way that still allows efficient access.

Example:
A = {a, b, c, d}
B = {a, b, c}
C = {b, c, d}
D = {a, b, c, d, e}

Storage:
W = {e}
X = {b, c}
Y = {d}
Z = {a}

B = Z union X
A = B union Y
C = X union Y
D = A union W
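
One way to arrive at a storage layout like the one above is to group elements by which sets contain them. The following is only a minimal sketch under assumptions: `decompose` and `rebuild` are hypothetical helper names, D is taken to be A union W = {a, b, c, d, e} as the Storage lines define it, and the result is a flat list of disjoint blocks rather than the chained unions (A = B union Y) shown above.

```python
from collections import defaultdict

def decompose(sets):
    """Split named sets into disjoint blocks so each element is stored once.

    Elements that belong to exactly the same collection of sets share a block.
    (Hypothetical helper, not taken from the question.)
    """
    # Map each element to the names of the sets that contain it.
    membership = defaultdict(set)
    for name, items in sets.items():
        for item in items:
            membership[item].add(name)

    # Group elements whose membership signature is identical.
    blocks_by_signature = defaultdict(set)
    for item, names in membership.items():
        blocks_by_signature[frozenset(names)].add(item)

    # Record each set as the list of block indices it is made of.
    blocks = []
    recipe = {name: [] for name in sets}
    for index, (signature, block) in enumerate(blocks_by_signature.items()):
        blocks.append(frozenset(block))
        for name in signature:
            recipe[name].append(index)
    return blocks, recipe

def rebuild(name, blocks, recipe):
    """Reconstruct one set as the union of its blocks."""
    return set().union(*(blocks[i] for i in recipe[name]))

# Assumption: D = A union W, i.e. {a, b, c, d, e}.
sets = {
    "A": set("abcd"),
    "B": set("abc"),
    "C": set("bcd"),
    "D": set("abcde"),
}
blocks, recipe = decompose(sets)
```

For this input the blocks are exactly W, X, Y and Z. With many sets that differ by only a few items, the number of distinct blocks can grow large, so whether this saves memory depends on the data.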
  • If the universe is 64 items or less, it's convenient and very fast to use bits in an integer; you can even extract the "first item" (lowest-order 1-bit) in constant time using x - (x & (x - 1)). In most other cases, a hashtable/dictionary is most convenient, though sorted arrays/lists enable intersection and union in linear time via list merge and have lower space overhead. – j_random_hacker Feb 29 '20 at 23:06
  • The universe is much, much larger. Consider 100 thousand items and 20-50 thousand sets with 2-1000 items each.

    Many of these sets are just another set plus a few (sometimes only one) additional items.

    My end goal is to store these sets in as little memory as possible.

    – Alexander Weps Feb 29 '20 at 23:19
  • For small size, look into succinct data structures. – j_random_hacker Mar 01 '20 at 01:08
  • Please edit your question to define what operations you want to perform, what criteria you want to use to evaluate efficiency, and what approaches you've already considered and why you've rejected them. – D.W. Mar 01 '20 at 01:32
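
For reference, the two representations mentioned in the first comment can be sketched roughly as follows. This is an illustration only: the function names are hypothetical, the bit trick is the expression quoted in that comment, and the merges assume both inputs are sorted lists without duplicates.

```python
def lowest_set_bit(x):
    """Isolate the lowest-order 1-bit of a non-negative integer bitset.

    x & (x - 1) clears the lowest set bit, so subtracting it from x
    leaves only that bit. Returns 0 for an empty bitset.
    """
    return x - (x & (x - 1))

def sorted_union(a, b):
    """Union of two sorted, duplicate-free lists in O(len(a) + len(b))."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            out.append(a[i])
            i += 1
        elif a[i] > b[j]:
            out.append(b[j])
            j += 1
        else:
            # Same item in both lists: keep a single copy.
            out.append(a[i])
            i += 1
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out

def sorted_intersection(a, b):
    """Intersection of two sorted, duplicate-free lists in linear time."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif a[i] > b[j]:
            j += 1
        else:
            out.append(a[i])
            i += 1
            j += 1
    return out
```

On integer bitsets, union and intersection are simply `x | y` and `x & y`; the sorted-list versions trade that speed for a universe of any size.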

0 Answers