Say I have a large number of sets (on the order of ~1000) with a smaller number of potential entries (~200), and a widely varying number of entries per set. An example:

$s_1 = \{1, 42, 133\}$

$s_2 = \{27, 283, 292, 172, 66, 62\}$

$s_3 = \{1, 42, 292, 66\}$

$...$

$s_{1000} = \{1, 133, 72\}$

Is there an algorithm more efficient than Monte-Carlo / brute force to find the minimum set of sets that I need to intersect to get a result containing exactly one specific item?

For example, given the above four sets, intersecting $s_1, s_3,$ and $s_{1000}$ yields a result containing only the element $1$ (in fact, $s_3 \cap s_{1000}$ alone already suffices). The algorithm would need to find a smallest such solution, for arbitrary elements appearing in at least one set.

I have a feeling that this could be transformed into a set packing problem (which would make it NP-hard), but I am not an expert on this topic. Any input would be appreciated.

malexmave

1 Answer

Your problem (or rather, its decision version: are there $k$ sets whose intersection is a singleton) is NP-complete, so I doubt there is a simple solution that always works, even for your numbers.
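For reference, here is a minimal sketch of the exact brute-force baseline for the optimization version, trying collections of increasing size $k$. The function name `min_sets_isolating` and the list-of-sets representation are assumptions made for illustration; the running time is exponential in the worst case, in line with the hardness result.

```python
from itertools import combinations

def min_sets_isolating(sets, target):
    """Return indices of a smallest collection of sets whose intersection
    is exactly {target}, or None if no such collection exists."""
    # Only sets that contain the target can take part in the intersection.
    candidates = [i for i, s in enumerate(sets) if target in s]
    # Try collections of increasing size, so the first hit is a smallest one.
    for k in range(1, len(candidates) + 1):
        for combo in combinations(candidates, k):
            common = set.intersection(*(sets[i] for i in combo))
            if common == {target}:
                return combo
    return None

# The four example sets from the question (s_1, s_2, s_3, s_1000).
example = [
    {1, 42, 133},
    {27, 283, 292, 172, 66, 62},
    {1, 42, 292, 66},
    {1, 133, 72},
]
print(min_sets_isolating(example, 1))
```

On the four example sets this returns the indices `(2, 3)`, i.e. $s_3$ and $s_{1000}$. Restricting the search to sets that contain the target is safe, because any set without it would remove it from the intersection.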

The problem is clearly in NP. We show that it is NP-hard by reduction from SAT. Let $\varphi$ be a CNF formula on variables $x_1,\ldots,x_n$ with clauses $C_1,\ldots,C_m$. We form an instance of your problem on the universe $x_1,\ldots,x_n,C_1,\ldots,C_m,\Delta$. For every $x_i$ and every truth value $t$, there is a set $S_i^t$ which contains (1) all elements of $\{x_1,\ldots,x_n\} \setminus \{x_i\}$, (2) all clauses not satisfied by $x_i=t$, and (3) $\Delta$.

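All sets contain $\Delta$. In order to get rid of all $x_i$, we need to choose at least one set $S_i^t$ for each $i$, so a choice of $n$ sets must pick exactly one truth value per variable, i.e. a truth assignment. A clause $C_j$ drops out of the intersection iff some chosen set omits it, that is, iff some chosen $x_i=t$ satisfies it. The intersection therefore consists only of $\Delta$ if and only if the choice corresponds to a satisfying assignment. Hence there are $n$ sets whose intersection is a singleton iff $\varphi$ is satisfiable.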
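The reduction can be sketched in code as follows. This is only an illustration under stated assumptions: clauses are encoded as lists of signed integers (`i` for $x_i$, `-i` for $\neg x_i$), and the names `build_instance`, `find_singleton_choice`, and `DELTA` are hypothetical helpers, not part of any library. The checker simply tries all choices of $k$ sets, so it is exponential.

```python
from itertools import combinations

DELTA = "Delta"  # the special element contained in every set

def build_instance(n, clauses):
    """Build the sets S_i^t for a CNF formula with variables 1..n.
    Each clause is a list of signed ints: i means x_i, -i means not x_i."""
    sets = {}
    for i in range(1, n + 1):
        for t in (True, False):
            literal = i if t else -i
            # (1) every variable element except x_i
            s = {("x", j) for j in range(1, n + 1) if j != i}
            # (2) every clause NOT satisfied by x_i = t
            s |= {("C", c) for c, clause in enumerate(clauses)
                  if literal not in clause}
            # (3) the special element
            s.add(DELTA)
            sets[(i, t)] = s
    return sets

def find_singleton_choice(sets, k):
    """Return k keys whose sets intersect to exactly {DELTA}, or None."""
    for combo in combinations(sets, k):
        if set.intersection(*(sets[key] for key in combo)) == {DELTA}:
            return combo
    return None

# (x1 or x2) and (not x1 or x2) and (x1 or not x2): satisfiable.
sat = build_instance(2, [[1, 2], [-1, 2], [1, -2]])
print(find_singleton_choice(sat, 2))

# (x1) and (not x1): unsatisfiable.
unsat = build_instance(1, [[1], [-1]])
print(find_singleton_choice(unsat, 1))
```

For the satisfiable formula the returned keys `((1, True), (2, True))` read off the assignment $x_1 = x_2 = \text{true}$; for the unsatisfiable one no choice of $n$ sets works and the result is `None`.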

Yuval Filmus