Number of ways to write set $S$ as union of $l$ unique $k$-subsets

Question

The $k$-subsets can, of course, overlap. The answer should be in terms of $k,l$ and $|S|=n$. I have a very complicated solution involving inclusion-exclusion in mind, so I thought I'd post here first to see if anyone can think of a simpler answer.

Additionally, is it a collection of $l$ subsets, or an ordered list of $l$ subsets? Finally, do the $l$ subsets have to be distinct? — Caleb Stanford, Dec 01 '16 at 23:20
@6005 Yes, collection, yes. You can just divide/multiply by $l!$ anyway for the collection issue — Elliot Gorokhovsky, Dec 01 '16 at 23:46

Marko Riedel · Accepted Answer · 2024-01-11T21:14:37.280

Using the Polya Enumeration Theorem and the unlabeled set operator $\def\textsc#1{\dosc#1\csod} \def\dosc#1#2\csod{{\rm #1{\small #2}}}\textsc{SET}$ we obtain the generating function

$$Z(P_l)([z^k] \prod_{q=1}^n (1+z A_q)).$$

The cycle index here is evaluated with the rule

$$a_d = [z^k] \prod_{q=1}^n (1+z A_q^d).$$

Now we need to remove those terms from the generating function that are missing some of the $n$ elements represented by the $A_q,$ and this is done by inclusion-exclusion. We will subtract from the generating function those terms that have one or more elements missing, then add those with two or more and so on. The remaining terms in the generating function are set to one. For $p$ elements missing we set the corresponding variables to zero and the substitution becomes

$$a_d = [z^k] (1+z)^{n-p} = {n-p\choose k}.$$

Given that

$$Z(P_l) = [w^l] \exp\left(\sum_{d\ge 1} (-1)^{d+1} a_d \frac{w^d}{d}\right)$$

the substituted cycle index becomes

$$[w^l] \exp\left({n-p\choose k} \sum_{d\ge 1} (-1)^{d+1} \frac{w^d}{d}\right) \\ = [w^l] \exp\left({n-p\choose k} \log(1+w)\right) = [w^l] (1+w)^{n-p\choose k} \\ = {{n-p\choose k}\choose l} = \frac{1}{l!} {n-p\choose k}^{\underline{l}}.$$

Here we have chosen to expand the second binomial coefficient to make the formula easier to read. Inclusion-exclusion now yields for the answer

$$\bbox[5px,border:2px solid #00A000]{ \frac{1}{l!} \sum_{p=0}^n {n\choose p} (-1)^p {n-p\choose k}^{\underline{l}}.}$$
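For readers who want to experiment, here is a minimal Python sketch of the boxed formula. The names `falling` and `count_covers` are my own labels and not part of the answer; only the standard library is assumed.

```python
from math import comb, factorial


def falling(m: int, l: int) -> int:
    """Falling factorial m * (m - 1) * ... * (m - l + 1)."""
    result = 1
    for q in range(l):
        result *= m - q
    return result


def count_covers(n: int, k: int, l: int) -> int:
    """Evaluate (1/l!) * sum_p (-1)^p C(n,p) * C(n-p,k) falling l."""
    total = sum(
        (-1) ** p * comb(n, p) * falling(comb(n - p, k), l)
        for p in range(n + 1)
    )
    # Each falling factorial is l! times a binomial coefficient,
    # so the integer division below is exact.
    return total // factorial(l)


if __name__ == "__main__":
    print(count_covers(3, 2, 2))  # 3
    print(count_covers(4, 2, 3))  # 16
```

The value 16 for $n=4, k=2, l=3$ can be confirmed by hand: of the ${6\choose 3}=20$ triples of $2$-subsets of $[4]$, exactly the four "triangles" on three elements miss an element.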

In terms of operators we have treated the combinatorial class

$$\textsc{SET}_{=l}(\textsc{SET}_{=k}( \mathcal{A}_1+\mathcal{A}_2+\cdots+\mathcal{A}_n)).$$
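The substitution step can also be checked numerically. The sketch below (Python, with a function name of my own choosing) runs the recurrence $l\, Z(P_l) = \sum_{d=1}^l (-1)^{d-1} a_d Z(P_{l-d})$, which is the one `pet_cycleind_set` in the Maple code uses, with every $a_d$ replaced by the same integer $m$, and compares the result with ${m\choose l}$.

```python
from fractions import Fraction
from math import comb


def set_cycle_index_at_constant(l: int, m: int) -> Fraction:
    """Value of Z(P_l) when every a_d is replaced by the same number m."""
    z = [Fraction(1)]  # Z(P_0) = 1
    for j in range(1, l + 1):
        acc = sum(
            (-1) ** (d - 1) * m * z[j - d]
            for d in range(1, j + 1)
        )
        z.append(Fraction(acc) / j)
    return z[l]


if __name__ == "__main__":
    # Compare with the binomial coefficient C(m, l) on a small grid.
    for m in range(8):
        for l in range(6):
            assert set_cycle_index_at_constant(l, m) == comb(m, l)
    print("substituted cycle index agrees with C(m, l) on the tested range")
```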

Here is the Maple code to compute these values as a means of clarifying the interpretation of the problem that was used. (Warning -- total enumeration only practicable for small configurations. The latter routine was deliberately left unoptimized to represent the problem statement before processing.)

pet_cycleind_set :=
proc(n)
option remember;
if n=0 then return 1; fi;

expand(1/n*add((-1)^(l-1)*a[l]*
               pet_cycleind_set(n-l), l=1..n));

end;
pet_varinto_cind :=
proc(poly, ind)
local subs1, subs2, polyvars, indvars, v, pot, res;
res := ind;

polyvars := indets(poly);
indvars := indets(ind);

for v in indvars do
    pot := op(1, v);

    subs1 :=
    [seq(polyvars[k]=polyvars[k]^pot,
         k=1..nops(polyvars))];

    subs2 := [v=subs(subs1, poly)];

    res := subs(subs2, res);
od;

res;

end;
X_CIND :=
proc(n, k, l)
    option remember;
    local gf, gfA, src, idx, term, res;
src := add(A[q], q=1..n);

idx := pet_cycleind_set(k);
gf := expand(pet_varinto_cind(src, idx));

idx := pet_cycleind_set(l);
gfA := expand(pet_varinto_cind(gf, idx));

res := 0;

for term in gfA do
    if nops(indets(term)) = n then
        res := res + term;
    fi;
od;

subs([seq(A[q]=1, q=1..n)], res);

end;
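# Closed form with the nested binomial coefficient.
X :=
(n, k, l) ->
add(binomial(n,p)*(-1)^p*binomial(binomial(n-p,k), l),
    p=0..n);
# Same value, written with the falling factorial divided by l!.
X2 :=
(n, k, l) ->
add(binomial(n,p)*(-1)^p*mul(binomial(n-p,k)-q, q=0..l-1),
    p=0..n)/l!;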
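For those without Maple, the same comparison can be sketched in Python. This is a direct enumeration of all $l$-collections of $k$-subsets rather than the cycle-index computation in `X_CIND`, and the names `brute_force` and `formula` are mine. As with the Maple routine, it is only practicable for small parameters.

```python
from itertools import combinations
from math import comb


def brute_force(n: int, k: int, l: int) -> int:
    """Count sets of l distinct k-subsets of {0..n-1} whose union is everything."""
    ground = frozenset(range(n))
    k_subsets = [frozenset(c) for c in combinations(range(n), k)]
    return sum(
        1
        for family in combinations(k_subsets, l)
        if frozenset().union(*family) == ground
    )


def formula(n: int, k: int, l: int) -> int:
    """Inclusion-exclusion formula with the nested binomial coefficient."""
    return sum(
        (-1) ** p * comb(n, p) * comb(comb(n - p, k), l)
        for p in range(n + 1)
    )


if __name__ == "__main__":
    for n in range(1, 6):
        for k in range(1, n + 1):
            for l in range(1, 5):
                assert brute_force(n, k, l) == formula(n, k, l)
    print("formula matches enumeration for n <= 5, l <= 4")
```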

Sanity check I Dec 3 2016. When $l=1$ we should get just one possibility when $k=n$ and zero otherwise. To verify this we start from

$$\sum_{p=0}^n {n\choose p} (-1)^p {n-p\choose k}$$

and observe that

$${n\choose p} {n-p\choose k} = \frac{n!}{p! k! (n-p-k)!} = {n\choose k} {n-k\choose p}$$

which yields for the sum

$${n\choose k} \sum_{p=0}^n {n-k\choose p} (-1)^p.$$

We may certainly lower the upper limit to $n-k$ as the inner binomial coefficient is zero when $n-k\lt p \le n$ and get

$${n\choose k} \sum_{p=0}^{n-k} {n-k\choose p} (-1)^p.$$

When $k=n$ this evaluates to

$${n\choose n} [z^0] (1+z)^0 = 1$$

as claimed. Furthermore when $k\lt n$ we get

$${n\choose k} (-1+1)^{n-k} = 0$$

which confirms the sanity check. We could also treat this with the Egorychev method, introducing

$${n-p\choose k} = \frac{1}{2\pi i} \int_{|z|=\epsilon} \frac{1}{z^{k+1}} (1+z)^{n-p} \; dz$$

to get for the sum

$$\frac{1}{2\pi i} \int_{|z|=\epsilon} \frac{1}{z^{k+1}} (1+z)^{n} \sum_{p=0}^n {n\choose p} (-1)^p \frac{1}{(1+z)^p} \; dz \\ = \frac{1}{2\pi i} \int_{|z|=\epsilon} \frac{1}{z^{k+1}} (1+z)^{n} \left(1-\frac{1}{1+z}\right)^n \; dz \\ = \frac{1}{2\pi i} \int_{|z|=\epsilon} \frac{1}{z^{k+1}} z^n \; dz.$$

This is $[z^k] z^n$ which is one when $k=n$ and zero otherwise.
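A quick numerical confirmation of this sanity check, as a sketch in Python (the function names are mine): both the original sum and its factored form should equal one when $k=n$ and zero when $k\lt n$.

```python
from math import comb


def l_equals_one_sum(n: int, k: int) -> int:
    """sum_p (-1)^p C(n,p) C(n-p,k), the formula for l = 1."""
    return sum((-1) ** p * comb(n, p) * comb(n - p, k) for p in range(n + 1))


def factored_form(n: int, k: int) -> int:
    """C(n,k) * sum_{p=0}^{n-k} (-1)^p C(n-k,p), valid for 0 <= k <= n."""
    return comb(n, k) * sum(
        (-1) ** p * comb(n - k, p) for p in range(n - k + 1)
    )


if __name__ == "__main__":
    for n in range(9):
        for k in range(n + 1):
            expected = 1 if k == n else 0
            assert l_equals_one_sum(n, k) == expected
            assert factored_form(n, k) == expected
    print("l = 1 check passed for n <= 8")
```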

Observe furthermore that for $k=1$ the formula produces

$$\sum_{p=0}^n {n\choose p} (-1)^p {n-p\choose l}$$

and we once more get one for $l=n$ and zero otherwise. This is because we cannot cover $[n]$ with singletons if there are fewer than $n$ of them. There is one possibility when $l=n$ (one singleton for each element of $[n]$). There are no admissible configurations when $l\gt n$ because the subsets have to be unique and there are only $n$ different ones available.
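As a small worked instance (my own arithmetic, taking $n=3$ and $k=1$), the sum gives for $l=2$

$$\sum_{p=0}^3 {3\choose p} (-1)^p {3-p\choose 2} = 1\cdot 3 - 3\cdot 1 + 3\cdot 0 - 1\cdot 0 = 0$$

and for $l=3$

$$\sum_{p=0}^3 {3\choose p} (-1)^p {3-p\choose 3} = 1\cdot 1 - 3\cdot 0 + 3\cdot 0 - 1\cdot 0 = 1,$$

in agreement with the count of zero covers by two singletons and one cover by three.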

Sanity check II Dec 3 2016. We can treat the case $l=2$ where we obtain

$$\frac{1}{2}\sum_{p=0}^n {n\choose p} (-1)^p {n-p\choose k} \left({n-p\choose k} - 1\right) \\ = -\frac{1}{2} [[n=k]] + \frac{1}{2}\sum_{p=0}^n {n\choose p} (-1)^p {n-p\choose k}^2 \\ = -\frac{1}{2} [[n=k]] + \frac{1}{2} {n\choose k} \sum_{p=0}^n {n-k\choose p} (-1)^p {n-p\choose k}.$$

Lowering the upper limit to $n-k$ and using the earlier integral (both as before) we get for the sum

$$\frac{1}{2\pi i} \int_{|z|=\epsilon} \frac{1}{z^{k+1}} (1+z)^{n} \sum_{p=0}^{n-k} {n-k\choose p} (-1)^p \frac{1}{(1+z)^p} \; dz \\ = \frac{1}{2\pi i} \int_{|z|=\epsilon} \frac{1}{z^{k+1}} (1+z)^{n} \left(1-\frac{1}{1+z}\right)^{n-k} \; dz \\ = \frac{1}{2\pi i} \int_{|z|=\epsilon} \frac{1}{z^{2k-n+1}} (1+z)^{k} \; dz.$$

Therefore we have the closed form

$$-\frac{1}{2} [[n=k]] + \frac{1}{2} {n\choose k} {k\choose 2k-n}.$$
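The closed form can be compared with the general formula at $l=2$. The following Python sketch does this for small $n$; the function names are mine, and the guard for $2k\lt n$ is there because Python's `math.comb` rejects negative arguments whereas the coefficient extraction simply yields zero in that case.

```python
from math import comb


def general_l2(n: int, k: int) -> int:
    """The inclusion-exclusion formula evaluated at l = 2."""
    return sum(
        (-1) ** p * comb(n, p) * comb(comb(n - p, k), 2)
        for p in range(n + 1)
    )


def closed_form_l2(n: int, k: int) -> int:
    """-(1/2)[[n=k]] + (1/2) C(n,k) C(k, 2k-n)."""
    inner = comb(k, 2 * k - n) if 2 * k >= n else 0
    iverson = 1 if n == k else 0
    # The numerator is even for all 0 <= k <= n, so the division is exact.
    return (comb(n, k) * inner - iverson) // 2


if __name__ == "__main__":
    for n in range(10):
        for k in range(n + 1):
            assert general_l2(n, k) == closed_form_l2(n, k)
    print("closed form for l = 2 matches the general formula for n <= 9")
```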

To count these from combinatorial principles, suppose $q$ elements are common to both sets, where $0\le q\le k$. The union then has $q+2(k-q)$ elements, so we need $q+2(k-q)=n$, or $q=2k-n$, for a contribution of

$$\frac{1}{2} {n\choose q} {n-q\choose k-q} = \frac{1}{2} {n\choose 2k-n} {2n-2k\choose n-k}.$$

Note, however, that this produces $\frac{1}{2} {n\choose n} [z^0] (1+z)^0 = \frac{1}{2}$ when $n=k$, while the correct result is zero: there is no way to choose two unique sets of $n$ elements because only one is available (alternatively, we must have $q\lt n$ because the two sets are unique). So we finally get

$$-\frac{1}{2} [[n=k]] + \frac{1}{2} {n\choose 2k-n} {2n-2k\choose n-k}.$$

To verify that this matches the result from the integral we write

$${n\choose 2k-n} {2n-2k\choose n-k} = \frac{n!}{(2k-n)!(n-k)!(n-k)!} = {n\choose k} {k\choose 2k-n}.$$
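The combinatorial argument can be illustrated by listing the covering pairs directly. In this Python sketch (names are my own) every covering pair is checked to have exactly $2k-n$ common elements, and the number of pairs is compared with the overlap count, including the correction for $n=k$.

```python
from itertools import combinations
from math import comb


def covering_pairs(n: int, k: int):
    """All unordered pairs of distinct k-subsets of {0..n-1} with full union."""
    ground = set(range(n))
    subsets = [set(c) for c in combinations(range(n), k)]
    return [(a, b) for a, b in combinations(subsets, 2) if a | b == ground]


def overlap_count(n: int, k: int) -> int:
    """-(1/2)[[n=k]] + (1/2) C(n, 2k-n) C(2n-2k, n-k), for 1 <= k <= n."""
    q = 2 * k - n
    if q < 0 or n == k:
        return 0
    return comb(n, q) * comb(2 * (n - k), n - k) // 2


if __name__ == "__main__":
    for n in range(1, 7):
        for k in range(1, n + 1):
            pairs = covering_pairs(n, k)
            assert len(pairs) == overlap_count(n, k)
            assert all(len(a & b) == 2 * k - n for a, b in pairs)
    print("pair enumeration matches the overlap count for n <= 6")
```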

DAMMMMM this is the kind of stuff I need to learn how to do. Is there any way to get that sum into a closed form, though? — Elliot Gorokhovsky, Dec 02 '16 at 03:01
Also: I'm really interested in getting a lower bound on the sum. Is it possible to substitute the factorials with Stirling's approximation and then integrate? How would one ensure the integrals are a lower bound on the sum? Or would it even be possible to sum directly? — Elliot Gorokhovsky, Dec 05 '16 at 17:48

Alexander Burstein · Answer 2 · 2023-12-22T06:07:11.660

I don't think an Inclusion-Exclusion solution is so complicated. In fact, it is pretty straightforward.

For each element $x\in S$, define property $P_x$ to be "$x$ does not belong to any subset in a given $l$-collection of $k$-subsets". We want to find the number $e_0$ of $l$-collections of $k$-subsets that satisfy exactly $0$ properties in $\Omega=\{P_x\mid x\in S\}$.

Given an arbitrary subset $T$ of $\Omega$, let $N(\supseteq T)$ be the number of $l$-collections of $k$-subsets that satisfy at least all the properties in $T$. Let $S_T=\{x\mid P_x\notin T\}$; then all $k$-subsets that satisfy at least all properties in $T$ are subsets of $S_T$. The number of $k$-subsets of $S_T$ is

$$ \binom{|S_T|}{k}=\binom{n-|T|}{k}, $$

so the number of $l$-collections of such subsets is

$$ \binom{\binom{n-|T|}{k}}{l}. $$

Let

$$ N_j=\sum_{\substack{T\subseteq\Omega\\|T|=j}} N(\supseteq T). $$

There are $\binom{n}{j}$ subsets $T\subseteq\Omega$ of cardinality $j$, so

$$ N_j=\binom{n}{j}\binom{\binom{n-j}{k}}{l}. $$

But then the formula (4.2.6) on page 112 of generatingfunctionology says that the number we want is

$$ e_0=\sum_{j\ge 0}(-1)^j N_j = \sum_{j\ge 0}(-1)^j \binom{n}{j}\binom{\binom{n-j}{k}}{l} =\sum_{j=0}^{n}(-1)^j \binom{n}{j}\binom{\binom{n-j}{k}}{l}. $$
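To make the sieve concrete, here is a small Python sketch that computes each $N(\supseteq T)$ by actually listing collections, sums them with signs, and compares the result with a direct count of covering collections. The function names are mine and the enumeration is only meant for tiny $n$.

```python
from itertools import combinations
from math import comb


def n_at_least(n: int, k: int, l: int, missing: set) -> int:
    """N(>= T): l-collections of k-subsets avoiding every element in `missing`."""
    allowed = [x for x in range(n) if x not in missing]
    k_subsets = list(combinations(allowed, k))
    return sum(1 for _ in combinations(k_subsets, l))


def e0_by_sieve(n: int, k: int, l: int) -> int:
    """Signed sum of N(>= T) over all subsets T of the n properties."""
    total = 0
    for j in range(n + 1):
        for missing in combinations(range(n), j):
            total += (-1) ** j * n_at_least(n, k, l, set(missing))
    return total


def e0_direct(n: int, k: int, l: int) -> int:
    """Directly count l-collections of k-subsets whose union is the whole set."""
    ground = set(range(n))
    k_subsets = [set(c) for c in combinations(range(n), k)]
    return sum(
        1 for fam in combinations(k_subsets, l) if set().union(*fam) == ground
    )


if __name__ == "__main__":
    # N(>= T) depends only on |T|, as claimed in the argument above.
    assert n_at_least(4, 2, 2, {0}) == comb(comb(3, 2), 2)
    for n in range(1, 5):
        for k in range(1, n + 1):
            for l in range(1, 4):
                assert e0_by_sieve(n, k, l) == e0_direct(n, k, l)
    print("sieve agrees with direct count for n <= 4")
```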
