
I have written a Ruby program that enumerates the partitions of a multiset into a given number (N) of disjoint sets, also called bins. The bins are indistinguishable: they can be sorted in any order and have no distinctive markings. Furthermore, every bin must hold at least one element.

My program uses simple backtracking to count all the possible cases, and a trie to store the partitionings already found (to avoid cycles in the search graph and infinite loops).

Now I would like to verify that my program produces correct results. For example, suppose I have 11 elements of three types, labeled 1, 2 and 3, with multiplicities 7, 2 and 2: (1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3). These should be sorted into 4 disjoint sets (bins). In this case my program produced this list: https://pastebin.com/UJFD8WPY

That is 358 possible partitionings of the multiset formed by those 11 elements, some of which are repeated.

For the 15 elements 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 2, 2, my program found 1931 possible partitionings.

For the multiset "1, 1, 2, 3, 3, 3, 3, 3, 3, 4, 5, 6" into 4 bins I got 6946 possible partitionings.

How could I verify these results?

I encountered this problem when I wanted to sort images into rows in an optimal way: every image should show the same area. This is not possible in general, of course, because image characteristics such as aspect ratio differ, but minimizing the variance of the areas is possible.

I think this problem is equivalent to counting integer factorizations (called multiplicative partitions or "factorisatio numerorum", https://en.wikipedia.org/wiki/Multiplicative_partition ): the number of ways to write an integer, whose prime factors may be repeated, as a product of N factors. For example, an integer of the form ppqqr can be decomposed into products of two factors: p*(pqqr), pp*(qqr), etc.
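
To make the correspondence concrete, here is a minimal Python sketch (not the Ruby program described above) that counts unordered factorizations of an integer into exactly N factors, each greater than 1. The function name `factorizations` and its parameters are illustrative assumptions. Factors are generated in non-decreasing order so that each unordered product is counted once.

```python
def factorizations(n, parts, smallest=2):
    """Count unordered factorizations of n into exactly `parts` factors,
    each at least `smallest` (and therefore greater than 1)."""
    if parts == 1:
        # The last factor is whatever remains; it must respect the ordering.
        return 1 if n >= smallest else 0
    count = 0
    d = smallest
    # The smallest of `parts` non-decreasing factors satisfies d**parts <= n.
    while d ** parts <= n:
        if n % d == 0:
            count += factorizations(n // d, parts - 1, d)
        d += 1
    return count


# 180 = 2*2*3*3*5 has the form ppqqr.
print(factorizations(180, 2))  # 8
```

Under this correspondence, the count for 180 with two factors is the number of ways to split the multiset (p, p, q, q, r) into two nonempty bins.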

  • Are the bins distinguishable? That is, is putting all the elements in the first bin and leaving the others empty, the same as putting all the elements in the second bin, or different? – saulspatz Jul 29 '18 at 21:29
  • The bins are indistinguishable. No mark for the bins. They are exchangeable. They can be sorted in any order. – Konstantin Jul 29 '18 at 21:33
  • And every bin must hold at least 1 element. I encountered this problem when I wanted to sort images into rows, but in an optimal way: every image should show the same area. It is not possible of course, but minimizing the variance of the areas is possible. – Konstantin Jul 29 '18 at 21:59
  • You should add any clarifications to the body of the question, not the comments. Also, you should describe the problem you are really trying to solve. Apparently there is much more to it than counting multiset partitions. – saulspatz Jul 29 '18 at 22:03
  • Observe that the pastebin data contains a duplicate entry, namely "1.1.|1.1.1.|1.1.2.|2.3.3." – Marko Riedel Jul 30 '18 at 21:59

2 Answers


The number of partitions of a set of size $n$ is called the $n$-th Bell number and is denoted $B_n$. See e.g. https://en.m.wikipedia.org/wiki/Bell_number for more.

The notion can be generalised to multisets. See https://www.sciencedirect.com/science/article/pii/0012365X74900764

Also see Finding the amount partitions of a multiset
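
One sanity check follows directly from these definitions: when all elements are distinct, the number of partitions into exactly $N$ nonempty unlabeled bins is the Stirling number of the second kind $S(n, N)$, and summing over $N$ gives $B_n$. A minimal Python sketch, with the illustrative names `stirling2` and `bell`, uses the standard recurrence $S(n,k) = k\,S(n-1,k) + S(n-1,k-1)$:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind: partitions of an n-element set
    into exactly k nonempty unlabeled blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # Element n either joins one of k existing blocks or starts a new one.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def bell(n):
    """Bell number B_n as the sum of S(n, k) over all k."""
    return sum(stirling2(n, k) for k in range(n + 1))


print(stirling2(4, 2))  # 7
print(bell(5))          # 52
```

A multiset-partition program fed $n$ distinct elements and $N$ bins should reproduce `stirling2(n, N)`.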

Hans Hüttel

Suppose we start with the source multiset

$$\def\textsc#1{\dosc#1\csod} \def\dosc#1#2\csod{{\rm #1{\small #2}}} \prod_{k=1}^l A_k^{\tau_k}$$

where we have $l$ different values and their multiplicities are the $\tau_k.$

If we have a CAS like Maple, $N$ is reasonably small, and we seek fairly instant confirmation of these values, then we may just use the cycle index $Z(S_N)$ of the symmetric group, which implements the unlabeled operator $\textsc{MSET}_{=N}.$ This yields the formula

$$\left[\prod_{k=1}^l A_k^{\tau_k}\right] Z\left(S_N; -1 + \prod_{k=1}^l \frac{1}{1-A_k}\right).$$
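
As a small hand check of this formula (an illustration, not part of the Maple session), take $N=2$ and the source multiset $A_1^2 A_2^2 A_3$, which is the $ppqqr$ pattern from the question. Write $F(A_1,A_2,A_3) = -1 + \prod_{k=1}^3 \frac{1}{1-A_k}$ and use $Z(S_2) = \frac{1}{2}(a_1^2 + a_2)$ to get

$$\left[A_1^2 A_2^2 A_3\right] \frac{1}{2}\left(F(A_1,A_2,A_3)^2 + F(A_1^2,A_2^2,A_3^2)\right) = \frac{1}{2}\left(16 + 0\right) = 8.$$

The 16 counts ordered pairs of nonempty sub-multisets that together make up the source: there are $3\cdot 3\cdot 2 = 18$ choices for the first bin, minus the empty one and the full one. The second term vanishes because $A_3$ has an odd exponent. The result agrees with the 8 factorizations of $180 = 2^2\cdot 3^2\cdot 5$ into two factors greater than one.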

Maple can extract these coefficients by asking for the coefficient of the corresponding Taylor series. This can be optimized. We get the following transcript:

> MSETS([7,2,2], 4);                                                  
                                 357

> MSETS([10,3,2], 4);                                                 
                                 1930

> MSETS([2,1,6,1,1,1], 4);                                            
                                 6945

> MSETS([7,2,2], 7);                                                  
                                 129

> MSETS([10,3,2], 7);                                                 
                                 2351

> MSETS([2,1,6,1,1,1], 7);                                            
                                 3974

> map(el->el[1], select(el->el[2]>0,\                                 
>  [seq([n, FACTORS(n,4)], n=1..256)]));

[16, 24, 32, 36, 40, 48, 54, 56, 60, 64, 72, 80, 81, 84, 88, 90, 96,

    100, 104, 108, 112, 120, 126, 128, 132, 135, 136, 140, 144, 150,

    152, 156, 160, 162, 168, 176, 180, 184, 189, 192, 196, 198, 200,

    204, 208, 210, 216, 220, 224, 225, 228, 232, 234, 240, 243, 248,

    250, 252, 256]

The sequence is OEIS A033987 and looks to have the right values. The Maple code here is quite simple.

with(combinat);
with(numtheory);

pet_cycleind_symm :=
proc(n)
option remember;

    if n=0 then return 1; fi;

    # Recurrence Z(S_n) = 1/n * sum_{l=1..n} a[l] * Z(S_{n-l}), Z(S_0) = 1.
    expand(1/n*
           add(a[l]*pet_cycleind_symm(n-l), l=1..n));
end;

pet_varinto_cind :=
proc(poly, ind)
local subs1, subs2, polyvars, indvars, v, pot, res;

    res := ind;

    polyvars := indets(poly);
    indvars := indets(ind);

    # Replace each cycle-index variable a[pot] by poly with
    # every variable of poly raised to the power pot.
    for v in indvars do
        pot := op(1, v);

        subs1 :=
        [seq(polyvars[k]=polyvars[k]^pot,
             k=1..nops(polyvars))];

        subs2 := [v=subs(subs1, poly)];

        res := subs(subs2, res);
    od;

    res;
end;


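MSETS :=
proc(src, N)
local msetgf, cind, gf, cf;

    # Generating function of nonempty multisets over the variables A[q].
    msetgf := mul(1/(1-A[q]), q=1..nops(src))-1;
    # Cycle index Z(S_N) of the symmetric group.
    cind := pet_cycleind_symm(N);

    # Substitute the generating function into the cycle index.
    gf := pet_varinto_cind(msetgf, cind);

    # Extract the coefficient of prod A[cf]^src[cf], one variable at a time.
    for cf to nops(src) do
        gf := coeftayl(gf, A[cf] = 0, src[cf]);
    od;

    gf;
end;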

FACTORS :=
proc(n, N)
local mults;

    mults := map(el -> el[2], op(2, ifactors(n)));
    MSETS(mults, N);
end;
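
For readers without Maple, the counts can also be cross-checked by direct enumeration. The following Python sketch is a different technique from the cycle-index computation above: plain brute force with memoization, practical only for small inputs. The function name `count_partitions` is an illustrative assumption. Each bin is represented as a vector of per-value counts, and bins are generated in non-increasing lexicographic order so that each unordered partition is counted exactly once.

```python
from functools import lru_cache
from itertools import product


def count_partitions(mults, bins):
    """Count splits of a multiset into exactly `bins` nonempty, unlabeled bins.

    `mults` lists the multiplicity of each distinct value, e.g. [7, 2, 2].
    """

    @lru_cache(maxsize=None)
    def rec(remaining, bins_left, upper):
        if bins_left == 1:
            # The last bin takes everything left; it must be nonempty and
            # must not exceed the previous bin in the ordering.
            return 1 if any(remaining) and remaining <= upper else 0
        total = 0
        for vec in product(*(range(r + 1) for r in remaining)):
            # Skip the empty bin and anything that breaks the ordering.
            if not any(vec) or vec > upper:
                continue
            rest = tuple(r - v for r, v in zip(remaining, vec))
            total += rec(rest, bins_left - 1, vec)
        return total

    if bins < 1:
        return 0
    start = tuple(mults)
    return rec(start, bins, start)


print(count_partitions([2, 2, 1], 2))     # 8
print(count_partitions([1, 1, 1, 1], 2))  # 7
```

Calling `count_partitions([7, 2, 2], 4)` gives a value to compare against the first line of the transcript, and similarly for the other inputs.
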
Marko Riedel