
I have no maths background, but I am looking for an answer for the following problem for work-related purposes.

  1. We have a certain set of actors $A$ and their number is always $n(A)>0$.
  2. We have a certain set of enemies $E$ and their number is always $n(E)>0$.
  3. Each actor and each enemy may have a DO_NOT_PAIR flag. The probability of having the flag is not specified, but in each case we know how many actors and how many enemies have it.
  4. Each actor $A$ randomly chooses its target from the set of enemies $E$.
  5. Each enemy $E$ randomly chooses its target from the set of actors $A$.
  6. For each actor $A$ and for each enemy $E$ we check whether its target has a flag DO_NOT_PAIR. If anyone paired up with a target with a flag, we jump back to step 4 and repeat the algorithm. If no one is paired up with a target with a flag, we finish the algorithm.
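Steps 4 to 6 can be sketched in Python as follows. This is only an illustration under assumptions the question does not state: each choice is taken to be uniform and independent, actors and enemies are numbered from 0, and the names `run_round`, `pair_until_valid` and `max_jumps` are made up for this example.

```python
import random

def run_round(n_actors, n_enemies, flagged_actors, flagged_enemies, rng):
    """One pass of steps 4-6. Returns (pairing, ok)."""
    # Step 4: every actor picks an enemy (assumed uniform at random).
    actor_targets = tuple(rng.randrange(n_enemies) for _ in range(n_actors))
    # Step 5: every enemy picks an actor (assumed uniform at random).
    enemy_targets = tuple(rng.randrange(n_actors) for _ in range(n_enemies))
    # Step 6: the round succeeds only if no chosen target carries the flag.
    ok = (all(t not in flagged_enemies for t in actor_targets)
          and all(t not in flagged_actors for t in enemy_targets))
    return (actor_targets, enemy_targets), ok

def pair_until_valid(n_actors, n_enemies, flagged_actors, flagged_enemies,
                     max_jumps, seed=None):
    """Repeat rounds until one succeeds or max_jumps jumps have been used.

    Returns (pairing, jumps_used); pairing is None if every round failed.
    """
    rng = random.Random(seed)
    for jumps in range(max_jumps + 1):  # the first round plus max_jumps repeats
        pairing, ok = run_round(n_actors, n_enemies,
                                flagged_actors, flagged_enemies, rng)
        if ok:
            return pairing, jumps
    return None, max_jumps
```

For example, `pair_until_valid(2, 3, set(), {0, 1, 2}, max_jumps=5)` returns `(None, 5)`, because every enemy is flagged and no round can succeed.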

The question is:
In certain situations it is mathematically impossible to pair everyone correctly (for example, when all actors or all enemies have a flag). How many jumps from step 6 must we perform to ensure that we have checked at least $p\%$ of all possible pairings (or, if that cannot be determined, to have $c\%$ certainty that we have checked $p\%$ of all possible pairings)?

I hope I phrased the problem clearly. English is not my first language and I have at best sub-par knowledge on maths terminology. Thanks in advance.

  • Do you really mean "pairs", or do you mean "pairings"? I ask because I don't see why you'd want to know how many pairs you've checked, whereas it might be useful to know which fraction of the pairings you've checked. – joriki Apr 01 '20 at 14:25
  • Hi. I meant pairings. Edited the first post. – user2551153 Apr 01 '20 at 18:06

1 Answer


It’s not clear to me why you perform this seemingly wasteful algorithm at all, instead of directly choosing targets only among the potential targets that don’t have the DO_NOT_PAIR flag, but in case you need to:

There are $n(A)$ independent choices with $n(E)$ options each and $n(E)$ independent choices with $n(A)$ options each, for a total of $n(A)^{n(E)}n(E)^{n(A)}$ different pairings. (This will typically be a very large number.)
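As a quick check of this count, here is a small Python sketch that compares the formula with a brute-force enumeration for small cases. The function names are made up for this example, and actors and enemies are simply numbered from 0.

```python
from itertools import product

def count_pairings(n_actors, n_enemies):
    """Formula from the text: n(E)^n(A) actor choices times n(A)^n(E) enemy choices."""
    return n_enemies ** n_actors * n_actors ** n_enemies

def enumerate_pairings(n_actors, n_enemies):
    """List every pairing explicitly (only feasible for small inputs)."""
    # One target enemy per actor, and one target actor per enemy.
    actor_choices = product(range(n_enemies), repeat=n_actors)
    enemy_choices = product(range(n_actors), repeat=n_enemies)
    return list(product(actor_choices, enemy_choices))

# The enumeration agrees with the formula for a small case.
assert len(enumerate_pairings(2, 3)) == count_pairings(2, 3) == 72
```

For $n(A)=4$ and $n(E)=8$, `count_pairings(4, 8)` gives `268435456`, which is $2^{28}$.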

You can’t guarantee that you’ve seen a certain proportion of different pairings after some number of trials because you might always get the same pairing.

The probability distribution of the number of different pairings seen after $n$ trials is that of the coupon collector’s problem. This is given at “Probability distribution in the coupon collector's problem”. It’s a complicated sum with lots of summands and lots of cancellation for the large numbers that you may be dealing with, so you may want to approximate it, but to provide a useful approximation we’d have to know which regime you’ll be using it in.
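For small numbers of pairings, the distribution can also be computed without the alternating sum. The sketch below does not use the formula from the linked question; it uses a step-by-step recursion on the number of distinct pairings seen so far, assuming every pairing is equally likely in each trial. If $k$ of $N$ pairings have been seen, the next trial repeats one of them with probability $k/N$ and shows a new one with probability $(N-k)/N$. The names `step`, `distinct_count_distribution`, `trials_needed` and `max_trials` are made up for this example, and `p` and `c` are fractions rather than percentages.

```python
import math

def step(dist, num_pairings):
    """Advance the distribution of 'distinct pairings seen' by one trial."""
    new = [0.0] * (num_pairings + 1)
    for k, prob in enumerate(dist):
        if prob:
            new[k] += prob * k / num_pairings  # the trial repeats a seen pairing
            if k < num_pairings:
                # the trial shows a pairing not seen before
                new[k + 1] += prob * (num_pairings - k) / num_pairings
    return new

def distinct_count_distribution(num_pairings, trials):
    """dist[k] = probability of exactly k distinct pairings after `trials` trials."""
    dist = [1.0] + [0.0] * num_pairings
    for _ in range(trials):
        dist = step(dist, num_pairings)
    return dist

def trials_needed(num_pairings, p, c, max_trials=10_000):
    """Smallest number of trials such that, with probability at least c,
    a fraction of at least p of all pairings has been seen (None if not
    reached within max_trials)."""
    target = math.ceil(p * num_pairings - 1e-9)  # small tolerance for float rounding
    dist = [1.0] + [0.0] * num_pairings
    for n in range(1, max_trials + 1):
        dist = step(dist, num_pairings)
        if sum(dist[target:]) >= c:
            return n
    return None
```

For instance, `trials_needed(72, 0.9, 0.75)` covers the case $n(A)=2$, $n(E)=3$ with $p=90\%$ and $c=75\%$. The table has one entry per possible count, so this approach is only practical when the number of pairings is small; for hundreds of millions of pairings an approximation is still needed.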

joriki
  • Thank you for taking the time to review my problem. It's not clear to me why it was set up this way; it's kind of a legacy thing. This is not an IT problem, it's an old manufacturing machinery problem. I'm trying to judge how reliable the process is and how many attempts are necessary to guarantee a certain threshold of success. I'm not sure what you mean by regime. The numbers are not that large; I should have included them in the first post. Most of the time, $n(A)$ is between 1 and 4, while $n(E)$ is between 1 and 8, and usually $n(A)<n(E)$. I don't know what other information is required. – user2551153 Apr 01 '20 at 19:20
  • @user2551153: By "regime" I mean the region in which the parameters lie. For instance, the Wikipedia section on approximations for binomial coefficients has subsections titled "Both $n$ and $k$ large" and "$n$ much larger than $k$"; these are two different "regimes" in which different approximations are applicable. Even for $n(A)=4$, $n(E)=8$, there will be $4^8\cdot8^4=2^{28}\approx2.7\cdot10^8$ different pairings. The further information required would be the percentages $c$ and $p$ in the question. – joriki Apr 01 '20 at 19:30
  • @user2551153: I'm wondering now whether you really do mean pairings or whether I led you on a wrong track with that comment (considering that you mentioned that English is not your first language). By "pairing", I mean the entire result of the process, the entire assignment of actors to enemy targets and enemies to actor targets. There are $n(A)^{n(E)}n(E)^{n(A)}$ different ones of these overall configurations. If you just want to know which fraction of the possible pairs of actor + enemy target and enemy + actor target you've tried out, that would involve much smaller numbers. – joriki Apr 01 '20 at 19:33
  • As for the assumptions: $n(A)\in\{1,\dots,4\}$, $n(E)\in\{1,\dots,8\}$, $n(A)<n(E)$, $c=75\%$, $p=90\%$. I reviewed the pair vs. pairings, and I still think I need pairings. We're aiming to judge the whole setup, so to speak. – user2551153 Apr 02 '20 at 12:05