What is the probability that 3 or more people share a common birthday, in a group of 160 people?

Approach:

We have:

$P(X\geq 3)= 1-[P(X=0)+P(X=2)]$.

(Here $X\geq i$ means that at least $i$ people share a common birthday, and $X=i$ means that exactly $i$ people share a common birthday, allowing for several groups of $i$ people that each share a birthday. For example, the birthdates A A B B C C D E F G are counted in $X=2$, whereas A A B B B C C D E F are not.)

The cases $X=0$ and $X=2$ can be dealt with together using the following argument:

Consider $x$ groups of $2$ people each, i.e. $2x$ people put into $x$ groups of $2$. There are ${160 \choose 2x}(2x)!/(2!)^x$ ways to create such groups. Next, we select $x$ dates out of $365$ and assign them to these groups, which can be done in ${365\choose x}x!$ ways. Finally, we assign distinct dates to the remaining $160-2x$ people from the $365-x$ dates that are left, which can be done in ${365-x \choose 160-2x}(160-2x)!$ ways.

The sample space has size $365^{160}$, and $x$ can run from $0$ to $80$ ($x=0$ corresponds to the case where everyone has a different birthday), so it appears to me that:

$$P(X=0)+P(X=2)=\sum_{x=0}^{80} \dfrac{{160 \choose 2x}\dfrac{(2x)!}{(2!)^x}\cdot{365\choose x}x!\cdot{365-x \choose 160-2x}(160-2x)!}{365^{160}}$$

However, something seems to have gone seriously wrong with my reasoning, as Wolfram estimates this sum to be about $10^{61}$.
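The sum can also be evaluated exactly to confirm that it is not a probability. The following is a minimal Python sketch, not part of the original post; the function names `original_term` and `original_sum` are hypothetical, and the code reproduces the formula exactly as written above, without correcting it.

```python
from fractions import Fraction
from math import comb, factorial


def original_term(n, d, x):
    """Count assigned by the formula in the question to the case of x pairs.

    n = number of people, d = number of days. Assumption: terms that are
    impossible (more pairs than days, or too few days left) contribute 0.
    """
    singles = n - 2 * x
    if singles < 0 or x > d or singles > d - x:
        return 0
    # "x groups of 2": C(n, 2x) * (2x)! / (2!)^x, as in the question
    groups = comb(n, 2 * x) * factorial(2 * x) // 2 ** x
    # choose x dates and assign them to the groups: C(d, x) * x!
    group_dates = comb(d, x) * factorial(x)
    # distinct dates for the remaining people: C(d - x, n - 2x) * (n - 2x)!
    rest = comb(d - x, singles) * factorial(singles)
    return groups * group_dates * rest


def original_sum(n=160, d=365):
    """Exact value of the sum in the question, as a Fraction."""
    total = sum(original_term(n, d, x) for x in range(n // 2 + 1))
    return Fraction(total, d ** n)


if __name__ == "__main__":
    s = original_sum()
    print(s > 1)      # True: the sum cannot be a probability
    print(float(s))   # order of magnitude of the sum
```

Exact integer arithmetic is used so that the size of the result cannot be blamed on floating-point error.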

This expression does give the correct values for the trivial cases $x=0$ and $x=1$, and there doesn't seem to be any double counting in this approach.

What could be going wrong, then?

Edit: I have seen multiple variations of this question, and most of them use the Poisson approximation to arrive at a numerical estimate. However, more than finding the correct numerical answer, I wish to know the flaw in this particular approach of mine.

RobPratt
satan 29
  • The "definition" of $X$ in the OP makes no sense. What does "no. of people sharing a common birthday" mean? Suppose we have only $6$ people, named $1,2,3,4,5,6$, with birthdays on the days $A,A,A,B,B,C$ respectively: what is $X$ in this special case? Why not compute the complementary probability, which should be fairly easy?! – dan_fulea Jul 23 '21 at 13:25
  • I edited the post regarding the treatment of $X$, but regarding the second part: that's precisely what I am doing, no? Computing the complementary probability, which I describe as $P(X=0)+P(X=2)$... – satan 29 Jul 23 '21 at 13:34
  • here is the same (underlying) question for a different number of people. – lulu Jul 23 '21 at 13:37
  • I just wish to know what's wrong with my particular approach... – satan 29 Jul 23 '21 at 13:39
  • I will try to understand what is wrong, but please use well defined objects. A random variable has a value for a given "element" in the space of possibilities, so which is the value of $X$ for the case when the 10 people have the birthdays at the days $A,A,A,A;B,B,B,C,C,D$? Is here $X=3$? Or $X=4$? Or both values are allowed? How can it be that $X=0$? Using random variables is misleading, why not use two (or more) events, in the event $A_0$ there are no two people having the same birthday (bd), in $A_2$ there is exactly one pair of people with the same bd, in $A_4$ exactly two pairs... – dan_fulea Jul 23 '21 at 13:51
  • The examples that you are giving cannot be classified into some $X$ values; however, the point is simply that I used that notation just for the two particular cases of 0 and 2. I genuinely do understand your perfectly valid concern and I'll be sure to be a bit more careful next time; however, I do believe the only relevant cases for this particular problem are covered satisfactorily by my explanation. My usage of $X$ was just a way to formulate the approach without using many words. I do agree with you that, in general, the way I defined $X$ makes it not well defined. – satan 29 Jul 23 '21 at 14:00
  • @dan_fulea Define $X$ to be the largest number of people who share a birthday, and then it is a proper random variable. In your example, $X=4$. Albeit, for this to make sense, the case where all birthdays are different would correspond to $X=1$, but this is just a cosmetic difference. – Mike Earnest Jul 23 '21 at 14:01

1 Answer

There are ${160 \choose 2x}(2x)!/(2!)^x$ ways to choose an ordered sequence of $x$ groups of $2$. But you don't care about the order: for you, the events

  • "A and B have the same birthday, C and D have another same birthday, and the rest have unique birthdays"
  • "C and D have the same birthday, A and B have another same birthday, and the rest have unique birthdays"

are the same. So you should instead have $\binom{160}{2x} \cdot \frac{(2x)!}{x! (2!)^x}$ here, dividing by $x!$.
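As a sanity check, the corrected count can be compared against brute-force enumeration on a small instance and then evaluated for $160$ people. The following is a minimal Python sketch under stated assumptions: the function names `corrected_count`, `brute_force_count`, and `prob_three_or_more` are hypothetical, and $365$ equally likely birthdays are assumed, as in the question.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb, factorial


def corrected_count(n, d):
    """Number of birthday assignments (n people, d days) in which
    no day is shared by 3 or more people, using the x! correction."""
    total = 0
    for x in range(n // 2 + 1):          # x = number of days shared by exactly 2
        singles = n - 2 * x
        if x > d or singles > d - x:     # not enough days for this pattern
            continue
        # unordered set of x disjoint pairs: C(n, 2x) * (2x)! / (x! * 2^x)
        pairings = comb(n, 2 * x) * factorial(2 * x) // (factorial(x) * 2 ** x)
        # assign distinct days to the pairs: C(d, x) * x!
        pair_days = comb(d, x) * factorial(x)
        # assign distinct remaining days to everyone else
        rest = comb(d - x, singles) * factorial(singles)
        total += pairings * pair_days * rest
    return total


def brute_force_count(n, d):
    """Same count by enumerating all d**n assignments (small n, d only)."""
    return sum(
        1
        for days in product(range(d), repeat=n)
        if max(Counter(days).values()) <= 2
    )


def prob_three_or_more(n=160, d=365):
    """P(at least 3 people share a birthday), as an exact Fraction."""
    return 1 - Fraction(corrected_count(n, d), d ** n)


if __name__ == "__main__":
    # Small instance: 5 people, 4 days
    print(corrected_count(5, 4) == brute_force_count(5, 4))  # True
    print(float(prob_three_or_more()))
```

The brute-force comparison is only feasible for very small $n$ and $d$, but it exercises the $x \geq 2$ terms, which is where the missing $x!$ first matters.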

Misha Lavrov
  • Well, we do not have a well defined question, but we have an answer... (They have downvoted a lot of my answers for this reason...) – dan_fulea Jul 23 '21 at 13:52
  • The question is rather well defined actually, if you can read my last comment. – satan 29 Jul 23 '21 at 14:23
  • @satan29 it is not correct that you do not care about the order. You do care about the order because your denominator orders them! The mistake is that you ordered them twice. First by ordering them when you chose $x$ pairs of people with each pair sharing a different birthday and you ordered them again when you chose $x$ birthdays out of $365$ as you multiplied by $x!$. – Math Lover Jul 23 '21 at 14:32
  • @MathLover Was that meant to be directed at me? Anyway, what I mean is that we don't care about the order when counting the number of events for each $x$. When computing the probability of each event, you are right that there is an order. – Misha Lavrov Jul 23 '21 at 14:54
  • @MishaLavrov No, it was not directed at you; otherwise I would have addressed it to you :) I figured out what you meant. I was just trying to clarify to the OP that they ended up ordering twice, and that they should not read your answer as saying they did not need to order at all. – Math Lover Jul 23 '21 at 14:59