2

I am reading a book and they state that "in a group of 23 people, the probability is 50.7% that two people share the same birthday"

How can this be?

Could somebody show me how to calculate the 50.7% figure?

Many thanks in advance

  • https://en.wikipedia.org/wiki/Birthday_problem – vadim123 Nov 23 '15 at 18:20
  • Incidentally, it can be said, somewhat less ambiguously, that in a group of 23 people, it is slightly less likely than 50-50 that no two people share the same birthday. – Brian Tung Nov 23 '15 at 18:22
  • A common misconception is that you think about the probability of someone else sharing "my birthday." That probability is quite low. Instead it is the probability that two "random" people share the same birthday. – Michael Burr Nov 23 '15 at 18:23
  • @MichaelBurr: I know what you mean, and I think anyone familiar with the birthday paradox will know what you mean. But it is likely others will interpret "the probability that two 'random' people share the same birthday" as "the probability that two specific people, chosen at random, share the same birthday," which is $1/365$ (more or less). What we want to say, I think, is something along the lines of "the probability that, in this group of $23$ (or whatever number) people, there are (at least) two people that share the same birthday." It's not as pithy, but it's also less ambiguous. – Brian Tung Nov 23 '15 at 21:07

2 Answers

1

HINT & Possible Explanation.

Comparing the birthday of the first person on the list to the others allows $22$ chances for a matching birthday, the second person on the list to the others allows $21$ chances for a matching birthday (in fact the second person also has total $22$ chances of matching birthday with the others but his chance of matching birthday with the first person, one chance, has already been counted with the first person's $22$ chances and shall not be duplicated), third person has $20$ chances, and so on.
Hence total chances are: $$22+21+20+ \cdots +1 = 253$$

so comparing every person to all of the others allows $253$ distinct chances (combinations): in a group of $23$ people there are $$\binom{23}{2} = \frac{23 \cdot 22}{2} = 253$$

distinct possible combinations of pairing.

Presuming all birthdays are equally probable, the probability of a given birthday for a person chosen from the entire population at random is $\frac{1}{365}$ (ignoring February $29$).
Although the number of pairings in a group of $23$ people is not statistically equivalent to $253$ pairs chosen independently, the birthday problem becomes less surprising if a group is thought of in terms of the number of possible pairs, rather than as the number of individuals.
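
The pair-counting heuristic can be checked numerically. The sketch below counts the pairs and then evaluates a rough estimate that treats the $253$ pairs as if they were independent, which, as noted above, they are not. The variable names are illustrative only, and the estimate is a sanity check rather than the exact answer.

```python
import math

people = 23
days = 365  # assumption: uniform birthdays, February 29 ignored

# Number of distinct pairs in the group: C(23, 2)
pairs = math.comb(people, 2)

# Rough estimate only: pretends the pairs are independent (they are not)
approx_match = 1 - (1 - 1 / days) ** pairs

print(pairs)  # 253
print(round(approx_match, 4))
```

The estimate lands close to one half, which is why thinking in pairs rather than individuals makes the result feel less surprising.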

CALCULATION OF PROBABILITY

The problem is to compute the approximate probability that in a group of $n$ people, at least two have the same birthday. For simplicity, disregard variations in the distribution, such as leap years, twins, seasonal or weekday variations, and assume that the $365$ possible birthdays are equally likely. Real-life birthday distributions are not uniform, so this is a simplifying assumption.
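
Before doing the exact calculation, the problem as stated can be explored by simulation. The following is a minimal Monte Carlo sketch under the same assumptions (365 equally likely, independent birthdays); the function names, trial count, and seed are arbitrary choices made for illustration.

```python
import random

def has_shared_birthday(n, rng, days=365):
    """Draw n uniform birthdays and report whether any two coincide."""
    birthdays = [rng.randrange(days) for _ in range(n)]
    return len(set(birthdays)) < n

def estimate_match_probability(n, trials=20_000, seed=0):
    """Monte Carlo estimate of P(at least two of n people share a birthday)."""
    rng = random.Random(seed)
    hits = sum(has_shared_birthday(n, rng) for _ in range(trials))
    return hits / trials

# The estimate fluctuates with the seed and trial count,
# but should sit near the 50.7% quoted in the question.
print(estimate_match_probability(23))
```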

The goal is to compute $P(A)$, the probability that at least two people in the room have the same birthday. However, it is simpler to calculate $P(A')$, the probability that no two people in the room have the same birthday. Then, because $A$ and $A'$ are the only two possibilities and are also mutually exclusive, $$P(A) = 1 - P(A')$$

In deference to widely published solutions concluding that $23$ is the minimum number of people necessary to have a $P(A)$ that is greater than $50$%, the following calculation of $P(A)$ will use $23$ people as an example.

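When events are independent of each other, the probability of all of the events occurring is equal to the product of the probabilities of each event. Therefore, if $P(A')$ can be described as $23$ independent events, it can be calculated as $P(1) \times P(2) \times P(3) \times \cdots \times P(23)$. (Strictly, each factor is the probability that a person avoids every earlier birthday given that the earlier birthdays are all different, so the product is an application of the chain rule for conditional probabilities.)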

The $23$ independent events correspond to the $23$ people, and can be defined in order. Each event can be defined as the corresponding person not sharing his birthday with any of the previously analyzed people. For Event $1$, there are no previously analyzed people. Therefore, the probability, $P(1)$, that Person $1$ does not share his birthday with previously analyzed people is $1$, or $100$%. Ignoring leap years for this analysis, the probability of $1$ can also be written as $$\frac{365}{365}$$ for reasons that will become clear below.

For Event $2$, the only previously analyzed person is Person $1$. Assuming that birthdays are equally likely to happen on each of the $365$ days of the year, the probability, $P(2)$, that Person $2$ has a different birthday than Person $1$ is $$\frac{364}{365}$$

This is because, if Person $2$ was born on any of the other $364$ days of the year, Persons $1$ and $2$ will not share the same birthday.

Similarly, if Person $3$ is born on any of the $363$ days of the year other than the birthdays of Persons $1$ and $2$, Person $3$ will not share their birthday. This makes the probability $$P(3) = \frac{363}{365}$$

This analysis continues until Person $23$ is reached, whose probability of not sharing a birthday with any of the people analyzed before is $$P(23) = \frac{343}{365}$$

$P(A')$ is equal to the product of these individual probabilities:

$$P(A') = \frac{365}{365} \times\frac{364}{365}\times\frac{363}{365}\times\frac{362}{365}\times\cdots\times\frac{343}{365}$$

The terms of the equation above can be collected to arrive at:

$$P(A')=\left(\frac{1}{365}\right)^{23}\times(365\times364\times363\times\cdots\times343)$$

Evaluating this expression gives $$P(A') \approx 0.492703$$

Therefore $$P(A) \approx 1 - 0.492703 = 0.507297$$

That is, $50.7297$%.
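
The arithmetic above is easy to reproduce. This short sketch multiplies the $23$ factors exactly with Python's `fractions.Fraction` and converts to a decimal only at the end; the variable names are chosen here for illustration.

```python
from fractions import Fraction

days = 365
people = 23

# Multiply 365/365 * 364/365 * ... * 343/365 exactly
p_no_match = Fraction(1)
for k in range(people):
    p_no_match *= Fraction(days - k, days)

p_match = 1 - p_no_match

print(f"{float(p_no_match):.6f}")  # 0.492703
print(f"{float(p_match):.6f}")     # 0.507297
```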

This process can be generalized to a group of $n$ people, where $p(n)$ is the probability of at least two of the $n$ people sharing a birthday. It is easier to first calculate the probability $\bar p(n)$ that all $n$ birthdays are different. According to the pigeonhole principle, $\bar p(n)$ is zero when $n > 365$. When $n \le 365$ we have:

$$ \begin{align} \bar p(n) &= 1 \times \left(1-\frac{1}{365}\right) \times \left(1-\frac{2}{365}\right) \times \cdots \times \left(1-\frac{n-1}{365}\right) \\ &= { 365 \times 364 \times \cdots \times (365-n+1) \over 365^n } \\ &= { 365! \over 365^n (365-n)!} = \frac{n!\cdot{365 \choose n}}{365^n} = \frac{_{365}P_n}{365^n}\end{align} $$

The equation expresses the fact that the first person has no one to share a birthday with, the second person cannot have the same birthday as the first, the third cannot have the same birthday as either of the first two, and in general the $n$-th birthday cannot be the same as any of the $n - 1$ preceding birthdays.

The event of at least two of the $n$ persons having the same birthday is complementary to all $n$ birthdays being different. Therefore, its probability $p(n)$ is

$$p(n) = 1 - \bar p(n)$$

This probability surpasses the value $\frac{1}{2}$ for $n = 23$ (with value about $50.7$% as we saw before).
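
The general formula can be turned into a small function and used to confirm where the probability first exceeds one half. This is a sketch under the uniform-birthday assumption; `p_all_different` and `p_shared` are names introduced here, not taken from any library.

```python
def p_all_different(n, days=365):
    """Probability that n uniform, independent birthdays are all distinct."""
    if n > days:
        return 0.0  # pigeonhole principle
    prob = 1.0
    for k in range(n):
        prob *= (days - k) / days
    return prob

def p_shared(n, days=365):
    """Probability that at least two of n people share a birthday."""
    return 1.0 - p_all_different(n, days)

# Smallest group size whose match probability exceeds 1/2
n = 1
while p_shared(n) <= 0.5:
    n += 1

print(n)  # 23
print(round(p_shared(22), 4), round(p_shared(23), 4))
```

With $22$ people the probability is still below one half, so $23$ is the threshold.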

Enrico M.
  • I wrote here the source from Wikipedia, ONLY BECAUSE I AM ALSO the author of that page. – Enrico M. Nov 23 '15 at 18:33
  • Very clear. Thanks for taking the time to answer – Luj Reyes Nov 23 '15 at 18:35
  • @LujReyes No problem! As I said, I took that from my Wikipedia page :( You can easily find more on the page "Birthday Problem"! – Enrico M. Nov 23 '15 at 18:36
  • Yes! I did not know that it was a "known" problem before asking the question. I should have used Google too :). Btw key paragraph to un-blow the mind of the non mathematically incline among us, "Although the number of pairings in a group of people is not statistically equivalent to pairs chosen independently, the birthday problem becomes less surprising if a group is thought of in terms of the number of possible pairs, rather than as the number of individuals" – Luj Reyes Nov 23 '15 at 18:45
0

We let $S$ be a set of $N$ people and let $B$ be the set of dates in a year.

Let us then create the birthday function $b: S \to B$, which assigns each person their birthday. Everyone in $S$ has a unique birthday precisely when $b$ is injective.

With this set up, we wish to count how many functions and how many injective functions exist between $S$ and $B$. Since $|S| = N$ and $|B| = 365$, there are $365^{N}$ possible functions and, for $N \le 365$, $\frac{365!}{(365-N)!}$ injective functions.

Now, let us go into some actual probability. We define the statement $A$ as: "Everyone in the set $S$ has a unique birthday". Assuming every function $b$ is equally likely, what is then the probability that everyone in our set $S$ has a unique birthday? $$ P(A) = \frac{365!}{365^{N}(365-N)!}$$

Does this remind you of anything? Right, it is the number of injective functions divided by the number of all possible functions. As you remember, everyone in $S$ has a unique birthday exactly when our function is injective.

However, we are interested in the complementary statement $A'$: at least two people in $S$ share a birthday, i.e. $b$ is not injective.

Thus, $$ P(A') = 1-P(A)$$ and you can simply substitute our previous expression for $P(A)$ and evaluate it at $N = 23$; you'll see that $P(A')$ is approximately $50.7$ percent.
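
The counting argument translates directly into code. The sketch below uses `math.perm` (available in Python 3.8+) for the number of injective functions and divides by the number of all functions; `p_unique` is a name chosen here for illustration.

```python
import math

def p_unique(N, days=365):
    """Injective functions S -> B divided by all functions S -> B."""
    injective = math.perm(days, N)  # 365! / (365 - N)!, zero when N > days
    total = days ** N
    return injective / total

N = 23
print(round(1 - p_unique(N), 4))  # 0.5073
```

Working with exact integers until the final division avoids overflowing a float with $365!$.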

  • This is basically the approach you will see from any book in probability.