5

I work at a company that posts a birthday calendar. I noticed that there was a string of four consecutive days with no birthdays. What is the probability of that happening?

Problem Statement

Given $n$ people, what is the probability of observing a birthday calendar with no gaps of length $g$ or greater?

In my case $n = 400$ and $g = 4$. I'm mostly interested in an analytical solution.

Partial Solution

We will count the number of birthday assignments that have gaps less than $g$.

To do this, we will count assignments which have exactly $d$ distinct birthdays ($d = 1, 2, 3, ..., 365$) and sum over $d$.

For a given $d$, we will require a counting of two things:

  1. Number of ways to partition $n$ birthdays among $d$ days.
  2. Number of ways to select $d$ days from the year with no gaps of $g$ or greater.

I found a solution to 1: $S(n,d) \times d!$, where $S(n,d)$ is a Stirling number of the second kind. See the solution here:

Consecutive birthdays probability
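As a sanity check on part 1, here is a minimal Python sketch that compares $S(n,d) \times d!$ with a direct inclusion-exclusion count of the assignments that use all $d$ days. The function names `stirling2` and `surjections` are my own, chosen for illustration.

```python
from math import comb, factorial


def stirling2(n, d):
    """Stirling number of the second kind via S(n,d) = d*S(n-1,d) + S(n-1,d-1)."""
    row = [1] + [0] * d  # row for n = 0: S(0,0) = 1
    for _ in range(n):
        new = [0] * (d + 1)
        for j in range(1, d + 1):
            new[j] = j * row[j] + row[j - 1]
        row = new
    return row[d]


def surjections(n, d):
    """Count assignments of n people to d days in which every day is used."""
    return sum((-1) ** j * comb(d, j) * (d - j) ** n for j in range(d + 1))


if __name__ == "__main__":
    # The two counts should agree for every small n and d.
    for n in range(1, 8):
        for d in range(1, n + 1):
            assert stirling2(n, d) * factorial(d) == surjections(n, d)
    print("S(n,d) * d! matches the inclusion-exclusion count")
```

Both functions use exact integer arithmetic, so they remain usable for $n = 400$.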

I need help on 2.

tinlyx
  • 1,534

2 Answers

3

For each day $d$, let $E_d$ be the event that there is a birthday on day $d$ but no birthdays on days $d+1,d+2,\dots,d+g$ (day numbers wrap around the end of the year). Provided there is at least one birthday, a gap of length $g$ or greater occurs if and only if $E_d$ occurs for some $d$. That is, $$ P(\text{no gaps of length $g$})=P(E_1^c\cap E_2^c\cap \dots\cap E_{365}^c) $$ To compute this, we use the principle of inclusion-exclusion: $$ P(\text{no gaps of length $g$})=\sum_S(-1)^{|S|}P(E_{d(1)}\cap E_{d(2)}\cap \dots \cap E_{d(k)}) $$ where $S=\{d(1),d(2),\dots,d(k)\}$ ranges over all $2^{365}$ subsets of days.

We must figure out the probabilities of the intersections $E_{d(1)}\cap E_{d(2)}\cap \cdots \cap E_{d(k)}$. If any of the intervals $[d(i),d(i)+g]$ and $[d(j),d(j)+g]$ overlap, then the probability of this intersection is zero; the $E_d$ were defined carefully so this would be true. Otherwise, we use the principle of inclusion-exclusion on this smaller problem to compute $$ p_k:= P(E_{d(1)}\cap \cdots\cap E_{d(k)}) = \sum_{j=0}^k(-1)^j\binom{k}j\left(1-\frac{kg+j}{365}\right)^n $$ Finally, we must count for each $k$ the number of ways to choose $\{d(1),\dots,d(k)\}\subseteq \{1,2,\dots,365\}$ so that the intervals $[d(i),d(i)+g]$ are pairwise non-overlapping. I claim this number is $$ n_k=\binom{365-gk}{k}+g\binom{365-gk-1}{k-1} $$ I leave it to you to verify this is correct. As a hint, the first summand counts choices where none of the gaps cross between two different years, and the second counts ones that do.
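One way to check the claimed count is by brute force on a small cyclic "year". The sketch below is only an illustration: the names `n_k_formula` and `n_k_brute` and the parameter `days` (year length) are mine, and it assumes $k(g+1) \le$ `days`, which is the range the final sum uses.

```python
from itertools import combinations
from math import comb


def n_k_formula(k, g, days):
    """Claimed number of ways to place k disjoint arcs of g+1 days on a cycle."""
    if k == 0:
        return 1  # the empty selection
    return comb(days - g * k, k) + g * comb(days - g * k - 1, k - 1)


def n_k_brute(k, g, days):
    """Enumerate start days directly and keep selections with disjoint arcs."""
    count = 0
    for starts in combinations(range(days), k):
        covered = {(s + i) % days for s in starts for i in range(g + 1)}
        # Arcs are pairwise disjoint exactly when no day is covered twice.
        if len(covered) == k * (g + 1):
            count += 1
    return count


if __name__ == "__main__":
    for days in range(4, 13):
        for g in range(1, 4):
            for k in range(days // (g + 1) + 1):
                assert n_k_formula(k, g, days) == n_k_brute(k, g, days)
    print("formula agrees with brute force on all small cases tried")
```

Agreement on small cases is evidence for the formula, not a proof of it.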

We finally get that $$ P(\text{no gaps of length $g$})=\sum_{k=0}^{\left\lfloor \frac{365}{g+1}\right\rfloor }(-1)^{k}n_kp_k $$
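The alternating sum can suffer from cancellation in floating point, so here is a sketch that evaluates it exactly with integers and `fractions.Fraction`. The function name `prob_no_gap` and the `days` parameter are my own. The code assumes $n \ge 1$, $g <$ `days`, equally likely birthdays and a cyclic year with no leap day.

```python
from fractions import Fraction
from math import comb


def prob_no_gap(n, g, days=365):
    """P(no run of g or more consecutive empty days) for n people, cyclic year."""
    total = 0
    for k in range(days // (g + 1) + 1):
        # n_k: ways to choose k start days whose intervals [d, d+g] are disjoint.
        if k == 0:
            n_k = 1
        else:
            n_k = comb(days - g * k, k) + g * comb(days - g * k - 1, k - 1)
        # p_k multiplied by days**n, kept as an exact integer.
        p_k_scaled = sum(
            (-1) ** j * comb(k, j) * (days - k * g - j) ** n for j in range(k + 1)
        )
        total += (-1) ** k * n_k * p_k_scaled
    return Fraction(total, days ** n)


if __name__ == "__main__":
    p = prob_no_gap(400, 4)
    print("P(no gap of 4+ days):      ", float(p))
    print("P(at least one such gap):  ", float(1 - p))
```

For the original question, the quantity of interest is `1 - prob_no_gap(400, 4)`, the chance of seeing at least one run of four or more empty days.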

Mike Earnest
  • 75,930
  • Thank you for your answer. I understand almost all of it except for the second application of the inclusion-exclusion principle. It looks almost like what I found here:

    https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle#Special_case

    But this expression is for the union of events. Can you explain how to get to your result? What does 'n' represent in your expression? How did you get the term that is raised to the power of n?

    – Vincent Nguyen Jan 28 '19 at 23:44
  • $n$ is the number of people in your office. Look at the $j=0$ term. This is the probability that no one is born on any of the days $d(i)+j$ for $i=1,\dots,k$ and $j=1,\dots,g$. There are $kg$ days to be avoided, and $n$ people, so the probability is $(1-kg/365)^n$. But we also want someone to be born on each day $d(i)$, so we subtract out the events that someone is not born on each of those days. There is an extra day to avoid, so each of these is $(1-(kg+1)/365)^n$. Then we add back in the doubly subtracted events, etc. @VincentNguyen – Mike Earnest Jan 29 '19 at 00:38
1

I'm thinking about it from another approach.

For any specific day (January 1, for example), the probability that none of the 400 people has that day as a birthday is $p_1 = (364/365)^{400}$ (assuming each day is equally likely and ignoring leap years).

For a specific run of $g$ consecutive days (January 1 through January 4, for example), it would be $p_g = ((365-g)/365)^{400}$. (I know it does not work for a small number of people, say 2 people and a gap of 200, but it seems to be at least approximately correct when $n$ is large.)

How many such gaps are there? 365.

In summary, my answer is $\approx 1 - 365 * ((365-g)/365)^n$
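To see when this estimate is usable, here is a small sketch that evaluates it. The name `first_order_estimate` is mine. Note that $365 \cdot p_g$ is the expected number of empty $g$-day windows, so by the union bound $1 - 365 \cdot p_g$ is a lower bound on the probability of no gap, and it only says something when that expected count is below 1.

```python
def first_order_estimate(n, g, days=365):
    """Return (1 - days * p_g, days * p_g) for the union-bound style estimate."""
    p_g = ((days - g) / days) ** n   # a fixed window of g days is empty
    expected_windows = days * p_g    # expected number of empty g-day windows
    return 1 - expected_windows, expected_windows


if __name__ == "__main__":
    for g in (4, 7, 10):
        estimate, expected = first_order_estimate(400, g)
        print(f"g={g}: expected empty windows={expected:.4f}, estimate={estimate:.4f}")
```

For $n = 400$ and $g = 4$ the expected number of empty windows exceeds 1, so the estimate goes negative and cannot be read as a probability; it becomes reasonable only for larger $g$.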

Update 1

For the probability of having a gap of $g$:

$$ P(g) = 365 * ((365-g)/365)^n - \sum_{i=1}^{g-1}P(g+i) $$ These are conditional probabilities, but since we are summing them, what we should really do is subtract the intersections. This complicates the problem very quickly: for $g>1$ the recursive expansion eventually reaches gaps of 180+ days, where the formula does not hold even approximately.

MoonKnight
  • 2,179
  • 1
    $p_g = ((365-g)/365)^{400}$ is true for any four days (they do not have to be consecutive). Also, I think two gaps are not independent (knowing 1/1 - 1/4 was empty changes the probability of 1/2 - 1/5), so you must include the conditional probability when you calculate the joint distribution via the multiplication rule. Furthermore, I think your way of counting would include duplicates if your strategy is to compute $p_g$ for all possible $g$. – Vincent Nguyen Jan 15 '19 at 23:44
  • Yes, it is for any 4 days; the concept of consecutive is a tricky part. I have updated the solution regarding your valid concern. However, there is another tricky part of "consecutive $g$ days" that I could not figure out. For example, if we have 2 people, the probability $p_{200} = p_{300} = p_{364} = 1/365$ and $p_{181} = 3/365, p_{180} = 5/365$. I could not generalize my formula to such large $g$. – MoonKnight Jan 16 '19 at 00:28