Why is the Derangement Probability so Close to $\frac{1}{e}$?

Question

A derangement is a permutation $\sigma$ of $\{1,2,3,\dots,n\}$ such that $\sigma(i) \neq i$ for every $i$. A common application of inclusion/exclusion in undergraduate combinatorics and probability classes is to compute the number of derangements, and in the process show that the probability a random permutation is a derangement approaches $\frac{1}{e}$ for large $n$.

There's also a "standard" intuition for this probability, which goes roughly as follows: Let $E_i$ be the event that $\sigma(i)=i$.

1) For a given $i$, the probability of $E_i$ is exactly $\frac{1}{n}$.

2) If $n$ is large, then these events should be "nearly" independent ($E_i$ occurring means that $\sigma(j) \neq i$ for every $j \neq i$, making it a tiny bit more likely that $\sigma(j)=j$, but this shouldn't have much of an effect for large $n$), so we'd expect the probability that none of the $E_i$ occur to be roughly $\left(1-\frac{1}{n}\right)^n$.

3) For large $n$, $\left(1-\frac{1}{n} \right)^n \approx \frac{1}{e}$.

Now the last approximation alone already has an error proportional to $\frac{1}{n}$. What's surprising, then, is that after working through the inclusion/exclusion you find that the probability is not just approximately $\frac{1}{e}$, but incredibly close: the error is less than $\frac{1}{(n+1)!}$.
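To make the gap concrete, here is a small sketch comparing the two errors numerically. The function name `derangement_probability` is only illustrative, and the exact value is taken from the standard inclusion/exclusion sum $\sum_{j=0}^n (-1)^j/j!$.

```python
from fractions import Fraction
from math import e, factorial

def derangement_probability(n: int) -> Fraction:
    """Exact D_n / n! from the inclusion/exclusion sum."""
    return sum(Fraction((-1) ** j, factorial(j)) for j in range(n + 1))

for n in (2, 5, 10):
    # Error of the independence heuristic (1 - 1/n)^n.
    heuristic_error = abs((1 - 1 / n) ** n - 1 / e)
    # Error of the true derangement probability.
    exact_error = abs(float(derangement_probability(n)) - 1 / e)
    print(f"n={n:2d}  heuristic error={heuristic_error:.2e}  exact error={exact_error:.2e}")
```

The heuristic error shrinks only like $\frac{1}{n}$, while the exact error drops below double-precision resolution after a couple of dozen terms.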

Is there some alternative intuitive explanation for the $\frac{1}{e}$ asymptotic probability that gives a sense of why the convergence is so fast?

Not an intuitive explanation, but since no one has mentioned it: the probability is $\frac{\lfloor\frac{n!}{e}+\frac{1}{2}\rfloor}{n!}$ (since the number of derangements is $\lfloor\frac{n!}{e}+\frac{1}{2}\rfloor$), or i.e. $\frac{\left[\frac{n!}{e}\right]}{n!}$, where $[x]$ is the nearest integer to $x$. This intuitively gets close to $\frac{1}{e}$, since the numerator of $\frac{\frac{n!}{e}}{n!}$ is off by at most $\frac{1}{2}$, and this error gets less and less significant as $n$ gets huge. — user26486, Feb 20 '15 at 23:24
I still find it surprising and fascinating that this probability doesn't obey an (abstract) 0-1 law; that there is a nontrivial limiting probability is an amazing thing to me. — Steven Stadnicki, Feb 20 '15 at 23:56
doesn't it seem odd the error is less than $\frac{1}{(n+1)!} $ but $S_n$ has only $n!$ elements? — cactus314, Feb 21 '15 at 14:19
@johnmangual: An integer approximating an integer in the interval $[0,n!]$ either has zero error or error greater than one part in $n!.$ A real number approximating an integer in that interval can, of course, get much closer. I found it surprising (until I thought about it a bit) that the real number $n!/e$ is nearly an integer. Not only is it nearly an integer--it's nearly the right integer. — Will Orrick, Feb 21 '15 at 23:19
@KevinCostello: It's hard for me to imagine what sort of argument might meet your requirements. You point out that $(1-1/n)^n$ doesn't converge to $1/e$ all that quickly. Wouldn't you need to start with a fast-converging characterization of $1/e$ to have any hope of doing what you want? The power series, of course, is such a fast-converging characterization, but then the original inclusion-exclusion argument seems the most direct way to relate the derangements problem to that characterization. (Which seems to be the route most of the posters attempting to answer are taking.) — Will Orrick, Feb 21 '15 at 23:27

score 18 · Answer 1 · answered May 22 '13 at 18:29

In fact a much stronger $e$-related statement is true: let $X_i$ denote the number of $i$-cycles in a random permutation on $n$ elements. Then for fixed $k$, as $n \to \infty$ the random variables $X_1, X_2, \dots, X_k$ are asymptotically independently Poisson with rates $1, \frac{1}{2}, \dots, \frac{1}{k}$. This observation about derangements is a special case applied to $X_1$. See this blog post for details, which proves this fact as a corollary of the exponential formula. The convergence rate is presumably also controlled by the exponential formula but I haven't worked out the details.
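For $X_1$ alone the comparison with Poisson(1) can be checked exactly, without simulation. The sketch below assumes the standard count $\binom{n}{k} D_{n-k}$ for permutations with exactly $k$ fixed points; the function names are illustrative.

```python
from fractions import Fraction
from math import comb, exp, factorial

def derangements(n: int) -> int:
    """D_n via the inclusion/exclusion sum n! * sum_j (-1)^j / j!."""
    return sum((-1) ** j * (factorial(n) // factorial(j)) for j in range(n + 1))

def fixed_point_pmf(n: int, k: int) -> Fraction:
    """Exact P(X_1 = k): choose the k fixed points, derange the rest."""
    return Fraction(comb(n, k) * derangements(n - k), factorial(n))

n = 12
for k in range(4):
    exact = float(fixed_point_pmf(n, k))
    poisson = exp(-1) / factorial(k)  # Poisson(1) probability of k
    print(f"k={k}  exact={exact:.10f}  Poisson(1)={poisson:.10f}")
```

Already at $n = 12$ the two columns agree to many decimal places for small $k$.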

... where the $n\to\infty$ is needed for the asymptotic Poisson, while e.g. $E[X_1]=1$ holds exactly. — Hagen von Eitzen, May 22 '13 at 18:31

score 11 · Answer 2 · edited Nov 17 '20 at 19:16

Note: The convergence is so fast because the derangement number $D_n$ and the Taylor series expansion of $e^x$ are closely related.

The intuition behind it is (for me) trying to develop a better feeling for the mechanism of the inclusion/exclusion principle, which encodes this relationship.

According to the paper referenced by the OP, the derangement number $D_n$ is

\begin{align*} D_n=n!\sum_{j=0}^n(-1)^j\frac{1}{j!} \end{align*}

Let's compare it with the Taylor series of $e^x$ evaluated at $x=-1$:

\begin{align*} \frac{1}{e}=\sum_{j=0}^\infty(-1)^j\frac{1}{j!} \end{align*}

Observe that $\frac{D_n}{n!}$ is the $n$-th Taylor polynomial of $e^x$ evaluated at $x=-1$.

We also note that the series for $e^{-1}$ satisfies the requirements of the alternating series test:

The terms alternate in sign.
They are decreasing in absolute value.
They approach $0$.

Therefore, by the alternating series error estimate, if we stop at the $n$-th term the absolute error is less than the absolute value of the $(n+1)$-st term. So the error satisfies

\begin{align*} \left|\frac{D_n}{n!}-\frac{1}{e}\right|<\frac{1}{(n+1)!} \end{align*}

Note: This and the nice fact that $D_n$ is the nearest integer to $\frac{n!}{e}$ can be found e.g. in Stirling’s Approximation and Derangement Numbers by T. Zaslavsky
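Both claims (the error bound and the nearest-integer property) are easy to check numerically. The following is a sketch using Python's `decimal` module for a high-precision value of $\frac{1}{e}$; the 60-digit precision and the range $1 \le n \le 25$ are arbitrary choices for illustration.

```python
from decimal import Decimal, getcontext
from math import factorial

getcontext().prec = 60  # plenty of digits to resolve 1/(n+1)! for n <= 25
INV_E = Decimal(-1).exp()  # 1/e to 60 significant digits

def derangement_number(n: int) -> int:
    """D_n = n! * sum_{j=0}^{n} (-1)^j / j!, computed in exact integers."""
    return sum((-1) ** j * (factorial(n) // factorial(j)) for j in range(n + 1))

def check(n: int) -> tuple[bool, bool]:
    """Return (error bound holds, D_n is the nearest integer to n!/e)."""
    d_n = derangement_number(n)
    error = abs(Decimal(d_n) / Decimal(factorial(n)) - INV_E)
    bound = Decimal(1) / Decimal(factorial(n + 1))
    nearest = int((Decimal(factorial(n)) * INV_E).to_integral_value())
    return error < bound, nearest == d_n

for n in range(1, 26):
    print(n, derangement_number(n), check(n))
```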

score 5 · Answer 3 · edited Apr 13 '17 at 12:21

I'm not sure this is really better than the inclusion exclusion, but it is different:

Let $d_n$ be the number of derangements. Then we have $$d_n = (n-1) (d_{n-1} + d_{n-2})$$ (see here). Let $p_n = d_n/n!$ be the probability of a derangement. Then $$p_n = \frac{n-1}{n} p_{n-1} + \frac{1}{n} p_{n-2} \ \mbox{which implies} \ p_{n} - p_{n-1} = - \frac{p_{n-1} - p_{n-2}}{n} .$$

So $p_{n} - p_{n-1}$ decays factorially fast, and thus $p_n = \sum_{m=2}^n (p_m - p_{m-1})$ is factorially close to $\sum_{m=2}^{\infty} (p_m - p_{m-1}) = \lim_{n \to \infty} p_n$.
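Unrolling the relation $p_n - p_{n-1} = -\frac{p_{n-1}-p_{n-2}}{n}$ from $p_1 - p_0 = -1$ gives $p_n - p_{n-1} = \frac{(-1)^n}{n!}$. Here is a short sketch that checks this with exact rational arithmetic, taking $d_0 = 1$ and $d_1 = 0$ as the starting values (the function name is illustrative).

```python
from fractions import Fraction
from math import factorial

def derangement_counts(n_max: int) -> list[int]:
    """d_0..d_{n_max} from d_n = (n-1)(d_{n-1} + d_{n-2}), with d_0 = 1, d_1 = 0."""
    d = [1, 0]
    for n in range(2, n_max + 1):
        d.append((n - 1) * (d[n - 1] + d[n - 2]))
    return d[: n_max + 1]

d = derangement_counts(12)
p = [Fraction(d_n, factorial(n)) for n, d_n in enumerate(d)]

# Successive differences p_n - p_{n-1}; each should equal (-1)^n / n!.
for n in range(1, len(p)):
    print(n, p[n] - p[n - 1], Fraction((-1) ** n, factorial(n)))
```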

score 4 · Answer 4 · answered Feb 21 '15 at 14:06

I agree with Qiaochu that your question is very general. In the paper Limits of logarithmic combinatorial structures by Arratia-Barbour-Tavaré, they discuss the conditioning relation. Let $C_1, \dots, C_N$ be the number of "components" of various sizes (e.g. cycles in a permutation). Then using conditional probability

$$ \mathbb{P}\big[ C_1 = c_1, \dots, C_N = c_N \big] = \mathbb{P}\big[ Z_1 = c_1, \dots, Z_N = c_N \bigg| \sum_{i=1}^N i Z_i = N \big]$$

where $Z_i$ are independent random variables. Here $N = \sum i Z_i$ is the size of our combinatorial structure.

Cycles of a random permutation definitely have this property. Since we are worried the events $E_i = \{ \sigma(i) = i\} $ are not quite independent, let's just forget it.

Let's just build a random permutation with independent random numbers $C_k$ of cycles of various sizes $1 \leq k \leq N$ and the size of this permutation is $N = \sum k C_k$. Then we pick the permutation of the size we want using generating functions or conditional probability.

The ABT paper requires a second axiom which definitely is true for random permutations. For some parameter $\theta \in \mathbb{R}$:

$$ k \cdot \mathbb{E}[Z_k] = \theta \tag{$\ast$} $$

In fact, what you will typically get is that $Z_k$ is Poisson distributed with mean $\frac{\theta}{k}$.

This technique is called Poissonization and it's used here and there in probability, combinatorics and statistical physics.
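For permutations (the case $\theta = 1$) the conditioning relation can be checked by brute force on a small example. The sketch below enumerates all permutations of $N = 5$ points and compares the cycle-type frequencies with independent Poisson$(1/k)$ variables conditioned on $\sum_k k Z_k = N$; the helper names are illustrative and not taken from the ABT paper.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial

def cycle_type(perm):
    """Return (c_1, ..., c_N), where c_k counts the k-cycles of perm."""
    n = len(perm)
    seen = [False] * n
    counts = [0] * n
    for start in range(n):
        if seen[start]:
            continue
        length, j = 0, start
        while not seen[j]:  # walk around the cycle containing `start`
            seen[j] = True
            j = perm[j]
            length += 1
        counts[length - 1] += 1
    return tuple(counts)

def poisson_weight(counts, theta=1):
    """Unnormalised P(Z_1=c_1, ..., Z_N=c_N) for independent Z_k ~ Poisson(theta/k).
    The factors exp(-theta/k) are the same for every outcome, so they are dropped."""
    w = Fraction(1)
    for k, c in enumerate(counts, start=1):
        w *= Fraction(theta, k) ** c / factorial(c)
    return w

N = 5
tally = Counter(cycle_type(p) for p in permutations(range(N)))
uniform = {c: Fraction(m, factorial(N)) for c, m in tally.items()}

# Conditioning on sum k*Z_k = N restricts to exactly the cycle types in `tally`.
total = sum(poisson_weight(c) for c in tally)
conditioned = {c: poisson_weight(c) / total for c in tally}

print(conditioned == uniform)
```

The dropped exponential factors cancel in the conditional probability, which is why exact rational arithmetic suffices here.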

Ordered Cycle Lengths in a Random Permutation, L. Shepp and S. Lloyd
The number of cycles of a random permutation
Statistical Physics (pdf) see the discussion of Grand Canonical Ensemble

I think it's interesting to note that Larry Shepp is Professor at the Statistics Department of Wharton Business School, although his health is deteriorating.

score 3 · Answer 5 · answered Feb 22 '15 at 21:38

This is a great question bridging mathematics and philosophy. It raises the question “why should the number $e$ show up here at all?” It also got me thinking about what “intuitive” means; for example, the fact that $(1-1/n)^n\to1/e$ is not particularly intuitive unless you've seen this particular computation before. I’d like to interpret “intuitive” to mean “easy to see without much computation, assuming familiarity with some widely applicable general techniques.” Given that, there's a nice general way to see, all at once, why $e$ shows up when considering derangements, why the limit is $1/e$ in particular, and why the convergence is so fast:

In a nutshell, $e$ shows up since the definition of derangements can be expressed as a binomial convolution, leading to an expression of the exponential generating function $D(z)$ for the number $D_n$ of derangements as $e^{-z}/(1-z)$. $D(z)$ has a unique pole at $z=1$, which is simple with residue $e^{-1}$, so it’s immediate that $D_n/n!\to e^{-1}$. The convergence is fast in essence because there are no other poles and because the series for $e^{-z}$ converges quickly.

While I think this is intuitive, it hides a lot, so here are a few details. A good reference for the following is Philippe Flajolet and Robert Sedgewick, Analytic Combinatorics, Cambridge University Press (2009). Let $D_n$ denote the number of derangements of an $n$-set $X$. To understand $D_n$, let's count all permutations of $X$, grouped according to the number of fixed points. Given a $k$-element subset of $X$, there are exactly $D_{n-k}$ permutations which fix exactly these $k$ points (since the other $n-k$ must all move). As there are $\binom{n}{k}$ such subsets, we see immediately that $$ n! = \sum_{k=0}^n \binom{n}{k} D_{n-k}. \tag{1}$$

The sum on the right-hand side is a “binomial convolution” of $\{D_n\}$ with the constant sequence $\{1\}$, so we’ll get a simple formula if we write down the exponential generating functions (egfs) of the sequences involved (see e.g. the Wikipedia article). The reason is that if $A(z)=\sum a_n z^n/n!$ and $B(z)=\sum b_n z^n/n!$ are the egfs of the sequences $\{a_n\}$ and $\{b_n\}$, then the egf of the binomial convolution $c_n=\sum_k \binom{n}{k}a_k b_{n-k}$ is just the product $A(z)B(z)$.

Now, the egf of $\{n!\}$ is $1/(1-z)$, and the egf of the constant sequence $\{1\}$ is $e^z$. If we let $D(z)$ denote the egf of the derangement sequence $\{D_n\}$, then the binomial convolution $(1)$ becomes $$\frac1{1-z}=e^z D(z).$$

Thus we see $D(z)=e^{-z}/(1-z)$. The asymptotic behavior of the coefficients $D_n/n!$ is governed by the local behavior at the pole $z=1$; since $D(z) \approx e^{-1}/(1-z) = e^{-1}(1+z+z^2+\cdots)$ near the pole at $z=1$, we see immediately that $D_n/n!\to e^{-1}$. Intuitively, the convergence is rapid because the series for $e^{-z}$ converges quickly and there are no other poles to interfere.
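The egf identity can be checked directly with truncated power series. The following sketch multiplies the series for $e^{-z}$ and $1/(1-z)$, reads off $D_n$ from the coefficients, and verifies the binomial convolution $n! = \sum_k \binom{n}{k} D_{n-k}$; the truncation order and helper name are arbitrary choices.

```python
from fractions import Fraction
from math import comb, factorial

def series_product(a, b):
    """Cauchy product of two truncated power series given as coefficient lists."""
    n = min(len(a), len(b))
    return [sum(a[k] * b[m - k] for k in range(m + 1)) for m in range(n)]

ORDER = 10
exp_minus_z = [Fraction((-1) ** n, factorial(n)) for n in range(ORDER)]  # e^{-z}
geometric = [Fraction(1)] * ORDER  # 1/(1-z)

# The coefficient of z^n in e^{-z}/(1-z) is D_n/n!.
egf_coeffs = series_product(exp_minus_z, geometric)
D = [int(c * factorial(n)) for n, c in enumerate(egf_coeffs)]
print(D)

# Check the binomial convolution for every n below the truncation order.
for n in range(ORDER):
    print(n, sum(comb(n, k) * D[n - k] for k in range(n + 1)) == factorial(n))
```

Multiplying by $1/(1-z)$ just takes partial sums of the coefficients of $e^{-z}$, which is the inclusion/exclusion formula again.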

Nicolas · Answer 6 · 2015-02-21T12:49:33.480

Let $N_{n}$ be the number of permutations in $\mathfrak{S}_{n}$ with no fixed points, and let $A_{i}$ be the set of permutations of $\left\{ 1,\ldots,n\right\}$ that have $i$ as a fixed point. We have $$\#\left(A_{i}\right)=\left(n-1\right)!$$ since $A_{i}$ corresponds to $\mathfrak{S}_{n-1}$ (permutations of $\left\{ 1,\ldots,i-1,i+1,\ldots,n\right\}$). More generally, the number of permutations that fix each of $k$ distinct points $i_{1},\ldots,i_{k}\in\left\{ 1,\ldots,n\right\}$ (and possibly others) is $$\#\left(\bigcap_{j\in\left\{ i_{1},\ldots,i_{k}\right\}}A_{j}\right)=\left(n-k\right)!$$ since these are exactly the permutations lying in all of $A_{i_{1}},\ldots,A_{i_{k}}$, and the remaining $n-k$ points can be permuted freely.

Now we use the Poincaré formula (also known as the inclusion/exclusion principle mentioned in the OP): $$\#\left(\bigcup_{i=1}^{n}A_{i}\right)=\sum_{I\subset\left\{ 1,\ldots,n\right\} ,I\neq\emptyset}\left(-1\right)^{\#I+1}\#\left(\bigcap_{i\in I}A_{i}\right)=\sum_{I\subset\left\{ 1,\ldots,n\right\} ,I\neq\emptyset}\left(-1\right)^{\#I+1}\left(n-\#I\right)!=\sum_{k=1}^{n}\begin{pmatrix}n\\ k \end{pmatrix}\left(-1\right)^{k+1}\left(n-k\right)!$$ whence, using $\begin{pmatrix}n\\ k \end{pmatrix}\left(n-k\right)!=\frac{n!}{k!}$, $$N_{n}=\#\left(\mathfrak{S}_{n}\setminus\bigcup_{i=1}^{n}A_{i}\right)=n!-\#\left(\bigcup_{i=1}^{n}A_{i}\right)=n!-\sum_{k=1}^{n}\begin{pmatrix}n\\ k \end{pmatrix}\left(-1\right)^{k+1}\left(n-k\right)!=\sum_{k=0}^{n}\left(-1\right)^{k}\frac{n!}{k!}.$$ Then, to get the probability $p$ of a derangement, we divide by the total number $n!$ of permutations and get $$p=\sum_{k=0}^{n}\frac{\left(-1\right)^{k}}{k!}\longrightarrow e^{-1}$$ when $n\rightarrow+\infty$.

Here my probability space $\left(\Omega,\mathcal{F},\mathbb{P}\right)$ is given by $\Omega=\mathfrak{S}_{n}$ , $\mathcal{F}=\mathcal{P}\left(\Omega\right)$ and $\mathbb{P}\left(A\right)=\frac{\#A}{n!}$ for all $A\subset\mathfrak{S}_{n}$ .
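As a sanity check, the inclusion/exclusion count can be compared against a brute-force enumeration of $\mathfrak{S}_{n}$ for small $n$. This is only a sketch, with illustrative function names, and the enumeration is feasible only for small $n$.

```python
from itertools import permutations
from math import comb, factorial

def count_derangements_brute_force(n: int) -> int:
    """Count permutations sigma of {0, ..., n-1} with sigma(i) != i for all i."""
    return sum(
        all(sigma[i] != i for i in range(n))
        for sigma in permutations(range(n))
    )

def count_derangements_inclusion_exclusion(n: int) -> int:
    """n! minus #(A_1 U ... U A_n); the k = 0 term of the sum is n! itself."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))

for n in range(8):
    brute = count_derangements_brute_force(n)
    formula = count_derangements_inclusion_exclusion(n)
    print(n, brute, formula, formula / factorial(n))
```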
