Summary

I believe that the sum $$\sum_{i=1}^\infty{i\left((1-p^{i+1})^m-(1-p^{i})^m\right)}$$ is $O(\log m)$, where $0<p<1$ and $m \geq 1$. However, I have not been able to prove it formally, despite many attempts (binomial expansions, differentiation, upper bounds, and combinations thereof).

Context

I was studying a variation of the skip list data structure. Each key in a skip list is associated with a "tower" of a certain height. The height is chosen by repeatedly tossing a coin: with probability $p$ the tower grows by one level, and with probability $1-p$ we stop. Consider $m$ keys and the random variable "maximum level of the towers of those $m$ keys". The above expression is the expected value of this random variable. Strangely enough, the standard literature on skip lists uses a kind of workaround to prove that the number of levels is $O(\log n)$, where $n$ is the number of keys in the skip list.
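
As a sanity check on the claim that the sum is the expected maximum tower height, here is a minimal Python sketch (not part of the original experiment) comparing a truncated evaluation of the sum with a direct simulation of the coin tossing. The function names, the truncation parameter `terms`, and the fixed seed are illustrative assumptions.

```python
import random


def expected_max_level(m: int, p: float, terms: int = 2000) -> float:
    """Truncated sum_{i>=1} i * ((1 - p^(i+1))^m - (1 - p^i)^m)."""
    total = 0.0
    for i in range(1, terms + 1):
        total += i * ((1.0 - p ** (i + 1)) ** m - (1.0 - p ** i) ** m)
    return total


def tower_height(p: float, rng: random.Random) -> int:
    """Grow by one level with probability p, stop with probability 1 - p."""
    height = 0
    while rng.random() < p:
        height += 1
    return height


def simulated_max_level(m: int, p: float, trials: int, seed: int = 0) -> float:
    """Average, over `trials` runs, of the tallest of m independent towers."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += max(tower_height(p, rng) for _ in range(m))
    return total / trials
```

For example, `expected_max_level(16, 0.5)` and `simulated_max_level(16, 0.5, 20000)` should agree up to sampling noise. For $p$ close to 1 the default truncation may be too short and `terms` should be increased.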

Experiments with Mathematica

It is possible to numerically support my conjecture. The following is a Mathematica experiment.

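(* Truncation index for the infinite sum, and the largest m to plot *)
maxi = 100; 
maxm = 10^9; 
(* G[m]: the sum truncated at i = maxi (the i = 0 term vanishes) *)
G[m_] := Sum[i*((1 - p^(i + 1))^m - (1 - p^i)^m), {i, 0, maxi}]; 
p = 0.5; 
(* Offset between the sum and Log[1/p, m], measured at m = maxm *)
delta = G[m] - Log[1/p, m] /. m -> maxm; 
(* First chart: the sum and the logarithm; second chart: their difference minus delta *)
DiscretePlot[{G[m], Log[1/p, m]}, {m, 1, maxm, maxm/200}, 
 PlotLegends -> "Expressions"]
DiscretePlot[G[m] - Log[1/p, m] - delta, {m, 1, maxm, maxm/100}]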

This produces a chart of both the sum and the log function up to $m=10^9$, and a chart of their difference. The latter chart looks a bit odd: I do not know whether it is a numerical artifact or a genuine ripple of the function itself.

  • Something that could work is splitting the sum. First take $i \le O(\log_{1/p} m)$ and then the rest. The first sum is bounded by what you want, and for the rest there might be a simple approximation since $p^i \ll 1/m^k$ for some constant $k$. – Sandeep Silwal Jun 21 '19 at 17:24

1 Answer

Let's denote your sum by $S(m,p)$. First of all, we have $$S(m,p)=\sum_{k=1}^{\infty}\big(1-(1-p^k)^m\big)=\color{blue}{\sum_{j=1}^{m}(-1)^{j-1}\binom{m}{j}\frac{p^j}{1-p^j}}.$$ (The first equality is obtained, with $a_k=1-(1-p^k)^m$, from $$\sum_{k=1}^{n}k(a_k-a_{k+1})=\sum_{k=1}^{n}ka_k-\sum_{k=1}^{n+1}(k-1)a_k=\sum_{k=1}^{n}a_k-na_{n+1},$$ where $na_{n+1}\le nmp^{n+1}\to 0$ as $n\to\infty$; to get the second, expand $(1-p^k)^m$ and sum over $k$ first.)
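
The two equalities above can be checked numerically. The following is a minimal Python sketch, assuming a rational $p$ so that the finite alternating sum can be evaluated exactly with `fractions.Fraction`; the function names and the truncation parameter `terms` are illustrative choices, not part of the answer.

```python
from fractions import Fraction
from math import comb


def original_sum(m: int, p: Fraction, terms: int = 200) -> float:
    """Truncated sum_{i>=1} i * ((1 - p^(i+1))^m - (1 - p^i)^m), in floats."""
    q = float(p)
    return sum(
        i * ((1.0 - q ** (i + 1)) ** m - (1.0 - q ** i) ** m)
        for i in range(1, terms + 1)
    )


def tail_sum(m: int, p: Fraction, terms: int = 200) -> float:
    """Truncated sum_{k>=1} (1 - (1 - p^k)^m), in floats."""
    q = float(p)
    return sum(1.0 - (1.0 - q ** k) ** m for k in range(1, terms + 1))


def alternating_binomial_sum(m: int, p: Fraction) -> Fraction:
    """sum_{j=1}^m (-1)^(j-1) * C(m, j) * p^j / (1 - p^j), evaluated exactly."""
    return sum(
        (-1) ** (j - 1) * comb(m, j) * p ** j / (1 - p ** j)
        for j in range(1, m + 1)
    )
```

For $m=2$ and $p=1/2$ the exact value is $2-\tfrac13=\tfrac53$, which the two truncated sums reproduce up to rounding error. The alternating sum involves large cancelling binomial terms, so exact arithmetic is the safer choice there for large $m$.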

A "tool" for this kind of sum is the Nørlund–Rice integral (easily verified by oneself): $$S(m,p)=-\frac{m!}{2\pi i}\oint_{C}F(m,p,z)\,dz,\qquad F(m,p,z)=\frac{(p^{z}-1)^{-1}}{\prod_{j=0}^{m}(z+j)}$$ where $C$ is a closed contour encircling $z=-1,\ldots,-m$ and no other poles of the integrand.

Now, if $C_N$ is the circle $|z|=-(2N+1)\pi/\ln p$ oriented counterclockwise, then $$0=\lim_{N\to\infty}\frac{m!}{2\pi i}\oint_{C_N}F(m,p,z)\,dz=-S(m,p)+m!\sum_{n\in\mathbb{Z}}\operatorname*{Res}_{z=2n\pi i/\ln p}F(m,p,z);$$ evaluating the residues gives (not just an asymptotic formula, but an exact one!) $$S(m,p)=-\frac{1}{2}-\frac{H_m}{\ln p}+\frac{1}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\Im\prod_{j=1}^{m}\Big(1+\frac{2n\pi i}{j\ln p}\Big)^{-1},$$ where $H_m=\sum\limits_{j=1}^{m}\dfrac{1}{j}$ is the $m$-th harmonic number.
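
The closed form can be compared against a direct evaluation of $\sum_k\big(1-(1-p^k)^m\big)$. Below is a minimal Python sketch under stated assumptions: the series over $n$ is truncated at a hypothetical `n_terms`, which is adequate for $m\ge 2$ but converges slowly for $m=1$, and the function names are illustrative.

```python
import math


def direct_sum(m: int, p: float, terms: int = 2000) -> float:
    """Truncated sum_{k>=1} (1 - (1 - p^k)^m); expm1/log1p limit rounding error."""
    return sum(-math.expm1(m * math.log1p(-(p ** k))) for k in range(1, terms + 1))


def oscillating_term(m: int, p: float, n_terms: int = 2000) -> float:
    """(1/pi) * sum_{n>=1} (1/n) * Im prod_{j=1}^m (1 + 2*n*pi*i/(j*ln p))^(-1)."""
    ln_p = math.log(p)
    total = 0.0
    for n in range(1, n_terms + 1):
        z = complex(0.0, 2.0 * n * math.pi / ln_p)
        inv_product = complex(1.0, 0.0)
        for j in range(1, m + 1):
            # Accumulate the inverse product so it underflows instead of overflowing.
            inv_product /= 1.0 + z / j
        total += inv_product.imag / n
    return total / math.pi


def exact_formula(m: int, p: float, n_terms: int = 2000) -> float:
    """-1/2 - H_m / ln p + oscillating term, with the n-series truncated."""
    harmonic = sum(1.0 / j for j in range(1, m + 1))
    return -0.5 - harmonic / math.log(p) + oscillating_term(m, p, n_terms)
```

Calling `oscillating_term(m, 0.5)` for several values of `m` gives a feel for how small the last term is compared with $-H_m/\ln p$; this may be relevant to the ripple seen in the second chart of the question, although that link is only a guess.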

This confirms $\color{blue}{\dfrac{S(m,p)}{\ln m}\underset{m\to\infty}{\longrightarrow}-\dfrac{1}{\ln p}}$, as $H_m=\ln m+O(1)$ when $m\to\infty$.
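
The limit can also be observed numerically. The sketch below assumes the standard expansion $H_m=\ln m+\gamma+O(1/m)$, with $\gamma$ the Euler–Mascheroni constant, and drops the oscillating term; the function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def direct_sum(m: int, p: float, terms: int = 2000) -> float:
    """Truncated sum_{k>=1} (1 - (1 - p^k)^m), i.e. S(m, p)."""
    return sum(-math.expm1(m * math.log1p(-(p ** k))) for k in range(1, terms + 1))


def leading_terms(m: int, p: float) -> float:
    """-1/2 - (ln m + gamma) / ln p: H_m replaced by ln m + gamma, oscillation dropped."""
    return -0.5 - (math.log(m) + EULER_GAMMA) / math.log(p)


def ratio_gap(m: int, p: float) -> float:
    """Distance between S(m, p) / ln m and its claimed limit -1 / ln p."""
    return abs(direct_sum(m, p) / math.log(m) + 1.0 / math.log(p))
```

With $p=0.5$, `ratio_gap` shrinks only like $1/\ln m$, so the ratio approaches its limit slowly, whereas `direct_sum(m, p) - leading_terms(m, p)` is already tiny for moderate $m$.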

metamorphy