19

I got this sum, in some work related to another question:

$$S_m=\sum_{k=1}^m \frac{1}{k}{m \choose k} $$

Are there any known results about this (bounds, asymptotics)?

leonbloy
  • 63,430

7 Answers

21

You know that $\displaystyle (x + 1)^m = \sum_{k=0}^m {m \choose k} x^k$. So $$\int_0^1 \frac{(x + 1)^m - 1}{x} \, dx = \sum_{k=1}^m {m \choose k} \frac{1}{k}.$$

Letting $y = x + 1$ this is just $$\int_1^2 \frac{y^m - 1}{y - 1} \, dy = \sum_{k=1}^m \frac{2^k - 1}{k}.$$
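As a quick sanity check of this identity (an editorial aside, not part of the argument), one can verify both sides with exact rational arithmetic in Python:

```python
from fractions import Fraction
from math import comb

def S(m):
    # left-hand side: sum_{k=1}^m C(m,k)/k, computed exactly
    return sum(Fraction(comb(m, k), k) for k in range(1, m + 1))

def S_alt(m):
    # right-hand side: sum_{k=1}^m (2^k - 1)/k
    return sum(Fraction(2**k - 1, k) for k in range(1, m + 1))

# the two expressions agree exactly for every m
assert all(S(m) == S_alt(m) for m in range(1, 21))
```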

The contribution of the $-1$ terms is $-H_m \sim - \log m$, so let's concentrate on estimating $$T_m = \sum_{k=1}^m \frac{2^k}{k}.$$

There is an obvious lower bound $$T_m \ge \sum_{k=1}^m \frac{2^k}{m} = \frac{2^{m+1} - 2}{m}.$$

To get an upper bound, we'll split the sum as $$\sum_{k=1}^{m-r} \frac{2^k}{k} + \sum_{k=m-r+1}^m \frac{2^k}{k}$$

for some $r$ (depending on $m$, although we do not specify the dependence for now). (Thanks to leonbloy for improvements in the part of the argument that follows!) Noting that $f(k) = \frac{2^k}{k}$ is an increasing function of $k$, the first part is bounded above by $(m-r) \frac{2^{m-r}}{m-r} = 2^{m-r}$ while the second is bounded above by $$\sum_{k=m-r+1}^m \frac{2^k}{m-r+1} \le \frac{1}{m-r+1} \sum_{k=0}^m 2^k < \frac{2^{m+1}}{m-r+1}.$$ The first part gets smaller as $r$ increases while the second gets larger; to minimize their sum it is generally a good idea to make the two parts about as equal as possible. Thus we want $2^{m-r} \approx \frac{2^{m+1}}{m-r+1}$, or $$(m-r+1) \approx 2^{r+1}.$$

This gives $r \approx \log_2 m$. Setting $r = (1 + \epsilon) \log_2 m$ for some $\epsilon > 0$ makes the first part negligible relative to the second without really increasing the second and together with the lower bound gives an asymptotic $$T_m \sim \frac{2^{m+1}}{m}.$$
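A small exact computation (again an editorial aside) suggests that the ratio $T_m \,m / 2^{m+1}$ does tend to $1$:

```python
from fractions import Fraction

def T(m):
    # T_m = sum_{k=1}^m 2^k / k, computed exactly
    return sum(Fraction(2**k, k) for k in range(1, m + 1))

# the ratio T_m / (2^(m+1)/m) should decrease toward 1 as m grows
for m in (10, 100, 1000):
    ratio = T(m) * m / Fraction(2**(m + 1))
    assert ratio > 1   # T_m exceeds 2^(m+1)/m for these m
```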

Edited by leonbloy: With respect to the upper bound, we have, for any $r = 1,\ldots,m$:

$$T_m \le 2^{m-r} + \frac{2^{m+1}}{m-r+1} = 2^m \left(2^{-r} + \frac{2}{m-r+1}\right)$$

The expression in parentheses, thought of as a (continuous) function of $r$, has a global minimum at $ 2^{r+1} = (m+1-r)^2 \log(2)$, which for large $m$ gives $r\approx 2 \log_2(m)$. Inspired by this, we can choose $r=\lceil 2 \log_2(m) \rceil$, and hence bound the two terms:

$$r \ge 2 \log_2(m) \Rightarrow 2^{-r} \le \frac{1}{m^2}$$

$$r \le 2 \log_2(m) +1 \Rightarrow \frac{2}{m-r+1} \le \frac{2}{m-2 \log_2(m) }$$

So, finally we have the bounds:

$$ \frac{2}{m}(1- 2^{-m}) \le \frac{T_m}{2^{m}} \le \frac{2}{m-2 \log_2(m) } +\frac{1}{m^2} $$

which agrees with the asymptotic obtained above.
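As a numerical sanity check (not part of the argument), one can verify the derived lower bound $(2^{m+1}-2)/m$ and the upper bound above over a range of $m$:

```python
from fractions import Fraction
from math import log2

def T(m):
    # T_m = sum_{k=1}^m 2^k / k, computed exactly
    return sum(Fraction(2**k, k) for k in range(1, m + 1))

for m in range(8, 200):
    t = T(m) / Fraction(2**m)                    # T_m / 2^m as an exact rational
    lower = Fraction(2**(m + 1) - 2, m * 2**m)   # from T_m >= (2^(m+1) - 2)/m
    upper = 2 / (m - 2 * log2(m)) + 1 / m**2     # the upper bound above
    assert lower <= t < upper
```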

leonbloy
  • 63,430
Qiaochu Yuan
  • 419,620
  • 2
    I'm not sure how you got the bounds (two parts). I get that ("natural" for me) bounds are $ 2^{m-r}$ (quite more tight) and $2^{m+1}/(m+1-r)$. Could you pls check your derivation? – leonbloy Jun 05 '12 at 14:05
  • @leon: I took upper bounds for the terms in each part and multiplied by the number of terms, but you're right that the bounds can be improved substantially. And this removes the logarithmic factor! – Qiaochu Yuan Jun 05 '12 at 16:10
  • that's right. But still I object "to minimize their sum it is generally a good idea"; I think that applies when one has something like $a r + b/r$, but not here, I think I could improve that, but it would be lengthy for a comment - it's ok to edit (append) this answer or should I create my own? – leonbloy Jun 05 '12 at 19:03
  • @leon: I think your own answer would be appropriate. I have some separate improvements to make to this answer anyway. – Qiaochu Yuan Jun 05 '12 at 19:33
  • oops too late... well, perhaps you'd prefer to mix my edit in your answer, do as you like. – leonbloy Jun 05 '12 at 20:00
10

I was following a path similar to @QiaochuYuan's, but he beat me to it!

Let's try another approach. If $m$ is large, we can use the de Moivre-Laplace theorem. Then $$\begin{equation*} S_m \simeq 2^m \int_1^\infty dk\, \frac{1}{\sqrt{2\pi}\sigma} e^{-(k-\mu)^2/(2\sigma^2)} \frac{1}{k}, \tag{1} \end{equation*}$$ where $\mu = m/2$ and $\sigma^2 = m/4$. (We have $p=q=1/2$.) The integral is dominated by $k\simeq \mu$. Therefore, $$\begin{eqnarray*} S_m &\simeq& 2^m \frac{2}{m} \int_1^\infty dk\, \frac{1}{\sqrt{2\pi}\sigma} e^{-(k-\mu)^2/(2\sigma^2)} \\ &\simeq& \frac{2^{m+1}}{m} \int_{-\infty}^\infty dk\, \frac{1}{\sqrt{2\pi}\sigma} e^{-(k-\mu)^2/(2\sigma^2)} \end{eqnarray*}$$ Thus, $$\begin{equation*} S_m \simeq \frac{2^{m+1}}{m}.\tag{2} \end{equation*}$$ This gives a good numerical approximation to the sum for large $m$.

We can get a better approximation by applying the saddle point method to (1). We find $$\begin{equation*} S_m \simeq \frac{2^{m+1}}{m} \left(1+\frac{1}{m} + O\left(\frac{1}{m^2}\right)\right).\tag{3} \end{equation*}$$

For $m=100$, (2) and (3) agree with the sum to $1\%$ and $0.03\%$, respectively.
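An exact check of the quoted accuracies (editorial aside; the exact sum is computed with rational arithmetic):

```python
from fractions import Fraction
from math import comb

m = 100
S = sum(Fraction(comb(m, k), k) for k in range(1, m + 1))   # exact sum

approx2 = Fraction(2**(m + 1), m)            # leading-order estimate (2)
approx3 = approx2 * (1 + Fraction(1, m))     # with the 1/m correction (3)

err2 = abs(1 - float(approx2 / S))   # relative error of (2), about 1%
err3 = abs(1 - float(approx3 / S))   # relative error of (3), about 0.03%
assert err2 < 0.011
assert err3 < 0.0004
```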

leonbloy
  • 63,430
user26872
  • 19,465
  • +1 The second order approximation by saddle point method coincides with what I got approximating $E(g(X))$ via a Taylor expansion around the mean, with $g(x)=1/x$ – leonbloy Jun 05 '12 at 15:35
  • @leonbloy: That's another way to do it. Cheers! – user26872 Jun 05 '12 at 18:21
5

We will make use of Abel summation (summation by parts) to get the asymptotic. The final asymptotic is $$\sum_{k=1}^{m} \dfrac1k\dbinom{m}{k} = \dfrac{2^{m+1}}{m} \left(\sum_{n=0}^{N} \dfrac{2^n \Gamma(n+1/2)}{m^n \sqrt{\pi}} \right) + \mathcal{O} \left( \dfrac{2^{m+1}}{m^{N+2}}\right)$$

Let us denote $\displaystyle \sum_{k=1}^{m} \dfrac1k\dbinom{m}{k}$ as $S$, i.e. $$S = \sum_{k=1}^{m} \dfrac1k\dbinom{m}{k}$$ Let $f(k) = \dfrac1k$ and $A_m(k) = \displaystyle\sum_{n=1}^{k} \dbinom{m}{n}$. Hence, $$S = \sum_{k=1}^{m} f(k) \left( A_m(k) - A_m(k-1)\right) = \int_1^m f(t) dA_m(t) = \left. f(t)A_m(t) \right \rvert_{1}^m - \int_1^m A_m(t) df(t)\\ =\dfrac{2^m-1}{m} - m + \int_1^m \dfrac{A_m(t)}{t^2} dt$$

Now for large $m$, by the central limit theorem, $$A_m(t) \sim \dfrac{2^m}{\sqrt{2 \pi \dfrac{m}4}} \int_1^t \exp \left( - \dfrac{(x-m/2)^2}{m/2}\right) dx = \dfrac{2^{m+1}}{\sqrt{2 \pi m}} \int_1^t \exp \left( - \dfrac{(x-m/2)^2}{m/2}\right) dx$$

Hence, we want an estimate for the integral $$I = \int_1^m \dfrac1{t^2} \int_1^t \exp \left( - \dfrac{(x-m/2)^2}{m/2}\right) dx \, dt$$ Interchanging the order of integration, $$I = \int_1^m \int_x^m \dfrac1{t^2} \exp \left( - \dfrac{(x-m/2)^2}{m/2}\right) dt \, dx= \int_1^m \left( \dfrac1x - \dfrac1m \right) \exp \left( - \dfrac{(x-m/2)^2}{m/2}\right) dx$$

If we let $\dfrac{x-m/2}{\sqrt{m/2}} = y$ and $s = \sqrt{m/2}$, then $$I = \displaystyle\int_{-s + 1/s}^{s} \left( \dfrac1{s^2 + ys} - \dfrac1{2s^2}\right) \exp(-y^2) s \, dy = \displaystyle\int_{-s + 1/s}^{s} \left( \dfrac1{s+y} - \dfrac1{2s}\right) \exp(-y^2) dy$$ $$I = \dfrac1s \displaystyle\int_{-s + 1/s}^{s} \left( \dfrac1{1+y/s} - \dfrac12\right) \exp(-y^2) dy = \dfrac1s \displaystyle\int_{-s + 1/s}^{s} \left( \dfrac12 - \dfrac{y}{s} + \dfrac{y^2}{s^2} - \dfrac{y^3}{s^3} + \dfrac{y^4}{s^4} \mp \cdots \right) \exp(-y^2) dy$$

For large $s$, $$\displaystyle \int_{-s+1/s}^s \exp(-y^2) dy = \sqrt{\pi} + \text{ some exponentially decaying error term}$$ $$\displaystyle \int_{-s+1/s}^s y \exp(-y^2) dy = 0 + \text{ some exponentially decaying error term}$$ $$\displaystyle \int_{-s+1/s}^s y^2 \exp(-y^2) dy = \dfrac{\sqrt{\pi}}{2} + \text{ some exponentially decaying error term}$$ $$\displaystyle \int_{-s+1/s}^s y^3 \exp(-y^2) dy = 0 + \text{ some exponentially decaying error term}$$ $$\displaystyle \int_{-s+1/s}^s y^4 \exp(-y^2) dy = \dfrac{3\sqrt{\pi}}{4} + \text{ some exponentially decaying error term}$$

Hence, $$I = \dfrac1s \left( \dfrac{\sqrt{\pi}}{2} + \dfrac{\sqrt{\pi}}{2s^2} + \dfrac{3 \sqrt{\pi}}{4s^4} + \mathcal{O} \left( \dfrac1{s^6} \right)\right)$$

Putting these together, we get $$S = \dfrac{2^m-1}{m} - m + \dfrac{2^{m+1}}{\sqrt{2 \pi m}} \times \left( \dfrac{\sqrt{2}}{\sqrt{m}} \left(\dfrac{\sqrt{\pi}}{2} + \dfrac{\sqrt{\pi}}{m} + \dfrac{3 \sqrt{\pi}}{m^2} + \mathcal{O} \left( \dfrac1{m^3} \right) \right)\right)$$ that is, $$S = \dfrac{2^m-1}{m} - m + \dfrac{2^{m+1}}{m} \times \left( \dfrac12 + \dfrac1m + \dfrac3{m^2} + \mathcal{O} \left( \dfrac1{m^3} \right) \right)$$ and hence (the terms $-\frac1m$ and $-m$ are exponentially negligible) $$S = \dfrac{2^{m+1}}{m} \left( 1 + \dfrac1m + \dfrac3{m^2} + \mathcal{O} \left(\dfrac1{m^3} \right) \right)$$

Extending this, we see that $$S = \dfrac{2^{m+1}}{m} \left(\sum_{n=0}^{N} \dfrac{2^n \Gamma(n+1/2)}{m^n \sqrt{\pi}} \right) + \mathcal{O} \left( \dfrac{2^{m+1}}{m^{N+2}}\right)$$ i.e., writing out the first few terms, $$S = \dfrac{2^{m+1}}{m} \left( 1 + \dfrac1m + \dfrac3{m^2} + \dfrac{15}{m^3} + \dfrac{105}{m^4} + \cdots \right)$$

One advantage of this approach is that it is easy to compute higher-order asymptotics. The saddle point method used by oenamen can also be extended to obtain better asymptotics.

  • Thanks. This seems very similar to the approach I mentioned to oenamen in the comment. However, when you mention "better error bounds", are you meaning better orders of asymptotic approximation? I don't see any real bounds here. – leonbloy Jun 05 '12 at 22:04
  • Yes I mean better orders of asymptotic approximations like $\mathcal{O}(1/m^4)$, $\mathcal{O}(1/m^6)$ etc. –  Jun 05 '12 at 22:04
4

Consider the following random experiment. First, one chooses a nonempty subset $X$ of $\{1,2,\ldots,m\}$, each such subset with equal probability. Then, one uniformly randomly selects an element $n$ of $X$. The event of interest is $n=\max(X)$.

Fix $k\in\{1,2,\ldots,m\}$. The probability that $|X|=k$ is $\frac{1}{2^m-1}\,\binom{m}{k}$. Given such an $X$ with $|X|=k$, the probability that the selected element is the maximum of $X$ is $\frac{1}{k}$. Consequently, the probability of the desired event is $$\sum_{k=1}^m\,\left(\frac{1}{2^m-1}\,\binom{m}{k}\right)\,\left(\frac{1}{k}\right)=\frac{S_m}{2^m-1}\,.$$

Now, consider a fixed element $n\in\{1,2,\ldots,m\}$. There are $\binom{n-1}{j-1}$ subsets $X$ of $\{1,2,\ldots,m\}$ such that $n=\max(X)$ and $|X|=j$. The probability of getting such an $X$ is $\frac{\binom{n-1}{j-1}}{2^m-1}$, and given such an $X$, the probability that $n$ is the selected element is $\frac{1}{j}$. That is, the probability that the selected element is $n=\max(X)$ equals $$\begin{align} \sum_{j=1}^{n}\,\left(\frac{\binom{n-1}{j-1}}{2^m-1}\right)\,\left(\frac{1}{j}\right) &=\frac{1}{2^m-1}\,\sum_{j=1}^n\,\frac1j\,\binom{n-1}{j-1}=\frac{1}{2^m-1}\,\left(\frac{1}{n}\,\sum_{j=1}^n\,\frac{n}{j}\,\binom{n-1}{j-1}\right) \\ &=\frac{1}{2^m-1}\,\left(\frac{1}{n}\,\sum_{j=1}^n\,\binom{n}{j}\right)=\frac{1}{2^m-1}\left(\frac{2^n-1}{n}\right)\,. \end{align}$$

Finally, it follows that $\frac{S_m}{2^m-1}=\sum_{n=1}^m\,\frac{1}{2^m-1}\left(\frac{2^n-1}{n}\right)$. Hence, $$\sum_{k=1}^m\,\frac{1}{k}\,\binom{m}{k}=S_m=\sum_{n=1}^m\,\left(\frac{2^n-1}{n}\right)\,.$$ It can then be shown by induction on $m>3$ that $\frac{2^{m+1}}{m}<S_m<\frac{2^{m+1}}{m}\left(1+\frac{2}{m}\right)$.
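The final sandwich bounds can be verified numerically for moderate $m$ (an editorial aside, using exact rational arithmetic):

```python
from fractions import Fraction
from math import comb

def S(m):
    # exact value of S_m = sum_{k=1}^m C(m,k)/k
    return sum(Fraction(comb(m, k), k) for k in range(1, m + 1))

# check 2^(m+1)/m < S_m < (2^(m+1)/m)(1 + 2/m) for m > 3
for m in range(4, 80):
    lo = Fraction(2**(m + 1), m)
    hi = lo * (1 + Fraction(2, m))
    assert lo < S(m) < hi
```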

Batominovski
  • 49,629
3

A rather simple and crude bound is $$\sum_{k=1}^m\frac{1}{k}\binom{m}{k}\leq\sum_{k=1}^m\binom{m}{k}<\sum_{k=0}^m\binom{m}{k}=2^m$$
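Checked numerically (editorial aside):

```python
from fractions import Fraction
from math import comb

# S_m <= sum_k C(m,k) < 2^m, so S_m < 2^m for every m
for m in range(1, 30):
    S = sum(Fraction(comb(m, k), k) for k in range(1, m + 1))
    assert S < 2**m
```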

DonAntonio
  • 211,718
0

Is there a "story" about $S_{m} = \sum_{k=1}^{m} \binom{m}{k}/k$ that would get to $\frac{2^{m+1}}{m}$ faster? For example, maybe the 2 expressions are approximately equivalent ways of counting something about the subsets of a set of cardinal $m$ as $m$ gets large. I'm struck by the fact that $\binom{m}{k}$ is the number of subsets of size $k$ of a set of size $m$, and $2^{m}$ is the number of subsets of a set of size $m$.

Frank
  • 1,837
0

An exact answer is readily obtainable for the slightly modified problem:

$$\int_{0}^1 (x + 1)^m \, dx = \int_{0}^1 \sum_{k=0}^m {m \choose k} x^k \, dx = \sum_{k=0}^m {m \choose k} \frac{1}{k+1} \implies$$

$$ \implies S_1 = \sum_{k=1}^m {m \choose k} \frac{1}{k+1} = \frac{2^{m+1}-1}{m+1} -1 \tag 1$$

The terms of this sum are very near those of the original, especially over the "region of interest" ($k\approx m/2$); hence the result should be asymptotically correct for our sum $S = \sum_{k=1}^m {m \choose k} \frac{1}{k}$.

To formalize this, note that

$$ \frac{1}{k} = \frac{1}{k+1}+\frac{1}{k(k+1)}$$

And recall this lemma: for a Binomial$(m,1/2)$ random variable $X$, let us divide its support into regions $A,B$ with $x\in A \iff x < m/4$. Let $g(x)$ be a function that has upper bounds $g_A$ and $g_B$ in each region, respectively. Then, since $P(X < m/4)\le \exp(-m/8)$ by the Chernoff–Hoeffding bound:

$$E[g(X)] \le \exp(-m/8)\, g_A + g_B$$

In our case, for $g(x)=\frac{1}{x(x+1)}$ (with $g(0)=0$), the bounds are $g_A=1/2$, $g_B=(4/m)^2$.

Then

$$ S_2= \sum_{k=1}^m {m \choose k}\frac{1}{k(k+1)} = 2^{m}\, E[ g(X) ] \le 2^{m} \left(\frac12 \exp(-m/8) + \frac{16}{m^2}\right) = \mathcal{O}\!\left(\frac{2^{m}}{m^2}\right)$$

and

$$ S= \sum_{k=1}^m {m \choose k} \frac{1}{k} = S_1 + S_2 = \frac{2^{m+1}}{m+1} + \mathcal{O}\!\left(\frac{2^{m}}{m^2}\right) =\frac{2^{m+1}}{m}\bigl(1 + O(1/m)\bigr)$$
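A quick exact check (editorial aside) of the closed form $(1)$ and of the decomposition $\frac1k = \frac1{k+1}+\frac1{k(k+1)}$:

```python
from fractions import Fraction
from math import comb

m = 50
S  = sum(Fraction(comb(m, k), k)           for k in range(1, m + 1))
S1 = sum(Fraction(comb(m, k), k + 1)       for k in range(1, m + 1))
S2 = sum(Fraction(comb(m, k), k * (k + 1)) for k in range(1, m + 1))

assert S1 == Fraction(2**(m + 1) - 1, m + 1) - 1   # closed form (1)
assert S == S1 + S2               # since 1/k = 1/(k+1) + 1/(k(k+1))
assert S2 / S < Fraction(1, 10)   # the correction S2 is relatively small
```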

leonbloy
  • 63,430