
In this video, the YouTuber derives Stirling's approximation by applying the following argument to $f(x) = \log(x)$:

$$ \int_{1}^{q} f(x) \, dx \approx \lim_{ n \to \infty} \sum_{k=1}^n f\left( 1 + k \frac{q-1}{n} \right) \frac{q-1}{n}$$

Now, for large intervals, that is, when $q$ is a very large number (the person in the video doesn't state how large), we have

$$ \lim_{n \to \infty} \frac{q-1}{n} \approx 1$$

And hence,

$$ \int_1^q f(x) \, dx \approx \lim_{ n \to \infty} \sum_{k=1}^n f( 1 + k ) $$

Now, this means that

$$ \int_1^q f(x) \, dx \approx f(2) + f(3) + f(4)+\cdots+ f(1+n)$$

This is an almost unbelievable formula to me... It is so crazy! Is this method generalizable to find approximate forms for other functions? What other applications would it have?

On rewatching the video, this shouldn't work for finding the Stirling approximation from $\ln(x)$: at 3:47 of the video he says $ \sum_{k=1}^{k=n} \ln (k) = \sum_{k=1}^{k=n} \ln (1+k)$

That step seems rather 'cursed'. Still, I know this kind of formula is a real thing, because I've seen the wiki page on the Euler-Maclaurin formula, but I'm not sure how to apply it here.


My questions:

  1. How big should 'q' be for this to work?
  2. Are there other examples of deriving approximation formulas using this?
  3. On the page for the Euler-Maclaurin formula I saw something similar to this, but I don't understand this 'Bernoulli number' thing. Can someone explain that?
  4. If someone has time to check the video, why does the reindexing trick work?

Note:

$ \approx$ means approximately

  • what is this Bernoulli coefficient stuff? I found it a bit hard to comprehend everything – tryst with freedom Jul 17 '20 at 21:49
  • Note we of course need $\lim_{x \to +\infty} f(x) = 0$. – aschepler Jul 17 '20 at 22:09
  • why? do we need that to be true – tryst with freedom Jul 17 '20 at 22:10
  • Oh I see otherwise it becomes non - finite – tryst with freedom Jul 17 '20 at 22:10
  • wait, then why did it work for $\log(x)$? $\lim_{x \to \infty} \log(x)$ is nonzero – tryst with freedom Jul 17 '20 at 22:11
  • $\sum_{k=1}^{k=n}\log x$ looks strange, as it's a sum over $k$ of a summand with no $k$ in it. If that's really what's meant, then it's just $n\log x$, and the right side of that equation is $n\log(1+x)$, and those are certainly not equal. Better have another look at that video! – Gerry Myerson Jul 18 '20 at 09:09
  • oh no I messed up the index of sum – tryst with freedom Jul 18 '20 at 09:13
  • Also, $\int_1^n f(x)\,dx$ is a function of $n$, while $\lim_{n\to\infty}\sum_{k=1}^nf(a+k)$ is a function of $a$, so that looks pretty suspicious. – Gerry Myerson Jul 18 '20 at 09:13
  • thank you for pointing out my mistake – tryst with freedom Jul 18 '20 at 09:15
  • Your question is all over the place, so it's going to be very hard to figure out what you actually want to know. Some other objections: 1) you use that wavy equals sign, without ever saying what it means to you. 2) your first (wavy) equation has the unexplained symbols $a,b$ in it, and the left side is a function of $n$, while the right side isn't. 3) you talk about "large intervals" but you never define any intervals, and it's not clear what any large interval has to do with wavy equality between $b-a$ and $n$. Continued. – Gerry Myerson Jul 18 '20 at 12:47
  • Continued. You'll have to do a much better job of explaining yourself, if you want to get any kind of useful response to your post. – Gerry Myerson Jul 18 '20 at 12:48
  • I tried fixing it and noted your comments – tryst with freedom Jul 18 '20 at 12:53
  • $a$ is still unexplained, intervals is still unexplained, why $(q-1)/n$ should be approximately $1$ is unexplained, a function of $q$ is said to be approximately equal to something that's independent of $q$, no conditions on $f$ are given, I give up. You don't have the vaguest idea of what you want to ask. You have to sit down with someone and have a long talk and maybe if she's bright & patient she'll be able to help you formulate a comprehensible question. Come back when you've done that. – Gerry Myerson Jul 18 '20 at 13:10
  • Oh, I was speaking from the context of the video; I'll try to include more of the context I saw in it, I'm sorry. I do not know what conditions to add because I am asking about a technique which I saw in the video... what I want to ask for is more information about the method: when I can use it, and also the apparent 'mistake' in the argument, which fails to explain the derivation – tryst with freedom Jul 18 '20 at 13:12

1 Answer


To be consistent with the question, I'll use your notation rather than the notation in the video (which used $N$ where you use $q$ and $M$ where you used $n$).

The key to the claim about the limit of $\frac{q-1}{n}$ is that it holds when $n$ and $q$ "are both equally large". The idea is that $n$ depends on $q$ in some way such that $$ \lim_{n\to\infty} \frac qn = 1. $$ That's what it means for two numbers to be "equally large" in an asymptotic sense. When this is true, it is also true that $\lim_{n\to\infty} \frac {q-1}n = 1.$

Personally, I think this is a very silly way to go about it. Let's just cut to the chase: if we set $n = q-1$ every time, it provides the desired simplification of the formula. It also gives us a Riemann sum of the integral with $\Delta x=1.$ That is what the simplified formula calculates.

This particular choice of $n$ as a function of $q$ also satisfies the condition that $q$ and $n$ are asymptotically "equally large," for whatever that's worth.

Now we come to the choice of $x_k$. The graph in the video seems to be indicating that we choose each $x_k$ somewhere in the middle of its interval in the Riemann sum. But the actual formula for $x_k$ indicates that we are choosing the rightmost point in the interval.

That is, the given formula, $1 + k\left(\frac {q-1}n\right),$ simplifies to $1 + k$ if $n = q-1$, which means that $x_1 = 2$ (for the rectangle between $x=1$ and $x=2$), $x_2 = 3$ (for the rectangle between $x=2$ and $x=3$), and $x_n = q$ (for the rectangle between $x=q-1$ and $x=q$).

The Riemann sum then tells us that

$$ \int_1^q \ln (x)\,\mathrm dx \approx \sum_{k=1}^n \ln(1 + k). $$
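As a quick numerical sanity check (a minimal Python sketch, not something from the video; the function names are made up for illustration), one can compare this right-endpoint sum with unit-width rectangles, i.e. $n = q-1$, against the exact value of the integral, which is $q\ln(q) - q + 1$:

```python
import math

def right_riemann_sum_log(q: int) -> float:
    """Right-endpoint Riemann sum of ln(x) on [1, q] with unit-width rectangles."""
    n = q - 1  # number of rectangles when each one has width 1
    return sum(math.log(1 + k) for k in range(1, n + 1))

def exact_integral_log(q: int) -> float:
    """Exact integral of ln(x) from 1 to q, namely q ln(q) - q + 1."""
    return q * math.log(q) - q + 1

for q in (10, 100, 1000):
    s = right_riemann_sum_log(q)
    i = exact_integral_log(q)
    # The ratio drifts toward 1 as q grows, but the absolute gap does not shrink.
    print(f"q={q:5d}  sum={s:12.4f}  integral={i:12.4f}  ratio={s / i:.4f}")
```

The ratio of the two sides approaches $1$, which is the only sense in which "$\approx$" holds here; the difference between them keeps growing slowly with $q$.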

Next we have the remarkable claim that since $n$ and $q$ are "equally large," we can just change $n$ to $q$ for the upper index of the sum. In the case where $n = q-1$, that's saying the approximation is just as good if you add $\ln(1+q)$ to the right-hand side. But even if you say $n$ is not exactly $q-1$ but is just "equally large," it still is true that if $1$ is any kind of good approximation for $\frac{q-1}{n}$ (as required to make the video's argument work) then replacing $n$ with $q$ in the sum means that we are increasing the sum by approximately $\ln(1+n).$

The video also messes up the re-indexing, because in order to change $\ln(1+k)$ to $\ln(k)$ while keeping the indexing the same, you have to pretend that it doesn't matter whether the term $\ln(1+q)$ is included in the sum or not.

A better way to deal with the sum is to write

$$ \int_1^q \ln (x)\,\mathrm dx \approx \sum_{k=1}^{q-1} \ln(1 + k), $$

that is, since we must insist that $\frac{q-1}{n}$ is approximately $1$ to make this all work, let's just say it is exactly $1,$ that is, $n = q-1$ (as I proposed earlier), and therefore it is perfectly OK to replace something (in this case, $n$) with something exactly equal (in this case, $q-1$).

Now for the reindexing. For each value of $k,$ let $j = k + 1.$ Then as $k$ runs over the integers from $1$ to $q-1,$ $j$ runs over the integers from $2$ to $q$; and of course $\ln(1+k) = \ln(j).$ So we find that $$ \sum_{k=1}^{q-1} \ln(1 + k) = \sum_{j=2}^{q} \ln(j). $$

But what if we want $j$ to start at $1$ instead of $2$? That just means we have an extra term in the sum:

$$ \ln(1) + \sum_{j=2}^{q} \ln(j) = \sum_{j=1}^{q} \ln(j), $$

and since $\ln(1)=0,$ the left-hand side is just equal to $\sum_{j=2}^{q} \ln(j).$ Putting it all together,

$$ \sum_{k=1}^{q-1} \ln(1 + k) = \sum_{j=1}^{q} \ln(j). $$
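The reindexing can be checked directly with a short Python sketch (the function names are hypothetical, chosen only for this illustration). It also shows what goes wrong if the summand is shifted to $\ln(1+k)$ while the upper limit is carelessly left at $q$: the sum picks up an extra $\ln(1+q)$.

```python
import math

def shifted_sum(q: int) -> float:
    """Sum of ln(1 + k) for k = 1, ..., q - 1."""
    return sum(math.log(1 + k) for k in range(1, q))

def direct_sum(q: int) -> float:
    """Sum of ln(j) for j = 1, ..., q; the j = 1 term is ln(1) = 0."""
    return sum(math.log(j) for j in range(1, q + 1))

def careless_sum(q: int) -> float:
    """Sum of ln(1 + k) for k = 1, ..., q: summand shifted, upper limit not adjusted."""
    return sum(math.log(1 + k) for k in range(1, q + 1))

q = 50
print(shifted_sum(q), direct_sum(q))                # these two agree
print(careless_sum(q) - direct_sum(q), math.log(1 + q))  # the gap is ln(1 + q)
```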

Therefore

$$ \int_1^q \ln (x)\,\mathrm dx \approx \sum_{j=1}^{q} \ln(j). $$

Now we can simply rename the index variable from $j$ to $k$ in the sum, and you have the final approximation shown in the video, except that we got there without introducing one silly error and then introducing another error that just happens to cancel the first one.

Evaluating the integral exactly, we get

$$ \int_1^q \ln (x)\,\mathrm dx = q\ln(q) - q + 1 $$

(not $q\ln(q) - q$ as claimed in the video). So if we accept then that

$$ \int_1^q \ln (x)\,\mathrm dx \approx \sum_{k=1}^{q} \ln(k) = \ln(q!), $$

this says that $$ q\ln(q) - q + 1 \approx \ln(q!). $$

But Stirling's approximation is usually stated as an approximate formula for $q!,$ not $\ln(q!).$ Taking the exponential function of both sides, we get $$ q! \approx \frac{q^q e}{e^q} = q^q e^{-q+1}. \tag1$$

If we use the final approximation in the video, we get $$ q! \approx \frac{q^q}{e^q} = q^q e^{-q}. \tag2$$

Note that formula $(1)$ is $e$ times as large as formula $(2)$. Try comparing the values produced by these two formulas with the actual value of $q!$ for a few values of $q$ and see how good an approximation you think they give.
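If you would rather let a computer do that comparison, here is a minimal Python sketch (the names `approx_1` and `approx_2` are just labels for formulas $(1)$ and $(2)$, not anything standard). It prints how many times larger the true $q!$ is than each approximation.

```python
import math

def approx_1(q: int) -> float:
    """Formula (1): q^q * e^(-q + 1)."""
    return q**q * math.exp(-q + 1)

def approx_2(q: int) -> float:
    """Formula (2): q^q * e^(-q)."""
    return q**q * math.exp(-q)

for q in (5, 10, 20):
    exact = math.factorial(q)
    # A ratio of 1 would mean a perfect approximation.
    print(f"q={q:2d}  q!/(1)={exact / approx_1(q):8.3f}  q!/(2)={exact / approx_2(q):8.3f}")
```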


The formula that is usually given as Stirling's approximation is $$ q! \approx \left(\sqrt{2\pi q}\right) q^q e^{-q}. \tag3$$

Note that for any $q \geq 2,$ this formula is larger than either approximation $(1)$ or approximation $(2).$ In fact, it's about $0.922 \sqrt q$ times as large as approximation $(1)$, so approximation $(1)$ is actually not very good (it's off by a factor of $9$ for $q=100$) and approximation $(2)$ is even worse.
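The size of that factor is easy to confirm numerically. The following is only a sketch in Python with made-up function names; it keeps $q$ modest because `q**q` is converted to a float, which would overflow for much larger $q$.

```python
import math

def stirling(q: int) -> float:
    """Formula (3): sqrt(2*pi*q) * q^q * e^(-q)."""
    return math.sqrt(2 * math.pi * q) * q**q * math.exp(-q)

def approx_1(q: int) -> float:
    """Formula (1): q^q * e^(-q + 1)."""
    return q**q * math.exp(-q + 1)

for q in (2, 10, 100):
    ratio_3_to_1 = stirling(q) / approx_1(q)        # equals sqrt(2*pi*q) / e
    ratio_3_to_exact = stirling(q) / math.factorial(q)
    print(f"q={q:3d}  (3)/(1)={ratio_3_to_1:7.3f}  (3)/q!={ratio_3_to_exact:.5f}")
```

The first ratio is $\sqrt{2\pi q}/e$, which is where the constant $0.922 \approx \sqrt{2\pi}/e$ comes from; the second ratio shows how close $(3)$ itself is to $q!$.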

As it happens, $\ln x$ is an increasing function of $x,$ so the rectangles chosen for the Riemann sum in the video (where the upper right corner of the rectangle is on the curve) are all larger than the corresponding areas under the curve. The right-hand side is therefore an overestimate, never an underestimate, of the integral. Again, since the function is increasing, the error is bounded by $\Delta x$ times the total increase of the function, that is, the amount by which $\ln(q!)$ overestimates the integral is between $0$ and $\ln(q).$ The factor $\sqrt{2\pi q}$ comes from an approximation of that error.
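To see this error bound in numbers, here is a hedged Python sketch (function names are invented for the illustration). It uses `math.lgamma(q + 1)` for $\ln(q!)$ and compares the overestimate with $\tfrac12\ln(2\pi q) - 1$, which is the error term you get by taking logarithms in formula $(3)$ and subtracting $q\ln(q) - q + 1$.

```python
import math

def overestimate(q: int) -> float:
    """ln(q!) minus the exact integral q ln(q) - q + 1."""
    log_factorial = math.lgamma(q + 1)  # lgamma(q + 1) = ln(q!)
    integral = q * math.log(q) - q + 1
    return log_factorial - integral

def stirling_error_term(q: int) -> float:
    """Error predicted by formula (3): 0.5 * ln(2*pi*q) - 1."""
    return 0.5 * math.log(2 * math.pi * q) - 1

for q in (2, 10, 1000):
    err = overestimate(q)
    print(f"q={q:5d}  overestimate={err:.5f}  "
          f"predicted={stirling_error_term(q):.5f}  bound ln(q)={math.log(q):.5f}")
```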

David K