
I am trying to understand the importance of Ito's Lemma in Stochastic Calculus.

When I learn about some mathematical technique for the first time, I always like to ask questions such as: "Is this complicated approach truly necessary, and what happens if I were to incorrectly persist with a simpler approach? How much trouble could I land myself in by persisting with the incorrect, simpler approach? And are there situations where this is more of a problem than others?"

Part 1: For example, consider the following equation:

$$f(t, B_t) = X_t = \mu t + \sigma \log(B_t)$$

Where:

  • $X_t$ is the stochastic process
  • $\mu$ is the drift term
  • $\sigma$ is the volatility term
  • $B_t$ is a geometric Brownian motion.

When it comes to taking the derivative of this equation, there are 3 approaches that come to mind:

Approach 1: Basic Differencing

We can do this by simulating $X_t$ and evaluating consecutive differences: $$df(t, B_t) \approx f(t_i, B_{t_i}) - f(t_{i-1}, B_{t_{i-1}}) $$ or $$dX_t \approx X_{t_i} - X_{t_{i-1}} $$

Approach 2: Basic Calculus (Incorrect):

If this were a basic calculus derivative, for some generic function $f(x,y)$, I could use the chain rule to determine:

$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}$$

$$df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy$$

Thus, applying the logic of basic calculus incorrectly to stochastic calculus, I would incorrectly determine that:

$$f(t, B_t) = \mu t + \sigma \log(B_t)$$

$$df(t, B_t) = \frac{\partial}{\partial t} f(t,B_t) dt + \frac{\partial}{\partial B_t} f(t, B_t) dB_t$$

$$df(t, B_t) = \mu dt + \sigma \left(\frac{1}{B_t}\right) dB_t$$

Approach 3: Ito's Calculus (Correct):

Just to recap: for a non-stochastic function, the first-order difference quotient converges to the first derivative in the limit, since all higher-order terms in the Taylor expansion are negligible:

$$f(x + \Delta x) = f(x) + (\Delta x) f'(x) + \frac{(\Delta x)^2}{2} f''(x) + \cdots $$ $$(\Delta x) f'(x) = f(x + \Delta x) - f(x) - \frac{(\Delta x)^2}{2} f''(x) - \cdots $$ $$f'(x) = \frac{f(x + \Delta x) - f(x)}{\Delta x} - \frac{\Delta x}{2} f''(x) - \cdots$$

$$\lim_{\Delta x \to 0} f'(x) = \lim_{\Delta x \to 0} \left( \frac{f(x + \Delta x) - f(x)}{\Delta x} \right) - \lim_{\Delta x \to 0} \left(\frac{\Delta x}{2} f''(x)\right) - \cdots $$

$$f'(x) = f'(x) - 0,$$

so the identity is consistent: every term beyond the difference quotient vanishes.

However, this breaks down in the Taylor expansion of a stochastic function: we cannot write the corresponding expression

$$df = \left(\frac{dB_t}{dt} f'(B_t)\right) dt$$

Since the Brownian Motion is not smooth and not differentiable in any interval (i.e. if you zoom into a very small part, there is still "more Brownian Motion" happening). Thus, $dB_t$ is defined, but $\frac{dB_t}{dt}$ is not defined.
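To make this concrete, here is a short sketch of why the difference quotient blows up: since $\Delta B_t \sim N(0, \Delta t)$, its mean absolute value is $\sqrt{2\Delta t/\pi}$, so

$$E\left|\frac{\Delta B_t}{\Delta t}\right| = \frac{\sqrt{2\Delta t/\pi}}{\Delta t} = \sqrt{\frac{2}{\pi\,\Delta t}} \xrightarrow{\ \Delta t \to 0\ } \infty.$$

The increments are of order $\sqrt{\Delta t}$, which is too large for a derivative in $t$ to exist.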

In the non-stochastic case, we could have written the Taylor Series this way with all higher order terms dropping off:

$$\Delta f = f(x + \Delta x) - f(x) = (\Delta x) f'(x) + \frac{(\Delta x)^2}{2} f''(x) + ... $$

But, for some stochastic function of a Brownian Motion $f(B_t)$, if we were to write the same Taylor Series:

$$\Delta f = f(B_t + \Delta B_t) - f(B_t) = (\Delta B_t) f'(B_t) + \frac{(\Delta B_t)^2}{2} f''(B_t) + ...$$

We know that $\Delta B_t$ (i.e. the difference of two values of a Brownian motion) is a Wiener increment, i.e.

$$B_{t+s} - B_s = W_t \sim N(0, t)$$ $$\Delta B_t = B_{t+\Delta t} - B_t \sim N(0, \Delta t)$$ $$\text{Var}(\Delta B_t) = E(\Delta B_t^2) - \left[E(\Delta B_t)\right]^2 = E(\Delta B_t^2) - 0 = \Delta t $$ $$E(\Delta B_t^2) = \Delta t $$
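This expectation can be checked empirically. Here is a minimal sketch in R (independent of the simulation further below; the seed and sample size are arbitrary choices):

```r
# Empirical check that E[(Delta B_t)^2] = Delta t for Brownian increments.
set.seed(42)
n  <- 1e6
dt <- 0.01
dB <- rnorm(n, mean = 0, sd = sqrt(dt))  # increments Delta B_t ~ N(0, dt)
mean_sq <- mean(dB^2)                    # sample estimate of E[(Delta B_t)^2]
cat("mean(dB^2) =", mean_sq, "vs dt =", dt, "\n")
```

The sample mean of `dB^2` lands very close to `dt`, consistent with $E(\Delta B_t^2) = \Delta t$.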

Going back to the Taylor series of the stochastic function, we can now replace $(\Delta B_t)^2$ by its expected value $\Delta t$:

$$\Delta f = f(B_t + \Delta B_t) - f(B_t) = (\Delta B_t) f'(B_t) + \frac{\Delta t}{2} f''(B_t) + \cdots$$

Now (for a reason that I don't fully understand), the second-order term is no longer negligible. Using this information, we can now formally write Ito's Lemma as:

$$df(t, B_t) = \frac{\partial f}{\partial t} dt + \frac{\partial f}{\partial B_t} dB_t + \frac{1}{2} \frac{\partial^2 f}{\partial B_t^2} (dB_t)^2$$
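One heuristic for why the second-order term survives (a sketch, not a proof): split $[0, t]$ into $n = t/\Delta t$ steps. For a smooth path, the squared increments satisfy $(\Delta x_i)^2 = O(\Delta t^2)$, so their sum is $O(\Delta t) \to 0$. For Brownian motion, each $(\Delta B_{t_i})^2 \approx \Delta t$, so

$$\sum_{i=1}^{n} (\Delta B_{t_i})^2 \approx n \, \Delta t = t \neq 0.$$

This non-vanishing quadratic variation is exactly what feeds the $\frac{1}{2} f''$ term in Ito's Lemma.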

Now, going back to our original problem, for the function $f(t, B_t) = X_t = \mu t + \sigma \log(B_t)$, we can write:

  • $\frac{\partial f}{\partial t} = \mu$
  • $\frac{\partial f}{\partial B_t} = \frac{\sigma}{B_t}$
  • $\frac{\partial^2 f}{\partial B_t^2} = -\frac{\sigma}{B_t^2}$

Thus, the final answer is:

$$df(t, B_t) = \mu\, dt + \frac{\sigma}{B_t}\, dB_t - \frac{1}{2} \frac{\sigma}{B_t^2}\, dt$$

Note: here I used $\text{Var}(B_t) = E(B_t^2) - [E(B_t)]^2 = t - 0^2 = t$, i.e. I treated $(dB_t)^2$ as $dt$.

Part 2: Looking at Part 1 (provided I have done everything correctly), it is not immediately clear to me what are the "dangers" of incorrectly calculating the derivative (i.e. how much more "wrong" would the incorrect approach be compared to the correct approach). To get a better understanding of this, I tried to make a computer simulation to look at this (using the R programming language):

library(ggplot2)

set.seed(123)

# parameters
n <- 1000
dt <- 0.01
mu <- 0.05
sigma <- 0.2
t <- seq(0, (n - 1) * dt, dt)
Bt <- exp(cumsum(rnorm(n, 0, sqrt(dt))))
Xt <- mu * t + sigma * log(Bt)

# Original stochastic process: Xt = mu*t + sigma*log(Bt)
p1 <- ggplot(data.frame(Time = t, Xt = Xt), aes(Time, Xt)) + geom_line() + labs(title = paste("Original Equation: Xt = mu*t + sigma*log(Bt)\nmu =", mu, ", sigma =", sigma)) + theme_bw()

Approach 1: Basic Differencing

# dXt_1 = Xt[i+1] - Xt[i]
dXt_1 <- c(0, diff(Xt))
p2 <- ggplot(data.frame(Time = t, dXt_1 = dXt_1), aes(Time, dXt_1)) + geom_line() + labs(title = paste("Approach 1: Basic Differencing\nXt[i+1] - Xt[i]\nmu =", mu, ", sigma =", sigma)) + theme_bw()

Approach 2: Incorrect Derivative using basic calculus

# dXt_2 = mu*dt + sigma*(1/Bt)*dBt
dBt <- c(0, diff(Bt))
dXt_2 <- mu * dt + sigma * (1 / Bt) * dBt
p3 <- ggplot(data.frame(Time = t, dXt_2 = dXt_2), aes(Time, dXt_2)) + geom_line() + labs(title = paste("Approach 2: Incorrect Derivative\nmu*dt + sigma*(1/Bt)*dBt\nmu =", mu, ", sigma =", sigma)) + theme_bw()

Approach 3: Correct Derivative using Ito's Lemma

# dXt_3 = mu*dt + (sigma/Bt)*dBt - 0.5*(sigma/(Bt^2))*dt
dXt_3 <- mu * dt + (sigma / Bt) * dBt - 0.5 * (sigma / (t)) * dt
p4 <- ggplot(data.frame(Time = t, dXt_3 = dXt_3), aes(Time, dXt_3)) + geom_line() + labs(title = paste("Approach 3: Correct Derivative using Ito's Lemma\nmu*dt + (sigma/Bt)*dBt - 0.5*(sigma/(t))*dt\nmu =", mu, ", sigma =", sigma)) + theme_bw()

[Figure: plots p1 through p4 of the original process and the three approaches]

In the above graphs, the results of all 3 approaches look quite similar to one another.

I then compared the absolute differences between Approach 1 and Approach 3, and between Approach 2 and Approach 3:

abs_diff_1_3 <- abs(dXt_1 - dXt_3)
abs_diff_2_3 <- abs(dXt_2 - dXt_3)

p5 <- ggplot(data.frame(Time = t, AbsDiff = abs_diff_1_3), aes(Time, AbsDiff)) + geom_line() + labs(title = "Absolute Difference between Approach 1 and 3") + theme_bw()

p6 <- ggplot(data.frame(Time = t, AbsDiff = abs_diff_2_3), aes(Time, AbsDiff)) + geom_line() + labs(title = "Absolute Difference between Approach 2 and 3") + theme_bw()

[Figure: absolute-difference plots p5 and p6]

Again, all results look quite similar to one another.

My Question: Based on this exercise, it seems like whether you take the correct derivative (via Ito's Lemma) or the incorrect derivative (basic calculus), the final answer looks very similar. Thus, is Ito's Lemma more of a theoretical consideration with little added value compared to the incorrect derivative? Or perhaps there are much bigger differences for the derivatives of other functions (and perhaps for stochastic integrals)?

Thus, for stochastic functions, is there any "real danger" in incorrectly calculating their derivatives and integrals using basic calculus methods?

Thanks!

Note: The one thing that comes to mind is perhaps Ito Calculus is really needed when you need to take the derivative or integral of a stochastic function in an intermediate step for some math problem (e.g. first passage time). In such cases, perhaps the "danger" of propagating an incorrect result is much higher compared to these basic simulations.

stats_noob
  • Interesting [+1]. There is - still - a different approach: see the third example here. – Jean Marie Jan 03 '24 at 08:33
  • Can you integrate each derivative and see how close they match the original, using the same sequence of Bs? – Mark Jan 03 '24 at 08:43
  • Also, instead of log Bt, maybe exponential Bt would be a better example, since the correction term gets magnified as Bt grows. – Mark Jan 03 '24 at 08:46
  • Why is $(dB_t)^2\neq0$? – Bob Dobbs Jan 03 '24 at 09:23
  • "Thus, is Ito's Lemma more of a theoretical consideration with little added value compared to the incorrect derivative?" What nonsense! Using a numerical program you can only look at discrete time steps, and when something looks "similar" it does not mean it is the same. I cannot believe that this post got 10 upvotes. – Kurt G. Jan 03 '24 at 09:34
  • @KurtG. The value of this post is in its questioning: "When I learn about some mathematical technique for the first time, I always like to ask questions such as: 'Is this complicated approach truly necessary - and what happens if I were to incorrectly persist with a simpler approach?...'" from somebody who surely does not have your knowledge but who has the enthusiasm of youth and is hopefully more or less self-taught (never forget that a certain proportion of people here belong to this category). Besides, using (OK, thoroughly) numerical programs even in such a context isn't a heresy. – Jean Marie Jan 04 '24 at 23:27
  • @JeanMarie There is nothing wrong with enthusiasm, and I at least try not to blame OPs for not having all the knowledge. Exactly the latter should, however, lead this OP to choose a more modest formulation about the added value of Ito's lemma, in particular when seemingly erroneous code was used to reach that conclusion. – Kurt G. Jan 05 '24 at 11:03

1 Answer


There are a few issues at hand.

First, your code doesn't match your post: where you wrote

dXt_3 <- mu * dt + (sigma / Bt) * dBt - 0.5 * (sigma / (t)) * dt

I think you probably meant

dXt_3 <- mu * dt + (sigma / Bt) * dBt - 0.5 * (sigma / (Bt ^ 2)) * dt.

In which case the absolute difference between the two can be drastic. (Run it yourself!)

More importantly, at the top you declare that $B_t$ is a geometric Brownian motion, and in the code you do in fact approximate $B_t \approx \exp(W_t)$ for some standard Brownian motion $W_t$. However, in the Ito calculus section, you seem to compute $(dB_t)^2 = dt$ as if $B_t$ were standard BM as well.

I'm going to assume that you meant that $B_t = \exp(W_t)$ for a standard BM $W_t$ (as in the simulation), in which case $$ X_t = \mu t + \sigma W_t $$ and both "standard calculus" and the Ito lemma give $$ dX_t = \mu\, dt + \sigma\, dW_t $$ since the second derivative of $x \mapsto x$ vanishes. (As an aside, if you want to understand where the Ito lemma picks up the second derivative, it is because $\lim_{n \to \infty} \sum_{i=0}^{n-1} (W_{(i + 1)/n} - W_{i/n})^2 \neq 0$; now look at the Taylor expansion of $f(t, W_t)$.)

In fact, $B_t = \exp(W_t)$ satisfies the SDE $$ dB_t = \frac{B_t}{2}\,dt + B_t\,dW_t $$ so you can check that if you substitute this alongside $(dB_t)^2 = B_t^2\,dt$ (do you know how to compute this quadratic variation from the above SDE?) into your original formula $$ dX_t = \mu\, dt + \frac{\sigma}{B_t}\, dB_t - \frac{\sigma}{2B_t^2}\,(dB_t)^2 $$ one does in fact get $$ dX_t = \mu\, dt + \sigma\, dW_t $$ as expected, so in this case the Ito lemma and standard calculus agree once the computation is carried out correctly. But this is misleading, since this is practically the only case where this happens.
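To answer the parenthetical question, the quadratic variation can be read off from the SDE (a sketch) using the usual multiplication rules $dt \cdot dt = 0$, $dt \cdot dW_t = 0$, $(dW_t)^2 = dt$:

$$(dB_t)^2 = \left(\frac{B_t}{2}\,dt + B_t\,dW_t\right)^2 = \frac{B_t^2}{4}(dt)^2 + B_t^2\,dt\,dW_t + B_t^2\,(dW_t)^2 = B_t^2\,dt.$$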

Instead, let us pick another process such that the Ito formula and standard calculus disagree. Set $W_t$ to be a standard BM and $$ X_t = \mu t + \sigma W_t^3 $$ so that the standard calculus approach gives $$ dX_t = \mu dt + 3\sigma W_t^2 dW_t $$ and the Ito lemma yields $$ dX_t = (\mu + 3 \sigma W_t) dt + 3 \sigma W_t^2 dW_t. $$ Then, running the same code as you, but modified to this process, I get the following results.
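Spelling out the Ito computation above for $f(t, w) = \mu t + \sigma w^3$, the partial derivatives are

$$\frac{\partial f}{\partial t} = \mu, \qquad \frac{\partial f}{\partial w} = 3\sigma w^2, \qquad \frac{\partial^2 f}{\partial w^2} = 6\sigma w,$$

so

$$dX_t = \mu\,dt + 3\sigma W_t^2\,dW_t + \tfrac{1}{2}(6\sigma W_t)\,dt = (\mu + 3\sigma W_t)\,dt + 3\sigma W_t^2\,dW_t,$$

and the extra drift $3\sigma W_t\,dt$ is precisely what the naive chain rule misses.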

library(ggplot2)

set.seed(1)

# parameters
n <- 1000
dt <- 0.01
mu <- 0.05
sigma <- 0.2
t <- seq(0, (n - 1) * dt, dt)
Wt <- cumsum(rnorm(n, 0, sqrt(dt)))
Xt <- mu * t + sigma * (Wt ^ 3)

p1 <- ggplot(data.frame(Time = t, Xt = Xt), aes(Time, Xt)) + geom_line() + labs(title = paste("Original Equation: Xt = mu*t + sigma*Wt^3\nmu =", mu, ", sigma =", sigma)) + theme_bw()

Approach 1: Basic Differencing

# dXt_1 = Xt[i+1] - Xt[i]
dXt_1 <- c(0, diff(Xt))
p2 <- ggplot(data.frame(Time = t, dXt_1 = dXt_1), aes(Time, dXt_1)) + geom_line() + labs(title = paste("Approach 1: Basic Differencing\nXt[i+1] - Xt[i]\nmu =", mu, ", sigma =", sigma)) + theme_bw()

Approach 2: Incorrect Derivative using basic calculus

dWt <- c(0, diff(Wt))
dXt_2 <- mu * dt + 3 * sigma * (Wt ^ 2) * dWt
p3 <- ggplot(data.frame(Time = t, dXt_2 = dXt_2), aes(Time, dXt_2)) + geom_line() + labs(title = paste("Approach 2: Incorrect Derivative\nmu =", mu, ", sigma =", sigma)) + theme_bw()

Approach 3: Correct Derivative using Ito's Lemma

dXt_3 <- mu * dt + 3 * sigma * (Wt ^ 2) * dWt + 3 * sigma * Wt * dt
p4 <- ggplot(data.frame(Time = t, dXt_3 = dXt_3), aes(Time, dXt_3)) + geom_line() + labs(title = paste("Approach 3: Correct Derivative using Ito's lemma\nmu =", mu, ", sigma =", sigma)) + theme_bw()

abs_diff_1_3 <- abs(dXt_1 - dXt_3)
abs_diff_2_3 <- abs(dXt_2 - dXt_3)

p5 <- ggplot(data.frame(Time = t, AbsDiff = abs_diff_1_3), aes(Time, AbsDiff)) + geom_line() + labs(title = "Absolute Difference between Approach 1 and 3") + theme_bw()

p6 <- ggplot(data.frame(Time = t, AbsDiff = abs_diff_2_3), aes(Time, AbsDiff)) + geom_line() + labs(title = "Absolute Difference between Approach 2 and 3") + theme_bw()

[Figure: Absolute Difference between Approach 2 and 3]

Do you still believe that Ito's lemma is just a theoretical exercise? In fact, note that the difference is $3\sigma W_tdt$, so we actually expect the error to grow as $t \to \infty$ in this case! If n = 10000 instead in the above code, we get the following error plot:

[Figure: Absolute Difference between Approach 2 and 3, n = 10000]

daisies
  • No, $B_t^2$ is quite literally $B_t \cdot B_t$; is there a reference for why it should be the variance? Also, note that $\log(B_t)$ is not well-defined for a standard Brownian motion $B_t$, since $B_t$ is not strictly positive, and no matter where you start it, $\log(B_t)$ goes to $-\infty$ in finite time, so things become kind of tricky. One could work up to a stopping time, but I think that this might just obfuscate the original point. – daisies Jan 05 '24 at 05:48
  • If you would like, I can transcribe how to compute the derivative in the case of $X_t = \log(B_t)$, $B_0 = 1$, where we work only on $t < T$ with $T = \min\{ t \mid B_t = 0 \}$. It turns out that a local version of Ito holds anyway. – daisies Jan 05 '24 at 05:53
  • Sorry - I reverted it back to Geometric Brownian Motion. "Bt <- exp(cumsum(rnorm(n, 0, sqrt(dt))))" is simulating a geometric Brownian motion, correct? – stats_noob Jan 05 '24 at 05:59
  • Re: Bt^2 = Bt*Bt ... I thought there is something called the "quadratic variation of a Brownian motion" which shows us how to calculate Bt^2, and that Bt^2 is not simply Bt*Bt. I guess my understanding is wrong? Bt^2 = Bt*Bt? – stats_noob Jan 05 '24 at 06:00
  • Yes, cumsum(...) is a Brownian motion, and taking an exponential gives a geometric Brownian motion. See https://en.wikipedia.org/wiki/Geometric_Brownian_motion – daisies Jan 05 '24 at 06:01