4

I'm studying probability theory, and, as is well known in real analysis and measure theory, there are several modes in which a random variable (or, more generally, any sequence of measurable functions on a given measure space) can converge: convergence in measure (in the probabilistic setting, convergence in probability), convergence in $L^1$ norm (or $L^p$ norm for any $p \geq 1$), convergence almost everywhere (in the probabilistic setting, convergence almost surely), and other modes of convergence that this question does not mention.
I'm trying to develop a probabilistic intuition for what convergence in each of these modes means.
For example, I believe convergence in probability means that if one tosses a fair coin many times, it becomes more and more probable that the fraction of heads is close to $\frac{1}{2}$ (thanks to the weak law of large numbers).
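This reading of the weak law can be checked empirically. Below is a small simulation sketch (Python, standard library only; the function name `deviation_prob` and the tolerance `eps` are my own choices, not from any source) estimating the probability that the fraction of heads deviates from $\frac{1}{2}$ by more than a fixed tolerance:

```python
import random

random.seed(0)

def deviation_prob(n, eps=0.1, trials=2000):
    """Estimate P(|fraction of heads - 1/2| > eps) over n fair tosses."""
    bad = 0
    for _ in range(trials):
        heads = sum(random.randint(0, 1) for _ in range(n))
        if abs(heads / n - 0.5) > eps:
            bad += 1
    return bad / trials

# The estimated deviation probability shrinks as n grows, which is exactly
# the statement "fraction of heads -> 1/2 in probability".
for n in (10, 100, 1000):
    print(n, deviation_prob(n))
```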
But what about the other modes of convergence?
Also note that convergence a.e. implies convergence in probability, so the same argument about the fair coin applies again; but I think something is lost in repeating that example, since convergence in probability does not necessarily imply convergence a.e. (at least, certain additional conditions are required for that implication to hold, and it would be very helpful if someone could give probabilistic intuition for what those conditions mean).
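To make the gap concrete, a standard counterexample is the "typewriter" (sliding indicator) sequence on $[0,1]$ with Lebesgue measure: writing $n = 2^k + j$ with $0 \leq j < 2^k$, let $X_n$ be the indicator of $[j/2^k, (j+1)/2^k)$. Then $P(X_n = 1) = 2^{-k} \to 0$, so $X_n \to 0$ in probability, yet each block of indices sweeps across the whole interval, so $X_n(\omega) = 1$ infinitely often and the sequence converges at no sample point. A sketch (Python; the function name is illustrative):

```python
import random

random.seed(1)

def typewriter(n, omega):
    """X_n(omega) for the sliding-indicator ("typewriter") sequence.

    Write n = 2**k + j with 0 <= j < 2**k; X_n is the indicator of the
    interval [j / 2**k, (j + 1) / 2**k).
    """
    k = n.bit_length() - 1
    j = n - 2**k
    return 1 if j / 2**k <= omega < (j + 1) / 2**k else 0

omega = random.random()  # one fixed sample point in [0, 1)
# Each block k = 0, ..., 11 partitions [0, 1), so omega is hit exactly once
# per block: X_n(omega) = 1 infinitely often, despite P(X_n = 1) -> 0.
hits = [n for n in range(1, 2**12) if typewriter(n, omega) == 1]
print(len(hits))  # one hit per block: 12
```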
Thanks.

1 Answer

1

I always interpreted it like this:

  1. Convergence in probability: this means that the probability of something unusual happening gets smaller and smaller with time or, more formally, that deviations from the limiting value become less and less likely as $n$ grows.

  2. Convergence almost surely: this needs to be stronger than 1., and it can be interpreted as the convergence itself happening with probability 1: the set of outcomes on which the sequence fails to converge has probability zero. For example, if a mayfly consumes $X_n$ amount of food on day $n$, we can say with probability 1 that from some day on (after its death), it will not consume anything anymore.

  3. Convergence in $L^p$: this one always seemed more formal to me, and I'm not sure I can give much intuition here, but it is also known as convergence in mean. It means that, on average, the deviations from the limit shrink to zero as $n$ becomes large. Mostly I find that this mode of convergence has technical value, since we can use it to manipulate integral expressions. Its usefulness also shows in the fact that $L^2$ convergence is just MSE (mean squared error) convergence, the most common loss considered in statistics.
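A standard example separating modes 1 and 3 (my own illustration, not part of the answer above): take $X_n = n$ with probability $1/n$ and $X_n = 0$ otherwise. Then $X_n \to 0$ in probability, but $\mathbb{E}|X_n| = 1$ for every $n$, so there is no convergence in $L^1$. A simulation sketch in Python (function names are my own):

```python
import random

random.seed(2)

def sample_X(n):
    """One draw of X_n, where X_n = n with probability 1/n and 0 otherwise."""
    return n if random.random() < 1 / n else 0

def summarize(n, trials=20000):
    xs = [sample_X(n) for _ in range(trials)]
    prob_nonzero = sum(x != 0 for x in xs) / trials  # -> 0: conv. in probability
    mean_abs = sum(xs) / trials                      # stays near 1: no L^1 conv.
    return prob_nonzero, mean_abs

# P(X_n != 0) shrinks while E|X_n| stays near 1.
for n in (10, 100, 1000):
    print(n, *summarize(n))
```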

LSK21
  • 900