Let $B(\omega,t)$ be a Brownian motion defined on a filtered probability space $(\Omega,\mathcal{F},\{\mathcal{F}_{t}\},\mathbb{P})$, and let $f(\omega,t)$ be a stochastic process on $\Omega$ adapted to $\{\mathcal{F}_{t}\}$.
The stochastic integral of $f$ with respect to $B$ is defined, over a sequence of partitions $\Pi_{n}=\{0=t_{0}<t_{1}<\cdots<t_{n}=t\}$ of $[0,t]$ with mesh $\|\Pi_{n}\|\to0$ as $n\to\infty$, by $$\underbrace{\int_{0}^{t}f(\omega,s)\;dB(\omega,s)}_{I(\omega,t)}:=\lim_{n\to\infty}\underbrace{\sum_{j=0}^{n-1}f(\omega,t_{j})[B(\omega,t_{j+1})-B(\omega,t_{j})]}_{I_{n}(\omega,t)},$$ where the limit is taken in $L^{2}$, in $L^{1}$, or in probability, but not pointwise (pathwise) in $\omega$.
In other words, writing $I_{n}^{t}:=I_{n}(\cdot,t)$ and $I^{t}:=I(\cdot,t)$, $$\|I_{n}^{t}-I^{t}\|_{L^{2}(\Omega)}\to0\;\text{as}\;n\to\infty$$ or $$\lim_{n\to\infty}\mathbb{P}\left(\omega:|I_{n}^{t}(\omega)-I^{t}(\omega)|\geq\epsilon\right)=0\;\text{for all}\;\epsilon>0.$$
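To make the $L^{2}$ convergence above concrete, here is a small numerical sketch (the setup and the choice $f=B$ are mine, purely for illustration). Taking $f(\omega,s)=B(\omega,s)$, Itô's formula gives the closed form $\int_{0}^{t}B\,dB=\tfrac{1}{2}(B_{t}^{2}-t)$, so one can estimate $\mathbb{E}|I_{n}^{t}-I^{t}|^{2}$ by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)
t, N_fine, n_paths = 1.0, 2**14, 2000

# Brownian paths on a fine grid; coarser partitions are taken as subgrids.
dB = rng.normal(0.0, np.sqrt(t / N_fine), size=(n_paths, N_fine))
B = np.hstack([np.zeros((n_paths, 1)), np.cumsum(dB, axis=1)])

# "Exact" value of I = int_0^t B dB via Ito's formula: (B_t^2 - t) / 2.
I_exact = 0.5 * (B[:, -1] ** 2 - t)

for n in [2**4, 2**6, 2**8, 2**10]:    # number of partition intervals
    step = N_fine // n
    Bn = B[:, ::step]                  # B sampled on the coarse partition
    I_n = np.sum(Bn[:, :-1] * np.diff(Bn, axis=1), axis=1)  # left-point sums
    print(f"n = {n:5d}   E|I_n - I|^2 ~ {np.mean((I_n - I_exact) ** 2):.5f}")
```

If I have computed correctly, the Itô isometry predicts the error behaves like $t^{2}/(2n)$, and the printed values are consistent with that; but the point is only that the mean-square error shrinks, which by itself says nothing about what happens along an individual path.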
Since $t$ is held fixed throughout the limiting process, consider just a sequence of random variables $\{X_{n}(\omega)\}$. Intuitively, the difference between convergence in probability and pointwise convergence is this: in both cases the probability of the set on which $X_{n}$ and $X$ disagree (by more than $\epsilon$) goes to $0$, but convergence in probability allows this set to move around the domain $\Omega$ as $n$ grows, whereas pointwise convergence requires it to settle down, in the sense that each fixed $\omega$ may land in the bad set only finitely often. In other words, the total probability of the event $\{|X_{n}(\omega)-X(\omega)|>\epsilon\}$ must go to $0$ in both modes of convergence, but for convergence in probability the location of this event within the sample space can keep changing (so a given $\omega$ may be revisited infinitely often), while for pointwise convergence it cannot.
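For reference, the standard toy example I have in mind for this distinction is the "typewriter" sequence: on $\Omega=[0,1]$ with Lebesgue measure, set $$X_{2^{k}+j}=\mathbf{1}_{[j2^{-k},\,(j+1)2^{-k}]},\qquad 0\leq j<2^{k}.$$ Then $\mathbb{P}(|X_{n}|>\epsilon)\leq 2^{-k}\to0$, so $X_{n}\to0$ in probability, yet for every fixed $\omega$ the sliding interval returns to $\omega$ infinitely often, so $\limsup_{n}X_{n}(\omega)=1$ and pointwise convergence fails everywhere. What I am asking is what the analogue of this sliding interval looks like for $I_{n}$.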
Is there a way to understand this concretely in the case of the stochastic integral? For instance, what is the nature of the "shuffling" around the sample space that causes pathwise convergence to fail, and of the shrinking of the probability of "bad" events to $0$ that still allows convergence in probability? I am willing to accept an answer for the analogous situation with the quadratic variation $[B,B](\omega,t)=t$, which holds when the convergence of the approximating sums is taken in probability (and can fail pathwise for general sequences of partitions).
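To make the quadratic-variation case concrete as well, here is another numerical sketch of mine: with $Q_{n}=\sum_{j}(B_{t_{j+1}}-B_{t_{j}})^{2}$ over the uniform partition of mesh $t/n$, one can watch $\mathbb{P}(|Q_{n}-t|>\epsilon)$ shrink:

```python
import numpy as np

rng = np.random.default_rng(1)
t, n_paths, eps = 1.0, 5000, 0.05

# Q_n = sum of squared Brownian increments over the uniform partition of mesh t/n.
# Fresh paths are drawn for each n: only the marginal probability of the
# bad event {|Q_n - t| > eps} matters for convergence in probability.
for n in [10, 100, 1000, 10000]:
    dB = rng.normal(0.0, np.sqrt(t / n), size=(n_paths, n))
    Q_n = np.sum(dB ** 2, axis=1)
    print(f"n = {n:6d}   P(|Q_n - t| > {eps}) ~ {np.mean(np.abs(Q_n - t) > eps):.4f}")
```

Of course this only exhibits convergence in probability; it does not show *which* $\omega$ are bad at each $n$, which is exactly the "shuffling" I am asking about.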
The main motivation for this question is to build intuition for calculations involving stochastic integrals (e.g. Itô's lemma): one would like to think of them as taking place pointwise, but in fact they take place only after convergence in probability.
**EDIT:** Added some comments to narrow the question being asked.
Since convergence in probability implies a.s. convergence along a subsequence, if we take a rapidly refining sequence of partitions $\Pi_{n}$, we recover pathwise convergence. Is it possible to exhibit two explicit sequences of partitions, one of which yields convergence in both modes while the other yields convergence in probability only? Furthermore, does the uniform partition with mesh $t/n$ refine rapidly enough to achieve a.s. convergence?
– Sargera Dec 21 '14 at 02:41
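Regarding the last question in the comment above, the Chebyshev/Borel–Cantelli computation I know for the quadratic-variation case goes as follows. Over the uniform partition of mesh $t/n$, the increments are independent with $\operatorname{Var}\big((\Delta_{j}B)^{2}\big)=2(t/n)^{2}$, so $$\mathbb{E}[Q_{n}]=t,\qquad\operatorname{Var}(Q_{n})=\frac{2t^{2}}{n},\qquad\mathbb{P}(|Q_{n}-t|>\epsilon)\leq\frac{2t^{2}}{n\epsilon^{2}}.$$ The bound $2t^{2}/(n\epsilon^{2})$ is not summable in $n$, so Borel–Cantelli does not apply directly along $n=1,2,3,\ldots$; along the subsequence $n=2^{k}$, however, $\sum_{k}2t^{2}2^{-k}\epsilon^{-2}<\infty$, which does give a.s. convergence along dyadic partitions. This only shows that the naive second-moment bound fails along the full sequence, not that a.s. convergence actually fails there; a fourth-moment bound $\mathbb{E}|Q_{n}-t|^{4}=O(n^{-2})$ is summable, which suggests the answer for the $t/n$ partition is yes, but I would like to see this confirmed.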