
Just to make things clear: I'm not claiming I broke probability theory. It's just that I got myself into a bad situation, questioning whether the problem below even has a well-defined answer.

So here is a problem:

Problem: Let $(B_t)_{t\geq 0}$ be a standard Brownian motion and condition on $\{B_1=0\}$. Let $A\in \mathcal F_1$, where $(\mathcal F_t)_{t\geq 0}$ is the canonical filtration of $(B_t)_{t\geq 0}$; for example, $A=\{B_t\leq 1 \text{ for all } t\in [0,1]\}$. Find $\mathbb P(A\mid B_1=0)$.

I saw this problem in the book "A First Course in Stochastic Processes" by Karlin and Taylor (Exercise 6, p. 386).

My solution to the problem: I can give the simple answer "it is zero", i.e. $$\mathbb P(A\mid B_1=0)=0$$ (in fact I could say any number between $0$ and $1$, since we are conditioning on a null event). On the other hand, I can of course do some calculations and provide an answer that is better accepted.

So now my question is:

My question: on what basis can one actually tell me that my first answer, the one claiming it is zero, is wrong?

My own thoughts: 

  • We want to find a "nice" function $g$ for which $g(B_1)=\mathbb P(A|B_1)$ a.s., and then the answer is $g(0)$. But then we run into the problem that $\mathbb P(A|B_1)$ is unique only up to null sets, so we can find another function $h$ with $h(B_1)=\mathbb P(A\mid B_1)$ a.s. and yet $g(0)\neq h(0)$.
  • That is apparently not strong enough to give a unique answer to the original problem. Let's go for something stronger and ask for a regular conditional probability $g(x,A)$ with $g(B_1,A)=\mathbb P(A|B_1)$ a.s. But here too, nothing stops me from taking a new function $h(x,A)$ equal to $g(x,A)$ everywhere except at $x=0$, where I make it whatever I want. And yes, that new $h$ is also a regular conditional probability.
  • Are limits the only way to get a unique answer? I mean conditioning on something like $U_{\varepsilon}:=\{B_1\in (-\varepsilon,\varepsilon)\}$ and taking the limit $\lim_{\varepsilon\to 0^+}\mathbb P(A\mid U_\varepsilon)$ as the definition. I hate to say this, but if that is the case, does it always work, for any type of process?
  • Something like Doob's $h$-transform maybe? I still have the feeling that this won't make it unique either.
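For what it's worth, for the concrete event $A=\{B_t\le 1 \text{ for all } t\in[0,1]\}$ the limiting procedure above can be checked numerically: conditioning on $B_1=0$ in that limiting sense yields a standard Brownian bridge, and the reflection principle gives $\mathbb P(\max_{t\le 1} \text{bridge} \le a) = 1-e^{-2a^2}$. A minimal Monte Carlo sketch (assuming NumPy; the time discretization underestimates the running maximum, so the estimate is biased slightly upward):

```python
import numpy as np

# Estimate P(max_{t<=1} B_t <= 1 | B_1 = 0) by simulating a Brownian
# bridge directly: bridge_t = W_t - t * W_1 pins the endpoint at 0.
rng = np.random.default_rng(0)
n_paths, n_steps = 10_000, 500
dt = 1.0 / n_steps

increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.cumsum(increments, axis=1)            # Brownian paths on (0, 1]
t = np.arange(1, n_steps + 1) * dt
bridge = W - t * W[:, -1:]                   # pin B_1 = 0

estimate = (bridge.max(axis=1) <= 1.0).mean()
exact = 1.0 - np.exp(-2.0)                   # reflection principle, a = 1
print(estimate, exact)                       # estimate within a few percent of 0.8647
```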

I feel pretty shaken by this. I've seen this many times and never made a big deal out of it, but while solving a related problem I started wondering: who says any other answer is actually wrong? I could not prove it. I also know that probabilists' work was not for nothing, so I'm sure there is a way to make $\mathbb P(A\mid B_1=0)$ precise enough that the mentioned problem has exactly one correct answer.

Shashi
  • Is there anything I can improve in this post? – Shashi Nov 02 '20 at 09:46
  • Could you provide context for where you saw $\mathbb{P}(A|B_1 = 0)$? Most rigorous texts condition on sigma algebras instead of events of measure $0$, for the reasons you laid out in this question. Even regular conditional distributions should only be defined almost surely. Maybe it's being used informally, the same way you might accidentally mention the value of the pdf at a point (which is similarly ill-defined)? That is, perhaps there's some easily representable (continuous?) function $f$ such that $f(x) = \mathbb{P}(A|B_1=x)$ a.s., and the question is implicitly asking for $f(0)$? – forgottenarrow Nov 04 '20 at 03:41
  • @forgottenarrow I added where I found the problem. In the book they actually ask about the maximum of Brownian motion staying under a particular line, given that it is zero at $1$. So it's more or less the problem I have above, with a particular choice of $A$. – Shashi Nov 04 '20 at 08:29

1 Answer


I've explained some of the difficulties in conditioning on events of probability zero in my answer here. In essence, to pin down a notion of conditional probability given an event of probability zero, some additional input is required, such as a symmetry principle, or a partition of hypotheses with respect to which it is supposed to make sense. In your case, one might require that the law of total probability hold for the choices you make for $\mathbb{P}(A|B_1=x)$, $x \in \mathbb{R}$, which leads to the concept of disintegrations. However, that pins down $x \mapsto \mathbb{P}(A|B_1=x)$ only for almost every $x$, and you may still define it any way you like at $x = 0$.
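To make the disintegration requirement concrete (writing $\varphi$ for the standard normal density of $B_1$): the law of total probability demands that for every Borel set $C \subseteq \mathbb{R}$,

$$\mathbb P\big(A \cap \{B_1 \in C\}\big) = \int_C \mathbb P(A \mid B_1 = x)\,\varphi(x)\,dx,$$

and modifying $x \mapsto \mathbb P(A\mid B_1=x)$ on a Lebesgue-null set such as $\{0\}$ leaves every such integral unchanged.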

To rule out pathological choices, it is usually required that $x \mapsto \mathbb{P}(A|B_1=x)$ be measurable for every $\mathcal{F}$-measurable event $A$, and that $A \mapsto \mathbb{P}(A|B_1=x)$ be a probability measure for every $x \in \mathbb{R}$. This is precisely the definition of a regular conditional distribution, which is unique only up to almost sure equivalence. When asked to evaluate a regular conditional distribution pointwise, common wisdom is to choose the value of a continuous representative (if one exists) at that point.

This is similar to the following question: what is the value of the density of a standard normal variable at $x = 0$? Since densities are only unique up to almost everywhere equivalence, this is actually an ill-posed question, and any value in $\mathbb{R}$ is a technically 'correct' answer. However, I believe that most people would interpret this question as 'what is the value of the continuous representative of the density of a standard normal variable at $x=0$?', which is a well-posed question with the unique answer of $1/\sqrt{2\pi}$. In your question, the same approach yields a unique answer, so perhaps the authors really meant to ask 'what is the value of the continuous representative of $x \mapsto \mathbb{P}(A|B_1=x)$ at $x=0$?'.
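In the example from the question this continuous representative is explicit (a standard Brownian bridge computation): under the conditioned law, the path from $0$ to $x$ on $[0,1]$ is a Brownian bridge, and the reflection principle gives, for $a \ge \max(x,0)$,

$$\mathbb P\Big(\max_{0\le t\le 1} B_t \ge a \,\Big|\, B_1 = x\Big) = e^{-2a(a-x)}.$$

With $A=\{B_t\le 1 \text{ for all } t\in[0,1]\}$ and $a=1$, the continuous representative is $g(x) = 1 - e^{-2(1-x)}$ for $x \le 1$ (and $g(x)=0$ for $x \ge 1$), so the intended answer would be $g(0) = 1 - e^{-2} \approx 0.8647$.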

Regarding defining conditional distributions through limits, as you propose: this is a natural idea, but note that it leads to an irregular conditional distribution. To see this, define $\mu(A) := \lim_{\epsilon \to 0} \mathbb{P}(A|U_{\epsilon})$. Then, for any $\delta > 0$, we have $\mu(U_{\delta}) = 1$, but $\mu(\{B_1 = 0\}) = 0$. Since the sets $U_\delta$ shrink as $\delta \downarrow 0$ with $\bigcap_{\delta > 0} U_{\delta} = \{B_1 =0\}$, continuity from above would force $\mu(\{B_1 = 0\}) = \lim_{\delta \to 0^+} \mu(U_\delta) = 1$, a contradiction; hence $\mu$ is not a probability measure.
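A small sanity check of this, assuming only the standard library: here $\mathbb P(B_1 \in (-\varepsilon,\varepsilon)) = \operatorname{erf}(\varepsilon/\sqrt 2)$ for a standard normal $B_1$, and $U_\varepsilon \subseteq U_\delta$ once $\varepsilon \le \delta$.

```python
import math

def p_interval(eps):
    """P(B_1 in (-eps, eps)) for B_1 ~ N(0, 1)."""
    return math.erf(eps / math.sqrt(2.0))

delta = 0.1
# P(U_delta | U_eps) = P(U_min(delta, eps)) / P(U_eps): equals 1 once eps <= delta
conds = [p_interval(min(delta, eps)) / p_interval(eps)
         for eps in (0.1, 0.01, 0.001)]
print(conds)  # [1.0, 1.0, 1.0]

# Meanwhile P({B_1 = 0} | U_eps) = 0 for every eps > 0, so the limiting
# set function mu gives mu(U_delta) = 1 but mu({B_1 = 0}) = 0.
```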

user159517
  • Thank you very much! The last part is illuminating! Also, I may not totally agree with the part about mathematicians naming the value at $x=0$. Maybe reformulate it like this: they choose the continuous density instead of another one that is discontinuous at $0$, say... Maybe that's what you meant? Anyway, I get the point! – Shashi Dec 15 '21 at 06:23
  • To clarify my point: there are people who are super aware of uniqueness, and whenever there is no uniqueness they'll hit you with "your question does not have a unique answer", if you know what I mean. I'm one of those people. – Shashi Dec 15 '21 at 07:12
  • @Shashi fair enough, that was not the best formulation; I changed it. To some extent I am the same, but if there is something that is, in my view, a canonical choice, I often assume that is what is meant and see where it leads. – user159517 Dec 15 '21 at 10:16
  • yes now it's good!! – Shashi Dec 15 '21 at 21:39