
For example, let $X$ be the weight of a dog. I weigh my dog, and his weight is 10 lb.

The theorem says the probability of observing exactly 10 lb is zero, i.e. $P(X=10) = 0$.

However, I DO observe this concrete weight from my real dog, so the probability should be positive.

I can't wrap my head around this idea.

Can you explain why we observe some realizations of a continuous RV when the theorem says that their probability should be 0?

Thanks!

(Some people say due to rounding, so the weight may be 10.000023000005000010... lb, it is ok with me if we use this number, but again how can we observe this 10.000023000005000010... number if the probability is 0?)

  • Thanks to the wonders of measure theory as applied to probability, there is no contradiction. – Henry Nov 16 '15 at 00:44
  • Note that a continuous probability model is precisely that, a mathematical model. In the real world, there is probably no such thing as the instantaneous weight of your dog at a certain time. But a mathematical model need not be right in order to be useful. – André Nicolas Nov 16 '15 at 00:48
  • Related: http://math.stackexchange.com/questions/920241/can-an-observed-event-in-fact-be-of-zero-probability?rq=1 What is going on is that the event that your dog's weight is exactly 10 is a "null set" (a set with measure zero). That is, the probability of picking a dog at random and finding that its weight is exactly 10 is less than $\epsilon$ for all $\epsilon>0$. Equivalently, the probability that its weight is something other than exactly 10 is greater than $1-\epsilon$ for all $\epsilon>0$. Worded differently, it is almost always not $10$. – JMoravitz Nov 16 '15 at 00:54

3 Answers

2

First of all, there exist two main types of random variables: the discrete type and the absolutely continuous type. For example, if you ask someone to give you an integer between $1$ and $10$, there are only $10$ possible values for $X$. This is a discrete random variable.

Now, ask this person to choose a real number between $1$ and $10$: there are uncountably many possible values for $X$ in this interval. This is an "absolutely continuous random variable".

Here, you treat $X$ as an absolutely continuous random variable because you consider that every positive real value (acceptable for the weight of a dog) is possible. In the sequel, I will not be fully rigorous, but it can give you an idea of why we say that $\mathbb{P}[X=x]=0$. I give you an intuitive explanation and a more "rigorous" one.

Intuitively:

Consider that the acceptable values for the weight of a dog are in $[5,15]$ and that your dog has the same probability of having a particular weight $a$ as any other weight in this interval. The probability that your dog weighs a very particular value is $1$ over the number of reals in $[5,15]$, which is clearly infinite! And "$1$ over $\infty$ is $0$". It doesn't mean it will never happen that your dog weighs a particular weight $a$, but if you pick a random number in $[5,15]$, it will "almost always" not be the actual weight of your dog.

I suppose you know what an integral is. In the absolutely continuous case, a random variable $X$ admits a density $f_{X}$ and, as you probably know, we have

$$\mathbb{P}[a\le X\le b]=\int_{a}^{b}f_{X}(x)\text{d}x$$

which denotes the probability that $X$ takes values between $a$ and $b$. But if you take $a=b$, which corresponds to $\mathbb{P}[X=a]$, you have

$$\mathbb{P}[a\le X \le a]=\int_{a}^{a}f_{X}(x)\text{d}x=0=\mathbb{P}[X=a]$$
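As a small numerical sketch of this formula, assume (as in the intuitive explanation above) that the weight is uniform on $[5,15]$, so the density is $1/10$ on that interval. The names `LOW`, `HIGH` and `prob_between` are hypothetical, chosen only for illustration. The probability of landing in a window around 10 lb shrinks with the window, and it is exactly zero when the window has zero width.

```python
from fractions import Fraction

LOW, HIGH = 5, 15  # assumed range of plausible weights, in lb


def prob_between(a, b):
    """P(a <= X <= b) for X uniform on [LOW, HIGH]: the integral of the
    constant density 1/(HIGH - LOW) over the part of [a, b] inside the range."""
    left = max(Fraction(a), Fraction(LOW))
    right = min(Fraction(b), Fraction(HIGH))
    if right <= left:
        return Fraction(0)  # empty or zero-width window
    return (right - left) / (HIGH - LOW)


# Shrinking windows around 10 lb: the probability shrinks with the width.
for half_width in [Fraction(1), Fraction(1, 10), Fraction(1, 1000), Fraction(0)]:
    p = prob_between(10 - half_width, 10 + half_width)
    print(f"P({float(10 - half_width)} <= X <= {float(10 + half_width)}) = {p}")
```

Exact fractions are used here instead of floats so that the zero-width case gives an exact zero and not a rounding artifact.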

"Formally":

This part is not really rigorous, but it gives you some insight into the idea behind the formalism of probability theory. Be aware that it is not complete and that you should not rely on it for rigorous work.

The axiomatization of probability theory is based on measure theory. In measure theory, we define certain nonnegative functions on particular subsets of a set $\Omega$; these functions are called measures. They have to fulfill some properties, but I won't talk about them now.

A very particular measure we need here is the so-called Lebesgue measure $\mu$. To keep things simple, we restrict ourselves to $\Omega\subset\mathbb{R}$: the measure of an interval $(a,b)\subset\Omega$ is $\mu(a,b)=b-a$ (where $b\geq a$). You can intuitively see that the measure of $[a,a]=\{a\}$ is $0$.
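This intuition can be made slightly more precise. The sketch below assumes only that a measure is monotone (if $A\subset B$, then $\mu(A)\le\mu(B)$), which is one of the properties not discussed here, and that $\epsilon$ is small enough for the interval to lie in $\Omega$:

$$\{a\}\subset(a-\epsilon,a+\epsilon)\quad\Longrightarrow\quad 0\le\mu(\{a\})\le\mu(a-\epsilon,a+\epsilon)=2\epsilon.$$

Since this holds for every such $\epsilon>0$, the only possible value is $\mu(\{a\})=0$.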

A property often required of a measure is "a certain additivity", which means that $\mu(A\cup B)=\mu(A)+\mu(B)$ when $A\cap B=\emptyset$. Together with $\mu(\{a\})=\mu(\{b\})=0$, this directly implies $$\mu(a,b)=\mu[a,b)=\mu(a,b]=\mu[a,b]=b-a$$ because, for instance, $(a,b)\cup\{b\}=(a,b]$ and $(a,b)\cap\{b\}=\emptyset$.

Actually, we generally want $\sigma$-additivity or "countable additivity", which refers to $$\mu\left(\bigcup_{i=1}^{\infty}A_{i}\right)=\sum_{i=1}^{\infty}\mu(A_{i})$$ where the $A_{i}$'s are pairwise disjoint. You can't take an uncountable number of $A_{i}$'s!

In the absolutely continuous case, like yours, the probability measure comes from this Lebesgue measure in a way I won't explain here. While the Lebesgue measure is not bounded (think about $\mu[0,\infty)$), a probability measure is bounded and, more precisely, we require $\mathbb{P}(\Omega)=1$, where $\Omega$ is the set of all possible values of your random variable.

Moreover, in the absolutely continuous case, the probability measure $\mathbb{P}$ is said to be dominated by $\mu$, which means that $\mu(A)=0$ for some set $A$ implies $\mathbb{P}(A)=0$. Here, your event $A$ is $X=x$, which corresponds to the set $\{x\}$, and we have seen that $\mu(\{x\})=0$, so $\mathbb{P}[X=x]=0$.

2

Consider a real-valued random variable uniformly distributed between zero and one. Such a number could theoretically be generated by rolling a ten-sided die an infinite number of times to list the decimal digits.

What is the probability that the result is between $0.314$ and $0.315$? Well, clearly $1/1000$.

But what is the probability that the result is exactly $\pi/10$? Well, zero: the result will almost certainly not be that number.

The more precision you demand, the less likely it is that the real-valued result will fall within that range, and if you demand infinite precision, it's almost certainly not going to happen.

We say such an event is almost impossible because, while it is not actually impossible, its probability is smaller than every positive number, so we cannot distinguish it from zero.
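The die-rolling picture can be sketched in a few lines, assuming a fair ten-sided die with independent rolls. The names `TARGET_DIGITS` and `prob_first_digits_match` are hypothetical and only serve the illustration. Matching the first $k$ digits of $\pi/10$ has probability $10^{-k}$, which drops below any positive threshold as $k$ grows.

```python
from fractions import Fraction

TARGET_DIGITS = "3141592653"  # first ten decimal digits of pi/10 = 0.3141592653...


def prob_first_digits_match(k):
    """Probability that k independent rolls of a fair ten-sided die
    reproduce the first k digits of the target: (1/10)**k."""
    return Fraction(1, 10) ** k


for k in range(len(TARGET_DIGITS) + 1):
    prefix = "0." + TARGET_DIGITS[:k]
    p = prob_first_digits_match(k)
    print(f"P(result starts with {prefix}) = {float(p):.0e}")
```

With $k=3$ this reproduces the $1/1000$ from the interval between $0.314$ and $0.315$.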

Graham Kemp
1

The fact that a specific event has probability $0$ does not mean that the event is impossible. Only that it's unlikely. Take, for instance, coin flipping. Let's say you flip a coin repeatedly until you get a heads, and at that point you stop. The probability that you never get to stop (i.e. only get tails) is $0$, but it is not impossible.

In fact, your dog hitting a (mathematically exact) weight has exactly the same probability as an infinite sequence of coin tosses has of following a given, infinite sequence of heads and tails (in other words: there are as many real numbers as there are infinite sequences consisting of $H$ and $T$).

The moral is that there are so many events to choose from that if every one of them had probability greater than $0$ of occurring, then the total probability would be infinite.
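For the coin-flipping example in this answer, here is a minimal sketch, assuming a fair coin simulated with Python's pseudo-random generator (the function names are hypothetical). The probability of seeing only tails in the first $n$ tosses is $2^{-n}$, which tends to $0$, and in a simulation every game ends after finitely many tosses.

```python
import random
from fractions import Fraction


def prob_all_tails(n):
    """Probability that the first n tosses of a fair coin are all tails."""
    return Fraction(1, 2) ** n


def tosses_until_heads(rng):
    """Flip a simulated fair coin until the first heads; return the number of tosses."""
    count = 1
    while rng.random() < 0.5:  # treat values below 0.5 as tails
        count += 1
    return count


rng = random.Random(0)  # fixed seed so the run is repeatable
longest = max(tosses_until_heads(rng) for _ in range(10_000))
print("longest game (number of tosses) among 10,000 games:", longest)
print("P(no heads in the first 50 tosses) =", float(prob_all_tails(50)))
```

The simulation cannot show the probability-zero event itself; it only illustrates that long runs of tails become rare very quickly.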

Arthur
  • I always thought that probability 0 means that an event is impossible to occur, maybe this is where I got confused.

    By the way, I'm not trying to discredit you or anything, but is this: "Probability 0 does not mean that an event is impossible. Only that it's unlikely" just your opinion, or is it from a source such as a book?

    – user262959 Nov 16 '15 at 01:20
  • @user262959 In a discrete finite case, yes, a probability $0$ event means it is impossible to occur. In a continuous case, not exactly. And it is not his opinion: it is an intuitive meaning of the theory. Formally, we say that the particular event "your dog has the weight $a$" "almost never occurs" and it is an interesting terminology since it gives you an intuitive idea of why it is $0$ but not impossible. – MoebiusCorzer Nov 16 '15 at 01:35
  • @MoebiusCorzer Not even in a discrete, finite case. Take the discrete, finite variable which is $1$ if the dog weighs more than $10$, zero if the dog weighs less than $10$ and $1/2$ if the dog weighs exactly $10$. Then it will be close to a Bernoulli distribution, but not quite the same, since $0.5$ occurs almost never, but not never. – Arthur Nov 16 '15 at 01:42
  • @Arthur OK, I get it. Nice example. I wrongly thought about $X:(\Omega,\mathcal{A})\to (\mathbb{R},\mathcal{B})$ where $\Omega$ was finite instead of $X(\Omega)$ finite. – MoebiusCorzer Nov 16 '15 at 02:52