I attempted to answer this question on Quora, and was told that I am thinking about the problem incorrectly. The question was:

Two distinct real numbers between 0 and 1 are written on two sheets of paper. You have to select one of the sheets randomly and declare whether the number you see is the biggest or smallest of the two. How can one expect to be correct more than half the times you play this game?

My answer was that it was impossible, as the probability should always be 50% for the following reason:

You can't! Here's why:

The set of real numbers between (0, 1) is known as an Uncountably Infinite Set (https://en.wikipedia.org/wiki/Uncountable_set). A set that is uncountable has the following interesting property:

Let $\mathbb{S}$ be an uncountably infinite set. Let $a, b, c, d \in \mathbb{S}$ $(a \neq b, c \neq d)$. If $x$ is an uncountably infinite subset of $\mathbb{S}$ containing all elements of $\mathbb{S}$ on the interval $(a, b)$, and $y$ is another uncountably infinite subset of $\mathbb{S}$ containing all elements of $\mathbb{S}$ on the interval $(c, d)$, then $x$ and $y$ have the same cardinality (size)!

So for example, the set of all real numbers between (0, 1) is actually the exact same size as the set of all real numbers between (0, 2)! It is also the same size as the set of all real numbers between (0, 0.00001). In fact, if you have an uncountably infinite set on the interval $(a, b)$, and $a<n<b$, then exactly 50% of the numbers in the set are greater than $n$, and 50% are less than $n$, no matter what you choose for $n$. This is important because it tells us something unintuitive about our probability in this case. Let's say the first number you picked is 0.03. You might think "Well, 97% of the other possible numbers are larger than this, so the other number is probably larger." You would be wrong! There are actually exactly as many numbers between (0, 0.03) as there are between (0.03, 1). Even if you picked 0.03, half of the other possible numbers are smaller than it, and half of the other possible numbers are larger than it. This means there is still a 50% probability that the other number is larger, and a 50% probability that it is smaller!

"But how can that be?" you ask, "why isn't $\frac{a+b}{2}$ the midpoint?"

The real question is, why do we believe that $\frac{a+b}{2}$ is the midpoint to begin with? The reason is probably the following: it seems to make the most sense for discrete (finite/countably infinite) sets. For example, suppose that instead of the real numbers we took the set of all multiples of $0.001$ on the interval $[0, 1]$. Now it makes sense to say that 0.5 is the midpoint, as we know that the number of numbers below 0.5 is equal to the number of numbers above 0.5. If we were to try to say that the midpoint is 0.4, we would find that there are now more numbers above 0.4 than there are below 0.4. This no longer applies when talking about the set of all real numbers $\mathbb{R}$. Strangely enough, we can no longer talk about having a midpoint in $\mathbb{R}$, because every number in $\mathbb{R}$ could be considered a midpoint. For any point in $\mathbb{R}$, the numbers above it and the numbers below it always have the same cardinality.

See the Wikipedia article on Cardinality of the continuum (https://en.wikipedia.org/wiki/Cardinality_of_the_continuum).

My question is, from a mathematical point of view, is this correct? The person who told me that this is wrong is fairly well known, and not someone who I would assume to often be wrong, especially for these types of problems.

The reasoning given for my answer being wrong was as follows:

Your conclusion is not correct.
You're right that the set of real numbers between 0 and 1 is uncountably infinite, and most of what you said here is correct. But that last part is incorrect. If you picked a random real number between 0 and 1, the number does have a 97% chance of being above 0.03. Let's look at this another way. Let K = {all integers divisible by 125423423}. Let M = {all integers not divisible by 125423423}. K and M are the same size, right? Does this mean, if you picked a random integer, it has a 50% chance of being in K and a 50% chance of not being in K? A random integer has a 50% chance of being divisible by 125423423?
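
Both claims in this response can be checked numerically. The following is only a minimal sketch under stated assumptions: "a random real number between 0 and 1" is taken to mean a draw from the uniform distribution (approximated by Python's pseudorandom `random.random()`), and "a random integer" is approximated by choosing uniformly from the finite range 1..n, since no uniform distribution exists on all integers. The function names `fraction_above` and `fraction_divisible` are made up for this illustration.

```python
import random


def fraction_above(threshold, trials=200_000, seed=0):
    """Estimate P(X > threshold) for X drawn uniformly from [0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(rng.random() > threshold for _ in range(trials))
    return hits / trials


def fraction_divisible(d, n):
    """Exact fraction of the integers 1..n that are divisible by d.

    Assumption: a "random integer" is modelled as uniform on 1..n,
    and we look at what happens as n becomes large.
    """
    return (n // d) / n


if __name__ == "__main__":
    # Should come out close to 0.97, not 0.5.
    print(fraction_above(0.03))
    # Should come out close to 1/125423423, not 0.5.
    print(fraction_divisible(125423423, 10**12))
```

Under these assumptions the first estimate tracks the length of the interval (0.03, 1), and the second tracks 1/125423423, even though the sets being compared have equal cardinality in both cases.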

I disagreed with this response because the last sentence should actually be true. If the set of all numbers that are divisible by 125423423 is the same size as the set of numbers that aren't, there should be a 50% probability of picking a random number from the first set, and a 50% chance that a number would be picked from the second. This is certainly the case with finite sets. If there are 2 disjoint, finite sets with equal cardinality, and you choose a random number from the union of the two sets, there should be a 50% chance that the number came from the first set, and a 50% chance that the number came from the second set. Can this idea be generalized for infinite sets of equal cardinality?

Is my answer wrong? If so, am I missing something about how the cardinalities of two sets relate to the probability of choosing a number from one of them? Where did I go wrong in my logic?

Ephraim
  • 1,878
  • You are missing a Jacobian in the transformation of (continuous) variables. Density of $U(0,1)$ is twice that of $U(0,2)$ - so while each particular $x$ has zero probability in both cases, these zeros are of different orders - one is twice as large as the other. – A.S. Dec 24 '15 at 00:35
  • 4
    http://math.stackexchange.com/questions/655972/help-rules-of-a-game-whose-details-i-dont-remember/656426#656426 describes a strategy that will win strictly more than half the time. – MJD Dec 24 '15 at 00:36
  • 6
    Your reasoning about probability using cardinality leads to a paradox. If you select a number uniformly from (0,1], then you are just as likely to select a number from (1/2, 1] as (1/4, 1/2] as (1/8,1/4] as (1/16, 1/8], ..., and so on. So what is the probability of each of these events? – panofsteel Dec 24 '15 at 00:42
  • 22
    Cardinality is a red herring. If the number I see is "0.999", I have amazing confidence I'm looking at the larger of the two numbers. – Eric Towers Dec 24 '15 at 01:26
  • @EricTowers Your confidence assumes an unskilled player. See my answer. – Eugene Ryabtsev Dec 24 '15 at 05:08
  • 6
    You cannot mix and match Discrete Probability Distribution with Continuous Probability Distribution. Let me ask you these two questions:
    1. What is the probability that a randomly chosen number between 0 and 1 is exactly 0.278?
    2. What is the probability that a randomly chosen number between 0 and 1 lies in the range 0.45 to 0.60?
    – Masked Man Dec 24 '15 at 06:07
  • 6
    Your title asks a different question from that you quote. Wording makes all the difference in statistics problems. This may be part of the reason you're confused. – Carl Witthoft Dec 24 '15 at 12:04
  • 2
    This question must be either edited or deleted. The title and the statement of the question in the body are totally unrelated. (Not to mention that the very extended discussion is possibly unrelated to either of those.) The following should be done: (1) eliminate the extended discussion, as it only adds confusion; (2) state the problem itself clearly; (3) change the title to express the problem (or just make it a general title like "Problem regarding random numbers"). – Fattie Dec 24 '15 at 14:58
  • 1
    @JoeBlow : Agree. Title claims numbers are chosen at random. Problem statement does not. Answers are all over the range of interpretations on this axis of ambiguity. Other defects are present as well. – Eric Towers Dec 24 '15 at 21:20
  • 1
    I also agree that the title is disjoint from the body of the question, and needs to be edited. – Daniel R. Collins Dec 24 '15 at 21:39
  • 1. Assuming a fixed font and paper size, there is only a finite number of real numbers that can be expressed on it. An extraordinarily large set, but finite; thus the assumption that the number is randomly chosen from the entire set of real numbers is false. 2. Since the set is uncountably infinite, there is no method by which to randomly pick a single member out of that set; therefore the assumption that the real number is randomly chosen from the entire set of real numbers is false. – WorBlux Dec 25 '15 at 15:51
  • @WorBlux What part of the question assumes a fixed font and paper size? – David Richerby Dec 25 '15 at 17:32
  • @WorBlux - you are perfectly correct – Fattie Dec 26 '15 at 18:43
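
MJD's comment above points to a strategy that wins strictly more than half the time. The linked answer is not reproduced here, so the following is only a sketch that assumes the strategy meant is the classic random-threshold one: draw a fresh threshold uniformly from (0, 1) each round, and declare the number you see "larger" exactly when it exceeds the threshold. The names `play_round` and `win_rate` are hypothetical, and the two hidden numbers are treated as fixed by an adversary rather than random.

```python
import random


def play_round(a, b, rng):
    """Play one round with hidden numbers a != b in (0, 1).

    Returns True if the player's declaration is correct.
    """
    # Pick one of the two sheets with equal probability.
    seen, other = (a, b) if rng.random() < 0.5 else (b, a)
    # Assumed strategy: a fresh uniform threshold on [0, 1) each round.
    threshold = rng.random()
    says_larger = seen > threshold
    return says_larger == (seen > other)


def win_rate(a, b, trials=200_000, seed=0):
    """Estimate the long-run fraction of correct declarations."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    wins = sum(play_round(a, b, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print(win_rate(0.3, 0.7))
    print(win_rate(0.5, 0.51))
```

Under this assumption the player is always right when the threshold lands between the two numbers and right half the time otherwise, so the win probability is $\frac{1}{2} + \frac{|a-b|}{2}$, which exceeds one half for any two distinct numbers, although the margin shrinks as they get closer.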