11

Suppose you toss a (fair) coin 9 times and get heads every time. Wouldn't the probability of getting tails on the next toss rise above 50% due to regression towards the mean?

I know that that shouldn't happen, as the tosses are independent events. However, it seems to go against the idea of "things evening out".

Yatharth Agarwal
  • 901
  • 6
  • 21
Joebevo
  • 1,439
  • 10
    Suppose we had $9$ heads in a row. It is quite likely that the next $91$ tosses will be much more balanced, so the proportion of heads in the combined $100$ tosses is likely to be quite a bit closer to parity. But that's because the $91$ tosses are likely to be fairly evenly split, not because of a catchup effect. It is quite likely (about $50$ percent) that the heads will be leading by more than $9$. But the percentage lead will essentially certainly have shrunk. Keep thinking thus: coins have no memory. – André Nicolas Jul 01 '13 at 07:23
  • 2
    Regression to the mean is being misused. The only regression is that the coin is likely not to give such weird results in the next bunch of tosses. It will likely give more or less even percentage splits in the next bunch of tosses. – André Nicolas Jul 01 '13 at 07:27
  • 1
    "The only regression is that the coin is likely not to give such weird results in the next bunch of tosses."...Ah. Got it! – Joebevo Jul 01 '13 at 08:09
  • 2
    If I got heads on the first nine tosses, I'd begin to suspect it wasn't a fair coin, and I'd say the tenth toss is more likely to be heads again than to be tails. – Gerry Myerson Jul 01 '13 at 11:06
  • 1
    @GerryMyerson: Interesting. That's how Bayesian reasoning works, isn't it? – Joebevo Jul 01 '13 at 11:33
  • Yes. If you have a prior estimate of, say, the probability of getting heads, Bayes tells you how to update that estimate in the light of experience with the coin. – Gerry Myerson Jul 01 '13 at 12:24
  • 1
    Worth pointing out that 9 heads in a row is not highly unlikely: the probability is $\frac{1}{512}$. At the two-up tables in the Sydney casino this has probably happened today. – Dale M Jul 01 '13 at 12:46
  • @YatharthROCK: I've rolled back to the old version, as I felt your edit was too subjective, and not really helpful. In my opinion, you have not respected the original author. Please do let me know, if you feel that I am wrong. Regards – Nils Matthes Mar 01 '14 at 09:51
  • @NilsMatthes No prob, although if you could shorten the description yourself it'd be nice. Just a question from someone relatively new to SE: is it preferred for posts to be direct and to-the-point for the benefit of others, rather than left as originally written to reflect the OP? – Yatharth Agarwal Mar 01 '14 at 09:58
  • 1
    Dear @YatharthROCK: this is somewhat a fine line. The one extreme are users who write their questions like an exercise straight out of a textbook. The other extreme are users who are meandering about in their question. Both extremes are discouraged; you are expected to show some effort on the one hand, but you should make the question clear on the other. That being said, imho the OP was fine; the reflectiveness was indicative of effort, while the question was kept at reasonable length. – Nils Matthes Mar 01 '14 at 10:16
  • I'm flagging this question as a duplicate because the newer question (!) has attracted more attention and (consequently) better answers, even though this question is quite a bit older. (Bad luck--sorry!) – Kyle Strand Jul 18 '16 at 21:35
  • (In other words, to put extended numbers to André's thought: if you got heads 10 times in a row and then flipped 100 more times, the single most likely outcome is 60/110 heads in total, with 59/110 and 61/110 only very slightly less likely, and so on, tapering off to the very unlikely 10/110 and 110/110. 55/110 is exactly as likely as 65/110. So the best expectation going forward is that any past anomalous bump remains, but as a percentage it keeps shrinking [60/110 ≈ 54.5% heads, whereas 10/10 = 100%; if you went on, 510/1010 ≈ 50.5%, and so on, until the bump is eventually lost in the noise].) – JeopardyTempest Sep 08 '18 at 15:06
  • But also note that streakiness is something you should EXPECT to see in truly random data. The probability of tossing a coin 10 times and getting an exactly alternating sequence, HTHTHTHTHT or THTHTHTHTH, is a tiny $2 \cdot \frac{1}{2^{10}} = \frac{1}{2^9} = \frac{1}{512}$. You should expect a streak of roughly 7 in a row within a set of 100 flips (the exact calculation of the most likely longest streak is quite doable, it just eludes me at the moment) [this sort of lines up with the Birthday Paradox]. It's easy for people to interpret a streak as a pattern, but some streakiness is the expectation. – JeopardyTempest Sep 08 '18 at 15:18
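
To make the Bayesian point raised by Gerry Myerson in the comments above concrete, here is a minimal sketch. It assumes a Beta prior on the coin's heads probability, which is purely an illustrative choice and not something stated in the thread; the function name `posterior_predictive_heads` is likewise hypothetical. With a uniform Beta(1, 1) prior this reduces to Laplace's rule of succession, $(h + 1)/(h + t + 2)$.

```python
from fractions import Fraction


def posterior_predictive_heads(heads, tails, alpha=1, beta=1):
    """P(next toss is heads) under a Beta(alpha, beta) prior on the bias.

    The Beta prior is an assumption made for illustration only;
    alpha = beta = 1 is the uniform prior.
    """
    return Fraction(heads + alpha, heads + tails + alpha + beta)


# After 9 heads and 0 tails, starting from a uniform prior:
print(posterior_predictive_heads(9, 0))            # 10/11
# A strong prior belief in fairness, Beta(500, 500), barely moves:
print(posterior_predictive_heads(9, 0, 500, 500))  # 509/1009
```

How far the estimate moves depends entirely on the prior: someone who is nearly certain the coin is fair should stay close to 50/50 after nine heads, while someone with no prior opinion should now lean heavily towards heads.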

3 Answers

4

[TL;DR:] The key distinction, I think, is between the theoretical probability of the next event and the cumulative empirical proportion.

The Gambler's Fallacy, that is, assuming the probability on the 10th toss is anything other than exactly 50/50, is wrong (assuming the coin is fair).

However, since each toss is 50/50, the single most likely outcome over the next 90 throws is 45 heads and 45 tails. So if the proportion of heads was 90% in the first 10 tosses, you would expect it to be about 54% over all 100 tosses (including the first ten): regression (moving closer) to the mean (here, 50%), without any change in the per-toss probability.
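
The arithmetic above can be checked exactly. The following is only a sketch: the helper name `heads_pmf` and the variable names are made up for illustration, and the coin is assumed fair with independent tosses. Note that the first ten tosses never enter the distribution of the next 90, which is exactly what independence means.

```python
from fractions import Fraction
from math import comb


def heads_pmf(n):
    """Exact distribution of the number of heads in n fair, independent tosses."""
    return [Fraction(comb(n, k), 2**n) for k in range(n + 1)]


first_heads, first_tosses = 9, 10   # the 90% start used in this answer
more_tosses = 90

pmf = heads_pmf(more_tosses)

# Most likely number of heads in the next 90 tosses.
mode = max(range(more_tosses + 1), key=lambda k: pmf[k])

# Expected overall proportion of heads after all 100 tosses.
expected_heads = first_heads + sum(k * p for k, p in enumerate(pmf))
expected_proportion = expected_heads / (first_tosses + more_tosses)

print(mode)                 # 45
print(expected_proportion)  # 27/50, i.e. 54%
```

The early surplus of heads is not paid back; it is only diluted by the later tosses.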

[Update:] I hadn't noticed it before, but @AndréNicolas got to it before me in the comments.

Yatharth Agarwal
  • 901
  • 6
  • 21
2

This is interesting because it shows how tricky the mind can be. I arrived at this web site after reading Kahneman's book "Thinking, Fast and Slow". I do not see a contradiction between the gambler's fallacy and regression towards the mean. According to the regression principle, the best prediction of the next measurement of a random variable is the mean. This is precisely what is assumed when each toss is treated as an independent event; that is, the mean (0.5 probability) is the best prediction. This applies to the next event's theoretical probability, and there is no need for a next bunch of tosses.

The reason we are inclined to think that, after a "long" run of repeated outcomes, the best prediction is something other than the mean value of the random variable has to do with heuristics. According to Abelson's first law of statistics, "Chance is lumpy". I quote Abelson: "People generally fail to appreciate that occasional long runs of one or the other outcome are a natural feature of random sequences."

Some studies have shown that people are bad generators of random numbers. When asked to write down a series of chance outcomes, subjects tend to avoid long runs of either outcome and write sequences that alternate quickly between outcomes. This is because we expect random outcomes to be "representative" of the process that generates them (a problem related to heuristics). Therefore, assuming that the best prediction for the tenth toss in your example should be something other than 0.5 is a consequence of what we unconsciously ("fast thinking") want to see represented in our sample. Foolish gamblers are bad samplers.

Alfredo Hernandez
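
Abelson's remark that long runs are a natural feature of random sequences can be checked exactly. The sketch below is not from the answer: it assumes a fair coin with independent tosses, and the function names are hypothetical. It counts, by dynamic programming over the length of the current run, how many of the $2^n$ sequences have no run longer than $m$.

```python
from fractions import Fraction


def prob_longest_run_at_most(n, m):
    """P(longest run of identical outcomes <= m) in n >= 1 fair tosses."""
    if m < 1:
        return Fraction(0)
    ways = [0] * (m + 1)   # ways[r]: valid sequences whose current run has length r
    ways[1] = 2            # a single toss: H or T
    for _ in range(n - 1):
        new = [0] * (m + 1)
        new[1] = sum(ways)            # next toss differs: the run restarts at 1
        for r in range(2, m + 1):
            new[r] = ways[r - 1]      # next toss matches: the run grows by 1
        ways = new
    return Fraction(sum(ways), 2**n)


def expected_longest_run(n):
    """E[longest run], using E[L] = sum over m >= 0 of P(L > m)."""
    return sum(1 - prob_longest_run_at_most(n, m) for m in range(n))


# Chance of seeing at least one run of 5 or more in 100 tosses,
# and the expected length of the longest run.
print(float(1 - prob_longest_run_at_most(100, 4)))
print(float(expected_longest_run(100)))
```

Running it for 100 tosses shows how common runs of five or more are, which is the kind of streak people tend to leave out when asked to invent a "random" sequence by hand.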

1

The Gambler's Fallacy is the incorrect belief that after a run of random events of one kind, the next event is more likely to be of the opposite or a different kind. For a fair coin, the odds on the next toss are the same as on every previous toss: an equal chance of heads or tails. That is not inconsistent with our sense that things tend to even out over time, as long as we appreciate that "over time" refers to a long sequence of events, not to a single one.

The result of a single coin toss cannot be predicted; it can only be described probabilistically, as 50/50 per toss. Over an indefinitely long sequence of tosses, the proportion of heads tends toward the mean of 50%.
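
One way to make "tending toward a mean" precise, offered as a standard textbook sketch rather than something stated in this answer: let $H_n$ be the number of heads in $n$ independent fair tosses (the symbol $H_n$ is introduced here only for illustration). Then

$$E\!\left[\frac{H_n}{n}\right] = \frac{1}{2}, \qquad \operatorname{Var}\!\left(\frac{H_n}{n}\right) = \frac{1}{4n} \to 0,$$

$$\operatorname{Var}\bigl(H_n - (n - H_n)\bigr) = \operatorname{Var}(2H_n - n) = 4 \cdot \frac{n}{4} = n.$$

So the proportion of heads concentrates around $\frac{1}{2}$ as $n$ grows, while the typical absolute gap between heads and tails grows like $\sqrt{n}$. The evening out happens in the ratio, not in the counts.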