What is the probability distribution of solving a block, given the same difficulty?

So if I try to mine multiple times at the same difficulty, is the time to solve a block normally distributed with a mean of 10 minutes? What is the variance? Or is it some other distribution? Or maybe even deterministic chaos?

Murch
jaybny
  • It is a poisson distribution. Every attempt to form a block has the same probability of success and the number of attempts per second is roughly constant. – David Schwartz May 12 '14 at 23:47
  • @DavidSchwartz: No, it's a Poisson process. Which means that the number of blocks in an interval follows the Poisson distribution - but the time between blocks, which the OP asked about (I think), follows the exponential distribution. – Meni Rosenfeld May 13 '14 at 09:14
  • @MeniRosenfeld The question is sufficiently vague that it could be asking either about the expected number of blocks found or about the expected time between blocks found. They follow two different distributions. – David Schwartz May 13 '14 at 17:34

3 Answers

The time between consecutive blocks follows the exponential distribution, with mean (roughly) 10 minutes. This means that the variance is 100 minutes^2.
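
As a rough check of these figures, here is a minimal Python sketch that samples waiting times from an exponential distribution with a 10-minute mean, using the standard library's random.expovariate. The sample size, the seed and the helper names are arbitrary choices for illustration, not part of the original answer.

```python
import random

MEAN_MINUTES = 10.0  # idealised mean block interval (assumption)

def sample_intervals(n, seed=1):
    """Draw n block intervals (in minutes) from an exponential distribution."""
    rng = random.Random(seed)
    # expovariate takes the rate lambda = 1 / mean
    return [rng.expovariate(1.0 / MEAN_MINUTES) for _ in range(n)]

def mean_and_variance(values):
    """Population mean and variance of a list of numbers."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance

mean, variance = mean_and_variance(sample_intervals(100_000))
print(f"mean = {mean:.2f} min, variance = {variance:.1f} min^2")
```

With a sample this large, the printed mean should land close to 10 and the variance close to 100, up to sampling noise.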

Meni Rosenfeld
  • So the standard deviation is 10? So it's just as likely to take 1 minute as 19 minutes? – jaybny May 12 '14 at 18:43
  • http://bitcoin.stackexchange.com/questions/4690/what-is-the-standard-deviation-of-block-generation-times – jaybny May 12 '14 at 18:58
  • @jaybny: Yes, the standard deviation is 10 - however, the distribution is not symmetric, so 1 minute and 19 minutes have different probability densities. – Meni Rosenfeld May 13 '14 at 09:12
  • I'm now asking whether the probability that it takes exactly 1 minute is the same as the probability that it takes exactly 19 minutes, since they are both the same number of standard deviations from the mean. – jaybny May 13 '14 at 13:39
  • @jaybny: This is a continuous variable so the probability that it is exactly 1 minute is 0. That's not what you wanted. You meant to ask about the probability densities. And I already answered negatively. The distribution is not symmetric so being 1 sd from the mean does not mean it's the same density. You can see a graph of the density in the linked Wikipedia article. – Meni Rosenfeld May 13 '14 at 20:58
  • I am not sure why this should follow an exponential distribution. The hash function that Bitcoin uses is uniformly distributed, i.e. all hashes are equally likely to occur. So, assuming that the hash rate is constant between two blocks, it is equally likely that you get a block at any time in 0 - 20 min. Can you explain why you say it's an exponential distribution? – dark knight Jul 29 '15 at 17:30
  • I'm assuming it has to do with real-world factors, like mining rigs going online/offline, and the human operators of the mining strategy. – jaybny Jul 29 '15 at 18:54
  • If you factor in the increase of hashrate during a difficulty period, the average will be much less than 10 min. It's fair to assume the same hashrate; the next difficulty is also calculated assuming a uniform probability distribution: https://en.bitcoin.it/wiki/Difficulty – dark knight Jul 30 '15 at 04:04
  • Read up on Poisson processes. Maybe Meni has answers. – jaybny Jul 30 '15 at 06:23
  • @darkknight: I didn't see your comment when you first wrote it. In case it's still relevant: Unlike jaybny's suggestion, this has nothing to do with real-world factors. I'm assuming the most ideal model possible, including that the difficulty and hashrate are constant. The hash of each attempt is uniformly distributed but that's irrelevant - the only thing that matters is whether it's below the target or not. That's a Bernoulli trial with (continued...) – Meni Rosenfeld Mar 15 '16 at 14:19
  • (...continuing from previous comment) some probability. These trials are all independent. So the time to find a block can be arbitrarily long (because I can just be unlucky and fail again and again and again). But the probability decreases exponentially: The probability to find a block in the 4th minute is the probability to fail in the 1st and fail in the 2nd and fail in the 3rd and succeed in the 4th. Because of the extra conditions, this has less probability than just succeeding in the 1st minute. This results in the exponential distribution: https://en.wikipedia.org/wiki/Exponential_distribution. – Meni Rosenfeld Mar 15 '16 at 14:21
  • @MeniRosenfeld Thanks for explaining. So I get the part that it's an exponential distribution. A minor question I had: isn't it more probable that we find a block in the 4th minute? I am thinking we have already covered lots of hashes in the first 3 minutes (versus a coin toss, where it's always 0.5 each time). – dark knight Dec 14 '17 at 12:34
  • @darkknight: The number of hashes you can find is a tiny portion of the space of all hashes (something like 0.0000000000000000000000000000000000000000000000000000000001%), and an even tinier portion of the space of all possible inputs. There's not even a guarantee that there is any solution. Each hash attempt is its own independent coin toss. – Meni Rosenfeld Dec 14 '17 at 12:59

The expected time (mean) for a new block is of course 10 minutes, assuming constant hashrate, and no block propagation time.

The tricky part is that the probability of a block being found at one exact point in time is zero. You can only ask about an interval.

Let's illustrate this. First it is important to not fall for the Gambler's fallacy. Luck has no "memory". Thus if no block has been found (or if a block has just been found), what is the chance of a block being found in the next minute? The easy answer would be 1/10 = 10%. Or in the next second? 1/600 = 0.16667%. But this is not quite true.

If you ask how often a block is found before or after 10 minutes, you are asking for the cumulative distribution function (CDF) of the exponential distribution.

We can use Wolfram Alpha to plot this:

cdf exponential distribution λ=1/600

What fraction of blocks is found after the mean (600 seconds)?

exp(-600/600) = exp(-1) ~= 36.788%

What fraction of blocks is found before the mean?

1-exp(-1) ~= 63.212%

What fraction of blocks is separated from the previous block by more than 1, 2, 5, 10, 20, 30, 60 minutes?

exp( -1/10) ~= 90.484%
exp( -2/10) ~= 81.873%
exp( -5/10) ~= 60.653%
exp(-10/10) ~= 36.788%
exp(-20/10) ~= 13.534%
exp(-30/10) ~=  4.979%
exp(-60/10) ~=  0.248%
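
The table above can be reproduced with a few lines of Python. This is only a sketch under the same idealised assumption of a 10-minute mean; the function name survival is made up for this example. The last lines also check the "no memory" property mentioned earlier: having already waited s minutes does not change the odds for the next t minutes.

```python
import math

MEAN_MINUTES = 10.0  # assumed mean block interval

def survival(t_minutes, mean=MEAN_MINUTES):
    """P(interval > t) for an exponential distribution with the given mean."""
    return math.exp(-t_minutes / mean)

for t in (1, 2, 5, 10, 20, 30, 60):
    print(f"more than {t:2d} min: {survival(t):7.3%}")

# Memorylessness: P(T > s + t | T > s) equals P(T > t).
s, t = 7.0, 5.0
conditional = survival(s + t) / survival(s)
print(math.isclose(conditional, survival(t)))
```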

So what is the chance of finding a block in the next second/minute?

1-exp(-1/600)  ~= 0.16653%
1−exp(−60/600) ~= 9.516%

Bonus: What fraction of blocks is found between 5 minutes and 20 minutes?

exp(−5/10)−exp(−20/10) ~= 47.120%
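
For other intervals, the same CDF can be evaluated directly. The following is a small sketch, assuming a 600-second mean; cdf and prob_between are hypothetical helper names, not anything from Wolfram Alpha or a Bitcoin library.

```python
import math

def cdf(t_seconds, mean_seconds=600.0):
    """P(interval <= t) for the exponential distribution."""
    # -expm1(-x) equals 1 - exp(-x) but is more accurate for small x
    return -math.expm1(-t_seconds / mean_seconds)

def prob_between(a_seconds, b_seconds, mean_seconds=600.0):
    """P(a < interval <= b), the difference of two CDF values."""
    return cdf(b_seconds, mean_seconds) - cdf(a_seconds, mean_seconds)

print(f"next second      : {cdf(1):.5%}")
print(f"next minute      : {cdf(60):.3%}")
print(f"between 5 and 20 : {prob_between(300, 1200):.3%}")
```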
Felix Weis

For the impatient

What is the probability distribution of solving a block, given the same difficulty.

It is not clear if you mean the probability distribution of the time it takes to mine a block or you mean something else. In any case I am going to describe some random variables and their probability distributions, including time.

So if I try to mine multiple times using the same difficulty, is it normal distribution with mean of 10 minutes?

The time it takes to mine a single block is a random variable that follows an exponential distribution, not a normal one. The mean value depends on the hashrate of your miner and the current difficulty:

<t> = w1*d/h

where w1≈4.3e9 is a constant (derived below). For example, if you have a 100 PH/s mining facility at the current difficulty d=5.7e13, you can expect to mine a block about every 28 days on average (if the difficulty remains constant).
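
The arithmetic behind this example can be sketched in Python as follows. The constant uses the exact value w1 = 0x100010001 given further down in this answer; the function name is an assumption made for the illustration.

```python
W1 = 0x100010001  # expected hashes per block at difficulty 1 (about 2**32)

def expected_seconds_per_block(difficulty, hashrate):
    """Mean time to mine one block: <t> = w1 * d / h."""
    return W1 * difficulty / hashrate

# 100 PH/s = 100e15 hashes per second, difficulty 5.7e13 as in the example
seconds = expected_seconds_per_block(difficulty=5.7e13, hashrate=100e15)
print(f"{seconds / 86400:.1f} days")
```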

Mining a block in a single trial

The probability of mining a block with a single hash trial is p = T/2^{256}, where T is the target. The target is determined from the current difficulty d as

T = T1/d

where T1 is the target of the genesis block. The target is encoded in the block header in the field "bits". For instance, T1 is encoded as 0x1d00ffff, which means

T1 = 0xffff * 256^{0x1d - 3} 

which is approximately equal to 2.7e+67.
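
A minimal sketch of this decoding in Python, assuming the usual 4-byte compact form (one exponent byte followed by a three-byte mantissa) and ignoring edge cases such as the sign bit or exponents below 3. The function name bits_to_target is made up for this example.

```python
def bits_to_target(bits):
    """Decode the compact 'bits' field of a block header into a target integer."""
    exponent = bits >> 24          # first byte: length of the target in bytes
    mantissa = bits & 0x00FFFFFF   # remaining three bytes: leading digits
    return mantissa * 256 ** (exponent - 3)

T1 = bits_to_target(0x1D00FFFF)    # genesis block "bits"
print(f"T1 = {T1:.3e}")

# Probability that a single hash is below the target at difficulty 1
p_single = T1 / 2**256
print(f"p  = {p_single:.3e}")
```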

This situation can be described with a random variable that takes values either 0 (block not found) or 1 (block found), which formally follows a Bernoulli distribution.

Mining a block in a sequence of trials

We can define a random variable w that represents the number of block candidates (or trials) needed until a valid block is found. The probability distribution of w is formally known as a Geometric distribution.

P(w|N=1) = p*(1-p)^{w-1}

This is understood as a multiplication of probabilities of w-1 failures (probability 1-p) and one success (with probability p).

This is useful, for example, to compute the expected work needed to mine a block, i.e. the expected number of trials until one hits a valid block.

<w> = 1/p

Notice that, by the way we have defined the difficulty (above) we could write

<w> = w1 * d

where w1=2^{256}/T1=0x100010001 is the expected work needed to mine the genesis block.
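
The value of w1 can be checked with exact integer arithmetic. This is only a sketch; it recomputes T1 from the formula above and the helper success_probability is a hypothetical name. Strictly speaking 2^256/T1 is not an integer, but it differs from 0x100010001 by less than 1e-4.

```python
from fractions import Fraction

T1 = 0xFFFF * 256 ** (0x1D - 3)   # difficulty-1 target from the formula above
w1 = Fraction(2**256, T1)          # exact expected number of trials at difficulty 1

def success_probability(difficulty):
    """Probability that a single hash is valid, p = 1 / (w1 * d)."""
    return 1 / (w1 * Fraction(difficulty))

print(float(w1), 0x100010001)
print(float(1 / success_probability(1)))  # expected work <w> = 1/p at d = 1
```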

In order to put time into the equation we consider the fact that a miner is a computer that roughly produces w = t*h block candidates (trials) in a given period of time t, where h is its hashrate.

Therefore the expected time until a block is found is

<t> = <w>/h = w1*d/h

This time increases with the difficulty and it decreases with the hashrate of the miner.

We can also construct a probability density function (PDF) for the time variable in this case. Consider that P(w|N=1) = p(t|N=1) dt, where p(t|N=1) is the PDF for t and dt = 1/h is the smallest unit of time (the duration of a single hash). Using the approximation (1-p) ≈ exp(-p), which holds for p<<1, together with w = t*h, it turns out that

p(t|N=1) = P(w|N=1)*h = h/(w1*d) * exp(-h*t/(w1*d))

This means the waiting time is distributed according to an Exponential probability density function, just like radioactive decay.
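
To see how good the approximation (1-p) ≈ exp(-p) is in practice, one can compare the exact geometric survival probability (1-p)^w with exp(-p*w). The sketch below does this at difficulty 1; the function names and the chosen multiples of the mean are assumptions for illustration.

```python
import math

def geometric_survival(w, p):
    """Exact P(no valid block in the first w trials) = (1 - p)^w."""
    # log1p keeps precision when p is tiny
    return math.exp(w * math.log1p(-p))

def exponential_survival(w, p):
    """Continuous approximation exp(-p * w)."""
    return math.exp(-p * w)

p = 1 / 0x100010001  # success probability per hash at difficulty 1
for multiple in (0.1, 1, 5):
    w = int(multiple / p)  # number of trials, as a multiple of the mean 1/p
    print(multiple, geometric_survival(w, p), exponential_survival(w, p))
```

For a success probability this small, the two columns should agree to many significant digits, which is why the continuous exponential form is used.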

Mining N blocks after a sequence of trials

We can generalize the previous result to a situation in which we keep making trials until we find N valid blocks. The number of trials w in this case is a random variable distributed according to a Negative Binomial distribution:

P(w|N) = (w-1)!/(N-1)!/(w-N)! p^N (1-p)^{w-N}

It follows that the expected (mean) work required to mine N blocks is

<w> = N/p

We can turn this into a PDF for t, but first let's make the following approximations:

  • p<<1, hence (1-p) ≈ exp(-p),
  • N is not a very large number, i.e. N<<1/p and N<<w.

And just like in the previous case with N=1 we will need to normalize the PDF of t by dividing P(w|N) by dt = 1/h:

p(t|N) = P(w|N)*h = h/(w1*d) 1/(N-1)! (t*h/(w1*d))^{N-1} * exp(-t*h/(w1*d))

This corresponds to a Gamma probability density function; in the case N=1 we recover the exponential PDF.
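
A numerical sanity check of this density, as a sketch: the code integrates the PDF with a simple midpoint sum and also computes its mean. Here mean_interval stands for w1*d/h, and the values N = 6 and 10 minutes are illustrative assumptions (roughly "six blocks at the network rate"). The function names are made up for this example.

```python
import math

def gamma_pdf(t, n_blocks, mean_interval):
    """PDF of the waiting time until n_blocks blocks are found."""
    x = t / mean_interval
    return x ** (n_blocks - 1) * math.exp(-x) / (math.factorial(n_blocks - 1) * mean_interval)

def integrate(func, upper, steps=40_000):
    """Midpoint Riemann sum of func over [0, upper]."""
    step = upper / steps
    return sum(func((i + 0.5) * step) for i in range(steps)) * step

N, MEAN = 6, 10.0
total = integrate(lambda t: gamma_pdf(t, N, MEAN), upper=400.0)
mean_wait = integrate(lambda t: t * gamma_pdf(t, N, MEAN), upper=400.0)
print(total, mean_wait)
```

The first number should come out very close to 1 (the density is normalised) and the second close to N times the single-block mean.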

The mean waiting time until we hit N blocks is then

<t> = N*w1*d/h

Counting valid blocks in a sequence of trials

If one tests w different hash candidates, then the number N of valid blocks follows a Binomial distribution. In other words, the probability of finding N valid blocks is

P(N | w) = w!/N!/(w-N)! p^{N} (1-p)^{w-N}

From here it follows that after w trials the expected number of mined blocks is:

<N> = w*p = h*t/(w1*d)

In the limit when p<<1 and for N<<w the previous probability of finding N blocks after w trials approximates a Poisson distribution:

P(N|w) = (wp)^N/N! exp(-wp)

and if one introduces time into the equation (substitute w = t*h and p=1/(w1*d)) you can obtain

P(N|t) = (h*t/(w1*d))^N / N! exp(-h*t/(w1*d))

This is the probability of obtaining N valid blocks after a period of time t. The formula again resembles radioactive decay, where N would be the number of counts registered by an instrument such as a Geiger counter.
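
As an illustration of the Poisson formula, the sketch below tabulates the probability of seeing N blocks in one hour when the expected count h*t/(w1*d) is 6, i.e. one block per 600 seconds on average. That scenario and the function name poisson_pmf are assumptions chosen for the example.

```python
import math

def poisson_pmf(n, expected):
    """P(N = n) for a Poisson distribution with the given expected count."""
    return expected ** n * math.exp(-expected) / math.factorial(n)

# Assumed example: one block per 600 s on average, observed for 3600 s
expected_blocks = 3600 / 600
for n in range(13):
    print(n, f"{poisson_pmf(n, expected_blocks):.3%}")
```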

At least one block in a sequence of trials

If you want to know the probability of mining at least one block after w trials, denote it P(N>0|w). It equals 1 minus the probability of mining none after w trials:

P(N>0|w) = 1 - P(N=0|w) = 1 - (1-p)^{w}

It follows that the probability of mining at least a block in a time frame t = w/h is

P(N>0|t) = 1 - (1-p)^{h*t}

Here p is a very small number. We have seen above that one can express it in terms of the difficulty d as:

p = 1/(w1 * d)

where w1=0x100010001; for instance, for d=1, p is around 2.3e-10. This means the previous formula for the probability can be written, using the limit (1-p)^{n} → exp(-p*n) for small p, as

P(N>0|t) = 1 - exp(-p*h*t)

or in terms of the difficulty

P(N>0|t) = 1 - exp(-h*t/(w1*d))
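
A short Python sketch of this last formula, next to the version without the exponential approximation. The 100 PH/s miner, the difficulty 5.7e13 and the 30-day window are example values reused from earlier in this answer or assumed for illustration; the function names are hypothetical.

```python
import math

W1 = 0x100010001  # expected hashes per block at difficulty 1

def prob_at_least_one(hashrate, seconds, difficulty):
    """P(N > 0 | t) = 1 - exp(-h * t / (w1 * d))."""
    return -math.expm1(-hashrate * seconds / (W1 * difficulty))

def prob_at_least_one_exact(hashrate, seconds, difficulty):
    """Same quantity without the approximation: 1 - (1 - p)^(h * t)."""
    p = 1 / (W1 * difficulty)
    return -math.expm1(hashrate * seconds * math.log1p(-p))

approx = prob_at_least_one(100e15, 30 * 86400, 5.7e13)
exact = prob_at_least_one_exact(100e15, 30 * 86400, 5.7e13)
print(approx, exact)
```

Using expm1 and log1p avoids the loss of precision that 1 - exp(...) and (1 - p) would suffer when p is of the order of 1e-24.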
Lagrang3