
Consider a table (database) with 100 rows (slots). Each time new data are written to the table, the row is selected at random. What is the probability that a collision occurs on the 10th write? A collision is an instance in which data are written to a row that already contains previously written data.

yw k
  • This reminds me of one of my earlier questions: http://math.stackexchange.com/questions/1428431/probability-of-picking-a-ball-that-has-already-previously-been-picked – Bobson Dugnutt Oct 25 '16 at 16:31
  • What is the range of the data? – callculus42 Oct 25 '16 at 17:04
  • @callculus I would guess that the OP means you have a 100-dimensional vector with zeroes in all entries before any data have been written. When an entry is chosen, write a one in it. Now, what is the probability that the 10th chosen entry already has a one? This is how I interpret it, at least. – Bobson Dugnutt Oct 25 '16 at 17:37
  • @Lovsovs That is one of many possibilities, but nothing in the question indicates it. The OP should clarify. I have voted to close. – callculus42 Oct 25 '16 at 17:57
  • @callculus I don't know, but it was pretty clear to me. – Bobson Dugnutt Oct 25 '16 at 19:25
  • @Lovsovs Why? Nothing in the question hints at it. – callculus42 Oct 25 '16 at 19:29
  • @callculus Both William and I seemed to get it. I can't tell you why you didn't. No offense. Let's not take this discussion further. – Bobson Dugnutt Oct 25 '16 at 19:55

1 Answer


I assume this question is being asked with reference to a hash function. Such problems can also be treated as instances of the birthday problem, according to here. This website also discusses the problem of calculating hash collision probabilities: it uses a Poisson approximation to estimate the value. An exact expression can always be written down, although it may be difficult to evaluate. I will add more links on solving the birthday problem, and hopefully some will help you (1)(2)(3).

Feller Volume 1 has a discussion of the birthday problem, with examples and tables demonstrating the accuracy of the Poisson approximation to the actual value.
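As a rough illustration of what such an approximation looks like (this is one standard textbook form, not necessarily the exact one Feller tabulates; the symbols $N$, $n$ and $\lambda$ are introduced here for convenience): with $N$ slots and $n$ writes, the number of colliding pairs is approximately Poisson with mean $\lambda = \binom{n}{2}/N$, so

$$P(\text{at least one collision}) \approx 1 - e^{-\lambda} = 1 - \exp\left(-\frac{n(n-1)}{2N}\right).$$

For $N = 100$ and $n = 10$ this gives $\lambda = 45/100 = 0.45$ and a probability of about $1 - e^{-0.45} \approx 0.362$, reasonably close to the exact birthday-problem value of about $0.372$.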

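For 100 slots and 10 writes, the birthday-problem probability that at least one collision occurs among the 10 writes is

$$1 - \left(\frac{1}{100}\right)^{10} \left(100 \cdot 99 \cdot 98 \cdot 97 \cdot 96 \cdot 95 \cdot 94 \cdot 93 \cdot 92 \cdot 91\right)$$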
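The expression above is easy to evaluate numerically. The following is only a minimal sketch: the function name `collision_probability` and its parameters are made up for illustration, and it assumes every write picks one of the slots uniformly and independently.

```python
from math import prod


def collision_probability(slots: int, writes: int) -> float:
    """Probability that at least two of `writes` uniform random writes
    land in the same slot (birthday-problem formula).

    Hypothetical helper, not taken from any library.
    """
    if writes > slots:
        # Pigeonhole principle: a collision is certain.
        return 1.0
    # P(no collision) = (slots/slots) * ((slots-1)/slots) * ... for `writes` factors
    no_collision = prod((slots - k) / slots for k in range(writes))
    return 1.0 - no_collision


print(round(collision_probability(100, 10), 4))  # 0.3718
```

So, under these assumptions, the chance of at least one collision within the first 10 writes is roughly 37%.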

EDIT: Balls-and-bins problems usually assume there are more balls than bins, whereas the birthday problem usually assumes fewer balls than bins (i.e., fewer people than days in the year), so the birthday problem is probably the better one to look at here.

I believe this can also be rephrased as a balls and bins problem, of this type:

Glorfindel
Chill2Macht