Finding the probability of an event governed by two random processes

Question

There is a game about coin collection. There are 5 coins with values 2, 3, 4, 5, 6. The player is given at most 6 draws, and each draw reveals one of the coins with probability 0.13, 0.18, 0.4, 0.14, 0.15 respectively (the same value can be drawn more than once). Each time the player reveals a coin, there is a chance to win that coin with probability 1/COIN VALUE. For example, if the player reveals a coin of value 3, there is a 1/3 chance to win that coin. Each time the player collects a coin, the judge checks whether the total value collected is greater than or equal to 12. If yes, the player wins the game; otherwise, the player keeps drawing until all 6 chances are used.

I am looking for a way to estimate the probability of winning the game. I simplify the problem as follows:

find the average value of a coin to be revealed on each draw, which is $$V = 2*0.13+3*0.18+4*0.40+5*0.14+6*0.15 = 4$$
the probability of winning the average coin (value=4) will be $P=1/4$
the criterion expects a value of 12 to be collected, so on average $12/V=3$ successful draws are needed.
each draw is a Bernoulli trial, so the probability of winning the game is the same as the probability of getting 3 or more successes in 6 trials, or $$ P(6,3) + P(6,4) + P(6,5) + P(6,6) $$ where $P(n,m)$ means the probability of exactly $m$ successes out of $n$ trials. I computed it and found a 16.94% chance of winning the game.
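
For reference, the 16.94% figure in the last step can be reproduced with a short Python sketch. The helper name `binom_pmf` is illustrative and not part of the original post; the success probability 1/4 is the "average coin" assumption made above.

```python
from math import comb

def binom_pmf(n, m, p):
    """Probability of exactly m successes in n independent trials."""
    return comb(n, m) * p**m * (1 - p)**(n - m)

p_avg = 1 / 4  # win probability assigned to the "average" coin of value 4
tail = sum(binom_pmf(6, m, p_avg) for m in range(3, 7))
print(f"{tail:.4f}")  # prints 0.1694
```

This only checks the arithmetic of the simplified model; it says nothing about whether the simplification itself is valid.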

However, I checked the result with a piece of code and obtained a very different answer. Here is the pseudo-code:

X <- 0 // number of games won
for n -> 1 to 100000000 {
  T <- 0 // total coin value collected in this game
  for m -> 1 to 6 {
    C <- randomly pick a coin with the weight
    Q <- randomly pick the success probability
    P <- randomly pick from (0, 1) in uniform distribution
    if P<Q { // collect the coin
      T <- T + C
    }
    if T>=12 { // the game is won at a total of 12 or more
      X <- X+1
      stop this game
    }
  }
}
print X/C

Here X/C should be the winning percentage and should be close to the probability of winning the game. But the program gives an X/C of about 11% instead of 16.94%. I spent the whole day checking the code and the math, and I don't see where it goes wrong. Any clue to solve this problem is appreciated.
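
A runnable Python sketch of the simulation described by the pseudo-code is given here for readers who want to reproduce the roughly 11% figure. It rests on assumptions that the pseudo-code does not state explicitly: the success probability is taken to be 1/C for the coin just drawn, the win count is divided by the number of games played, and the number of games is reduced to 200,000 to keep the run short. The function names and the seed are illustrative.

```python
import random

VALUES = [2, 3, 4, 5, 6]
WEIGHTS = [0.13, 0.18, 0.40, 0.14, 0.15]

def play_game(rng, draws=6, target=12):
    """Play one game; return True if the collected total reaches the target."""
    total = 0
    # Draw all coin faces for this game up front (sampling with replacement).
    for coin in rng.choices(VALUES, weights=WEIGHTS, k=draws):
        # Assumption: the coin is collected with probability 1 / (coin value).
        if rng.random() < 1 / coin:
            total += coin
            if total >= target:
                return True
    return False

def estimate_win_probability(games=200_000, seed=0):
    """Monte Carlo estimate: fraction of simulated games that are won."""
    rng = random.Random(seed)
    wins = sum(play_game(rng) for _ in range(games))
    return wins / games

estimate = estimate_win_probability()
print(estimate)
```

With this many games the estimate should land near 0.11, with a sampling error on the order of 0.001.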

There is a probability distribution, and while the expected value of a coin drawn is 4, the weighted average based on probabilities, why assume the probability of winning that average four would be one fourth? That is given for the individual coin but not for a weighted average of the distribution of coins. So, what is the chance of drawing and winning a two? What is the chance of drawing and winning? (What is the chance of drawing and losing?) Then, based on these answers, what is the average win given a coin is won, and what percentage of draws are won? This is a check so far … — Gwendolyn Anderson, Sep 08 '21 at 06:45
Can you create some examples of six-coin selection patterns that sum to at least $12$? What about some examples of losing patterns? Can you calculate the probabilities of the example patterns? Do you know how to arrive at all possible patterns of six-coin selections? Or all winning/losing patterns? A probability distribution has a mean and a spread, and distributions with equivalent means will not give the same results if the spreads differ. In your code, it seems that $Q$ would not be random but conditional to $C$. Also, is $X/C$ a typo? Did you mean $X/T$? — Gwendolyn Anderson, Sep 08 '21 at 17:20

score 2 · Accepted Answer · answered Sep 08 '21 at 13:35

The following approach uses a generating function. Readers unfamiliar with generating functions may find many resources in the answers to this question: How can I learn about generating functions? I also used a computer algebra system to avoid a tedious computation, although this is not essential.

The probability of rolling a value of $n$ and being allowed to keep it is $$0.13/2, 0.18/3, 0.4/4, 0.14/5, 0.15/6$$ for $n=2,3,4,5,6$, so the probability of rolling the die and not being allowed to keep its value (so the effective score on a roll is zero) is $$1 - (.13/2+ .18/3+ .4/4+ .14/5+ .15/6) = 0.722$$ We define the probability generating function of the effective score on a roll as $$f(x) = \sum_{n=0}^6 p_n x^n$$ where $p_n$ is the probability of scoring $n$ on a roll. So $$f(x) = 0.722 +0.065 x^2+0.06 x^3+0.1 x^4+0.028 x^5+0.025 x^6$$ (We have written $0.065$ for $0.13/2$, etc. The coefficient of $x^1$ is zero because it is not possible to roll a one.)

Given $f(x)$, the probability generating function of the sum of $6$ rolls is $f(x)^6$. I used a computer algebra system (Mathematica) to expand $f(x)^6$, with result $$\begin{aligned} f(x)^6 ={} & 0.141652 + 0.0765157 x^2 + 0.0706299 x^3 + 0.134938 x^4 \\ & + 0.0647538 x^5 + 0.0991588 x^6 + 0.069474 x^7 + 0.082668 x^8 \\ & + 0.0574808 x^9 + 0.0552784 x^{10} + 0.0372368 x^{11} + 0.0334262 x^{12} \\ & + 0.0230628 x^{13} + \dots + 2.44141 \times 10^{-10} x^{36} \end{aligned}$$ so $0.141652$ is the probability of scoring a total of $0$, $0.0765157$ is the probability of scoring a total of $2$, etc. I have omitted the terms for $x^{14}$ through $x^{35}$ in the interest of saving space. If we sum the coefficients of $x^0$ through $x^{11}$, we find $$0.141652 + 0.0765157 + \dots + 0.0372368 = 0.889786$$ is the probability of winning a total score of $11$ or less. So the probability of winning a total of $12$ or more is $$1 - 0.889786 = \boxed{0.110214}$$
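
The expansion does not require a computer algebra system. As a sketch, the same coefficients can be obtained in plain Python by multiplying coefficient lists; the helper `poly_mul` is an illustrative name, not something from the original answer.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# Coefficients of f(x): index n holds p_n, the chance of scoring n on one roll.
f = [0.722, 0.0, 0.065, 0.06, 0.1, 0.028, 0.025]

total = [1.0]  # the constant polynomial 1
for _ in range(6):
    total = poly_mul(total, f)  # after the loop, total holds f(x)^6

p_win = sum(total[12:])  # coefficients of x^12 through x^36
print(round(total[0], 6), round(p_win, 6))
```

The first printed number is the coefficient of $x^0$ and the second is the tail sum from $x^{12}$ upward, which can be compared with the values quoted above.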

Gwendolyn Anderson · Answer 2 · 2021-09-11T20:01:52.653

While the answer can be achieved by computer algorithms, the stated problem is to estimate the chance of winning. So I will use a standard normal approximation to the probability of a sum of random values which are not normally distributed. This approach presumes the probability course or book does not cover generating functions and does not rely upon spreadsheets or computer coding to solve problems. There are six coin draws, sampling with replacement from five coin values, with the same discrete random distribution for each draw. Then there is a conditional probability of winning a coin given the value of the draw. I will define these random variables:

$X : $ the chance of drawing a particular coin value

$Y :$ the chance of winning the coin, given its value

$V :$ the value of a coin that has been drawn and won, or zero if lost

$F :$ the face value of a coin that has been drawn and won, given a win

$O :$ the number of zero values out of the six coin draws won/lost

$W :$ the chance of winning, i.e. a sum of at least 12 from six coin draws

$W$ is a Bernoulli random variable: it takes only two values as an indicator of a win or a loss, $W=1$ or $W=0$. $O$ follows a binomial distribution, and we will see how useful it is for identifying cases for $W$. First we need to know the chance of losing a coin draw compared to the chance of winning a coin of any value. This chance is described by the distribution of $V$, which is a conditional distribution, and we would use Bayes' theorem: given a coin has already been drawn and won/lost, what is its value (zero if lost)? There are five ways that $V$ can equal zero, depending on the value of the coin draw $X$.

$P(V=k) = P(X=k)P(Y=k|X=k)$

$P(V=0)=(0.13)(1-\frac12)+(0.18)(1-\frac13)+(0.40)(1-\frac14)+(0.14)(1-\frac15)+(0.15)(1-\frac16) = 0.722$.

$P(V \neq0)= 1 - P(V=0) = 1 - 0.722 = 0.278$.

Now we can calculate the distribution of $O$ with $p = 0.722$ and $q = 1-p = 0.278$. Notice that we gain some insight into the cases for $W$ given $O$, since we know we need at least two winning coins to achieve a value of at least $12$. Similarly, if all coins are winning the value is not important since the smallest coin value possible is $2$ and $2 \times 6$ already gives $12$; any coin values greater than $2$ will still allow $W=1$.

$P(O=k) = \binom{6}{k}\, 0.722^k\, 0.278^{6-k}$

$P(O=0) \approx 0.0462$% $\Rightarrow$ automatic win

$P(O=1) \approx 0.7193$%

$P(O=2) \approx 4.6703$%

$P(O=3) \approx 16.1725$%

$P(O=4) \approx 31.5014$%

$P(O=5) \approx 32.7252$% $\Rightarrow$ automatic lose

$P(O=6) \approx 14.1652$% $\Rightarrow$ automatic lose
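
The table of $P(O=k)$ can be regenerated with a few lines of Python. This is a minimal sketch using the binomial formula stated above; the variable names are illustrative.

```python
from math import comb

p_zero = 0.722  # chance that a single draw contributes zero (coin not won)

# dist_O[k] = P(O = k): exactly k of the six draws contribute zero.
dist_O = [comb(6, k) * p_zero**k * (1 - p_zero)**(6 - k) for k in range(7)]

for k, pr in enumerate(dist_O):
    print(f"P(O={k}) = {100 * pr:.4f}%")
```

Since the seven cases are exhaustive, the printed percentages should add up to 100%.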

The next easiest case in which to determine a win is $O=4$, since when only two coins are won they must both be sixes, and the chance is given by $[P(F=6)]^2$. The distribution for $F$ is conditional, based on the distribution of $V$. We haven't fully calculated the distribution of $V$, which is also conditional. To do so, we use Bayes' theorem: given a coin has already been drawn and won/lost, what is its value (zero if lost)?

$P(V = k) = P(X=k)P(Y=k|X=k)$

$P(V = 0) = 0.722$

$P(V = 2) = P(X=2)P(Y=2|X=2) = (0.13)(\frac12) = 0.065$

$P(V = 3) = (0.18)(\frac13) = 0.060$

$P(V = 4) = (0.40)(\frac14) = 0.100$

$P(V = 5) = (0.14)(\frac15) = 0.028$

$P(V = 6) = (0.15)(\frac16) = 0.025$

But we do not want to calculate $[P(V=6)]^2$ since we are looking for the conditional case that the coin has been won, that is, $[P(F=6)]^2$.

$P(F=6) = P(V=6 \mid V \neq 0) = \frac{0.025}{0.278} \approx 9.0$%.

$[P(F=6)]^2 \approx (9.0$%$)^2 \approx 0.8$%.

For the cases where $O=1$ or $O=2$, we can use the binomial distribution to calculate the chances of the few combinations of wins. When 5 coins of 6 are won, a win is ruled out only if they are all 2's or four 2's and a 3. When 4 coins of 6 are won, there are a few more losing combinations. But now we come to the formidable case where $O=3$, where the possible combinations for wins or losses are many. This is where we can use a standard normal approximation with $n=3$. Using the conditional distribution of $V$ given the coin is won (non-zero), we need to know the mean and variance of the distribution. This distribution is skewed, but the sum of three values, or similarly the average of three values, will be less skewed, and the average will have a smaller variance, by the central limit theorem. That is, the average of a sample will tend towards the average of the population distribution, and more so the larger the sample. The standard normal distribution will be used to select a z-value where the sum of three values is 11 or less for a loss, and 12 or more for a win. Since we are using a continuous distribution to estimate a discrete one, we will choose the dividing value between wins and losses as 11.5.

$[P(F=2)] = P(V=2|V\neq0) \approx 23.4$%

$[P(F=3)] = P(V=3|V\neq0) \approx 21.6$%

$[P(F=4)] = P(V=4|V\neq0) \approx 36.0$%

$[P(F=5)] = P(V=5|V\neq0) \approx 10.1$%

$[P(F=6)] = P(V=6|V\neq0) \approx 9.0$%

$E(F) \approx 3.597$

$E(F^2) \approx 14.388$

$VAR(F) \approx 14.388 - 3.597^2 \approx 1.449$
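
As a check on these moments, the conditional distribution of $F$ and its mean and variance can be computed directly from the draw weights. This is a sketch with illustrative variable names; it assumes only the weights and the 1/value win rule from the question.

```python
values = [2, 3, 4, 5, 6]
draw_p = [0.13, 0.18, 0.40, 0.14, 0.15]

# P(V = v): the coin of value v is drawn and then won with probability 1/v.
joint = [p / v for v, p in zip(values, draw_p)]
p_coin_won = sum(joint)  # P(V != 0)

# P(F = v) = P(V = v | V != 0)
cond = [j / p_coin_won for j in joint]

mean_F = sum(v * c for v, c in zip(values, cond))
second_moment_F = sum(v * v * c for v, c in zip(values, cond))
var_F = second_moment_F - mean_F**2

print(round(mean_F, 3), round(second_moment_F, 3), round(var_F, 3))
```

The three printed numbers correspond to $E(F)$, $E(F^2)$ and $VAR(F)$ as quoted above.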

The sum of three random variables $(F_1 + F_2 + F_3)$ can be approximated by a normal distribution with mean $3\mu$ and variance $3\sigma^2$. So

$F_1 + F_2 + F_3 \sim \mathscr{N}(3\mu,\, 3\sigma^2)$ (approximately)

The left-tail z-value for losses, sums 11.5 or below, is given by:

$Z_{0,1} \approx \frac{11.5 - 3\times(3.597)}{\sqrt{3\times1.449}} \approx \frac{0.7086}{2.085} \approx 0.34$

and the left-tail probability for $z = 0.34$ from the standard normal table is $0.63307$, representing the chance of a loss ($W=0$) when $O=3$. The chance of a win ($W=1$) when $O=3$ is $\approx 1 - 0.63307 \approx 0.36693$. The figures given in the table are far too precise for the roughness of this estimation. The estimate improves for larger sample sizes, as the distribution of the sum becomes less skewed.
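
The table lookup can be replaced by a direct evaluation of the standard normal CDF. The sketch below uses the identity $\Phi(z) = \tfrac12\left(1 + \operatorname{erf}(z/\sqrt2)\right)$ with the rounded mean and variance from above; `phi` is an illustrative helper name.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mean_F, var_F = 3.597, 1.449  # rounded moments of F taken from the text
n_coins = 3                   # three coins won, i.e. O = 3

# Continuity-corrected cutoff: sums of 11 or less lose, 12 or more win.
z = (11.5 - n_coins * mean_F) / sqrt(n_coins * var_F)
p_lose = phi(z)
p_win = 1.0 - p_lose

print(round(z, 2), round(p_lose, 3), round(p_win, 3))
```

Because $z$ is not rounded to two decimals before evaluating the CDF, the result may differ from the table value in the later digits.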

Putting the chances of wins together for all cases of $O$,

$P(W=1) \approx [(100\%)\times(0.05\%)] + [(99.6\%)\times(0.72\%)] + [(88.7\%)\times(4.67\%)] + [(36.7\%)\times(16.17\%)] + [(0.8\%)\times(31.50\%)] + [(0\%)\times(32.73\%)] + [(0\%)\times(14.17\%)] \approx 11.1\%$
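
The conditional win chances in this sum can also be obtained exactly by brute-force enumeration, which gives a way to see how rough the normal approximation for $O=3$ is. The following is a sketch, not part of the original derivation; `win_given_coins_won` is an illustrative name and the enumeration covers at most $5^6$ combinations.

```python
from itertools import product
from math import comb

values = [2, 3, 4, 5, 6]
draw_p = [0.13, 0.18, 0.40, 0.14, 0.15]
joint = [p / v for v, p in zip(values, draw_p)]  # P(V = v)
q = sum(joint)                                   # P(V != 0)
cond = [j / q for j in joint]                    # P(F = v)

def win_given_coins_won(k, target=12):
    """Exact P(sum of k won coins >= target), enumerating all value tuples."""
    total = 0.0
    for combo in product(range(len(values)), repeat=k):
        if sum(values[i] for i in combo) >= target:
            pr = 1.0
            for i in combo:
                pr *= cond[i]
            total += pr
    return total

p_win = 0.0
for zeros in range(7):
    k = 6 - zeros  # number of coins actually won
    p_zeros = comb(6, zeros) * (1 - q)**zeros * q**k  # P(O = zeros)
    w = win_given_coins_won(k)                        # P(W = 1 | O = zeros)
    print(f"O={zeros}: P(O)={100 * p_zeros:.2f}%  P(win|O)={100 * w:.1f}%")
    p_win += p_zeros * w

print(round(p_win, 6))
```

The final printed value is the exact winning probability under these assumptions, so it can be compared both with the estimate above and with the generating-function result in the accepted answer.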

We can see from this expression that the majority of wins occur when $O=3$. Since the other cases have small chances of wins, this estimate relies mainly on the standard normal approximation for a skew distribution when the sample size is $n=3$, and very little upon the cases with exact calculations. When $O \lt 3$, these outcomes are few since the chance of winning coins is small. When $O \gt 3$, these outcomes are common but there are few coins won which then need to be of higher value each which is less likely.

Let me know if you want me to finish calculating the cases for $O=1$ and $O=2$.
