I am having trouble with some Bayesian probability problems and would appreciate some help. I think I solved the first two but am confused by the last. Could someone please check my work on the first two and help me with the last question?

Here is the problem setup:

Let’s say that we work for the International Olympic Committee (IOC) as part of their Fight Against Doping (https://www.olympic.org/fight-against-doping). We have a drug test for a banned performance-enhancing drug (PED) that is $99.3\%$ accurate at identifying an athlete that has the PED in their system. However, it is only $73\%$ accurate at identifying the absence of PED in the athlete’s system. From a scientific study we also have a strong reason to believe that only $3\%$ of Olympic athletes use this particular PED.

Here is the first question.

$1$. An athlete tests positive for the PED. Given that the test had a positive result, what is the probability the tested individual uses the PED?

My answer:

$P$ = Positive

$N$ = Negative

$$p(P|PED) = 0.993$$

$$p(N|\text{No PED}) = 0.73$$

$$p(PED) = 0.03$$

$$p(P) = p(P|PED) \times p(PED) + p(P|\text{No PED}) \times p(\text{No PED}) = 0.993 \times 0.03 + (1 - 0.73) \times 0.97 = 0.292$$

$$p(PED|P) = 0.993 \times 0.03 / 0.292 = 0.102 \quad \text{(answer)}$$

Here is the second question.

$2$. As an employee of the IOC, we don’t want to needlessly ban an athlete from the chance to compete in the Olympics. As a result, we decide to institute a protocol that if an athlete tests positive for the use of the PED we will administer a second test. The second test is less accurate at identifying an athlete that has the PED in their system, at only $81\% $, but is more accurate at identifying the absence of PED in the athlete’s system, with a probability of $90\% $. If the athlete tests positive for the PED in both the first and second test, what is the probability that the accused individual uses the banned PED? (You may assume the outcome of the second drug test is conditionally independent of the outcome of the first drug test).

My answer:

$P_2$ = Positive 2nd Test

$N_2$ = Negative 2nd Test

$$p(P_2|PED) = 0.81$$

$$p(N_2|\text{No PED}) = 0.90$$

$$p(PED|P_2) = p(P_2|PED)*p(PED)/p(P_2)$$

$$p(P_2) = p(P_2|PED) \times p(PED) + p(P_2|\text{No PED}) \times p(\text{No PED}) = 0.81 \times 0.03 + (1-0.90) \times (1-0.03) = 0.1213$$

$$p(PED|P_2) = p(P_2|PED) \times p(PED)/p(P_2) = 0.81 \times 0.03/0.1213 \approx 0.200$$

Thus the probability of using PED given that you tested positive on both tests is:

$$p(\text{Positive for both}) = p(PED|P) \times p(PED|P_2) = 0.102 \times 0.200 = 0.0204 \approx 0.02 \quad \text{(answer)}$$

Here is question $3$. This is the one I am not sure about.

$3$. Our information that only $3\%$ of Olympic athletes use the PED came from a study of $300$ athletes. This year we tested $500$ athletes and confirmed that $11$ of them used the banned substance. In both cases, only a sample of all athletes to compete in the Olympics were tested for the PED. As a result there is some uncertainty, so we decide we would like to express what we have learned as a probability distribution. In two years when we begin testing athletes again, what is the fully specified distribution that we will use for the percentage of Olympic athletes that use the PED?

I don't understand what the question means by a "fully specified distribution". Would it be some form of Bayesian posterior distribution with $0.03$ as the prior? If so, I'm not sure what the likelihood or the marginal would be.

I know this is the Bayes rule formula:

$$p(\theta|x) = p(x|\theta)*p(\theta)/p(x)$$

But how would I change this to suit the question?

Apologies for the long post, please let me know if you need any more information or clarification.

Thank you for reading

user288972
mrsquid

1 Answer

Let’s say that we work for the International Olympic Committee (IOC) as part of their Fight Against Doping (https://www.olympic.org/fight-against-doping). We have a drug test for a banned performance-enhancing drug (PED) that is $99.3\%$ accurate at identifying an athlete that has the PED in their system. However, it is only $73\%$ accurate at identifying the absence of PED in the athlete’s system. From a scientific study we also have a strong reason to believe that only $3\%$ of Olympic athletes use this particular PED.

Let's save on typing and use $T_1$ for the event of testing positive on the first test, and $U$ for the event of using the PED.

We are provided that: $\mathsf P(T_1\mid U)=0.993, \mathsf P(T_1^\complement\mid U^\complement)=0.730, \mathsf P(U)=0.03$

  1. An athlete tests positive for the PED. Given that the test had a positive result, what is the probability the tested individual uses the PED?

So, using the Bayes' Rule formula (and Law of Total Probability):

We seek

$$\begin{aligned}
\mathsf P(U\mid T_1) &= \dfrac{\mathsf P(T_1\mid U)\cdot\mathsf P(U)}{\mathsf P(T_1)}\\
&= \dfrac{\mathsf P(T_1\mid U)\cdot\mathsf P(U)}{\mathsf P(T_1\mid U)\cdot\mathsf P(U)+\mathsf P(T_1\mid U^\complement)\cdot\mathsf P(U^\complement)}\\
&= \dfrac{0.993\cdot0.03}{0.993\cdot0.03+(1-0.73)\cdot(1-0.03)}\\
&\approx 0.102
\end{aligned}$$

This agrees with your result of $0.102$; you had simply computed it in two separate steps.
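As a quick numerical check of the calculation above, here is a minimal Python sketch. The function name `posterior_given_positive` and its parameter names are illustrative choices, not part of the problem statement; it assumes the $99.3\%$ figure is the test's true-positive rate and the $73\%$ figure its true-negative rate.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """Return P(user | positive test) via Bayes' rule.

    prior       -- P(U), the assumed proportion of users
    sensitivity -- P(T | U), chance a user tests positive
    specificity -- P(not T | not U), chance a non-user tests negative
    """
    true_positive = sensitivity * prior
    # A non-user tests positive with probability 1 - specificity.
    false_positive = (1.0 - specificity) * (1.0 - prior)
    # Law of total probability gives P(T) in the denominator.
    return true_positive / (true_positive + false_positive)


p_u_given_t1 = posterior_given_positive(prior=0.03, sensitivity=0.993, specificity=0.73)
print(round(p_u_given_t1, 3))  # 0.102
```

The low value reflects the small prior: most positives come from the large pool of non-users hitting the $27\%$ false-positive rate.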


  2. As an employee of the IOC, we don’t want to needlessly ban an athlete from the chance to compete in the Olympics. As a result, we decide to institute a protocol that if an athlete tests positive for the use of the PED we will administer a second test. The second test is less accurate at identifying an athlete that has the PED in their system, at only $81\%$, but is more accurate at identifying the absence of PED in the athlete’s system, with a probability of $90\%$. If the athlete tests positive for the PED in both the first and second test, what is the probability that the accused individual uses the banned PED? (You may assume the outcome of the second drug test is conditionally independent of the outcome of the first drug test).

Use $T_2$ for the event of a positive result on the second test. (We might note that, under the protocol, the second test only occurs if the first test is positive, but that does not affect the answer to this question.)

So we have $\mathsf P(T_2\mid U)=0.81, \mathsf P(T_2^\complement\mid U^\complement)=0.90$

And the conditional independence tells us:

$$\begin{aligned}
\mathsf P(T_1,T_2\mid U) &= \mathsf P(T_1\mid U)\,\mathsf P(T_2\mid U)\\
\mathsf P(T_1,T_2\mid U^\complement) &= \mathsf P(T_1\mid U^\complement)\,\mathsf P(T_2\mid U^\complement)\\
&\text{etc.}
\end{aligned}$$
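For concreteness, the two joint likelihoods can be evaluated from the accuracies in the problem statement. This is a worked sketch that assumes the $73\%$ and $90\%$ figures are the true-negative rates of the first and second test, so that $\mathsf P(T_1\mid U^\complement)=1-0.73$ and $\mathsf P(T_2\mid U^\complement)=1-0.90$.

$$\begin{aligned}
\mathsf P(T_1,T_2\mid U) &= 0.993\cdot 0.81 \approx 0.804\\
\mathsf P(T_1,T_2\mid U^\complement) &= (1-0.73)\cdot(1-0.90) = 0.027
\end{aligned}$$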

We seek:

$$\begin{aligned}
\mathsf P(U\mid T_1, T_2) &= \dfrac{\mathsf P(T_1, T_2\mid U)\cdot\mathsf P(U)}{\mathsf P(T_1,T_2)}\\
&= \dfrac{\mathsf P(T_1\mid U)\cdot\mathsf P(T_2\mid U)\cdot\mathsf P(U)}{\mathsf P(T_1\mid U)\cdot\mathsf P(T_2\mid U)\cdot\mathsf P(U)+\mathsf P(T_1\mid U^\complement)\cdot\mathsf P(T_2\mid U^\complement)\cdot\mathsf P(U^\complement)}\\
&~~\vdots
\end{aligned}$$
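If you want to check your final number, the expression above can be evaluated with a short Python sketch. The function names are illustrative, not from the problem, and the code assumes the stated accuracies are the true-positive and true-negative rates of each test. It also shows that feeding the posterior from the first test back in as the prior for the second test gives the same value, which is what conditional independence implies.

```python
def posterior_two_positives(prior, sens1, spec1, sens2, spec2):
    """Return P(U | T1, T2), assuming T1 and T2 are conditionally
    independent given U and given not-U."""
    user_both_positive = sens1 * sens2 * prior
    non_user_both_positive = (1.0 - spec1) * (1.0 - spec2) * (1.0 - prior)
    return user_both_positive / (user_both_positive + non_user_both_positive)


def posterior_one_positive(prior, sens, spec):
    """Return P(U | T) for a single positive test."""
    return sens * prior / (sens * prior + (1.0 - spec) * (1.0 - prior))


# Direct evaluation of the two-test formula.
direct = posterior_two_positives(0.03, 0.993, 0.73, 0.81, 0.90)

# Sequential updating: the posterior after test 1 becomes the prior for test 2.
after_first = posterior_one_positive(0.03, 0.993, 0.73)
sequential = posterior_one_positive(after_first, 0.81, 0.90)

print(round(direct, 4), round(sequential, 4))  # both approximately 0.4795
```

Note that this is very different from multiplying the two single-test posteriors together, which is not a valid way to combine the evidence.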

Graham Kemp
  • That makes sense, I knew something was up when the probability was only 0.02. Thank you for the clear answer! – mrsquid Feb 21 '18 at 17:17
  • Can you maybe also help me with this question: https://math.stackexchange.com/questions/2740647/bayes-theorem-disease-probability – John Smith Apr 17 '18 at 06:31