The problem is to find the distribution of $X_1\mid M$, where $M = \max(X_1,\dots,X_n)$ is the maximum of the i.i.d. random variables $X_1,\dots,X_n \sim U(0,\theta)$. I have a complete solution but am having trouble justifying one step. We use Bayes' theorem for CDFs to get started:

$$ P(X_1 < x_1 \mid M < m) = \frac{P(M < m \mid X_1 < x_1) P(X_1 < x_1)}{P(M < m)} $$

The CDFs of $M$ and $X_1$ are $(m/\theta)^n$ (by independence) and $x_1/\theta$, respectively. The CDF of $M\mid X_1$ is $(m/\theta)^{n-1} {\bf 1} [x_1 \leq m]$. The justification I have is that if the observed value $x_1$ is greater than $m$, then $m$ cannot be the maximum. So I put the indicator on the CDF in order to justify that $M\mid X_1$ is just the distribution of the maximum excluding $X_1$. So, for $x_1 \leq m$,

$$ \frac{P(M < m \mid X_1 < x_1) P(X_1 < x_1)}{P(M < m)} = \frac{(x_1/\theta) (m/\theta)^{n-1}}{(m/\theta)^n} = \frac{x_1}{m} $$

It follows that $X_1\mid M \sim U(0,m)$.

Is my justification for the distribution of $M\mid X_1$ correct? I believe my final answer is intuitive.

  • "It follows that $X_1|M\sim U(0,m)$" What is $m$ in this statement? – drhab Jan 05 '18 at 18:51
  • $m$ was the observed value of $M$. Perhaps I should write this as $U(0,M)$ or $X_1 | M = m \sim U(0,m)$. – misogrumpy Jan 05 '18 at 18:55
  • Then I think it is wrong. There must be a positive probability for $X_1$ to take the value $m$ under the condition: $P(X_1=m\mid M=m)>0$. This is because there is a positive probability that $X_1$ will be the maximum. – drhab Jan 05 '18 at 18:59
  • Are you sure the problem is to compute $P(X_1 \leq a \mid \max X_i \leq b)$? It seems like it should be $P(X_1 \leq a \mid \max X_i=b)$. – Ian Jan 05 '18 at 19:04
  • That is fair. Since we know that one of $X_1, \dots, X_n$ takes on the value $m$, it should be that there is a positive probability that $X_1 = m$ given that $M =m$. Perhaps it is similar to a multinomial distribution. Nonetheless, I still don't know how to handle $P(M < m | X_1 < x_1)$. – misogrumpy Jan 05 '18 at 19:05
  • @Ian that might be the case. The problem was written without specifying. I chose to interpret it as $M < m$ since we were dealing with a continuous distribution, namely $X$ is uniform.

    Does $P(M=m)$ make sense? Or is $P(M=m) = 0$? I thought that $M$ would have a continuous distribution.

    – misogrumpy Jan 05 '18 at 19:13
  • $M$ does have a continuous distribution, but you can still condition on null sets. (Of course if you do then Bayes' rule doesn't do you any good.) – Ian Jan 05 '18 at 19:19
  • I'm still not seeing how this can help. Perhaps I could condition on the event $M \in (m, m + dm)$. Then given that $x_1 < m$, we would have to deal with the probability that one of $X_2, X_3, \dots, X_n$ is in $(m, m + dm)$? Finally, we could take the limit as $dm$ goes to 0. That might be doable. – misogrumpy Jan 05 '18 at 19:37
  • Isn't the conditional CDF of $X\mid Y=y$ defined as $$F_{X|Y}(x|y)=\int^x_{-\infty} \frac{f_{X,Y}(u,y)}{f_Y(y)}\,du\,?$$ Can you find the joint PDF of $X_1$ and $M$ and the PDF of $M$? – Shashi Jan 05 '18 at 19:40
  • The accepted answer is not quite correct. The conditional distribution is mixed (partially discrete and partially absolutely continuous). Related: https://math.stackexchange.com/a/1007778/321264. – StubbornAtom Feb 25 '20 at 18:07
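
The limiting idea raised in the comments (condition on $M \in (m, m+dm)$, then let $dm \to 0$) can be sketched as follows. This is a heuristic calculation added for illustration, not something worked out in the thread; it assumes $n \geq 2$ and $0 \leq x < m < \theta$, so that $X_1 \leq x$ forces the maximum to come from $X_2,\dots,X_n$:

$$
\begin{aligned}
P(X_1 \leq x \mid m < M < m+dm)
&= \frac{P(X_1 \leq x)\, P\bigl(m < \max(X_2,\dots,X_n) < m+dm\bigr)}{P(m < M < m+dm)} \\
&= \frac{\frac{x}{\theta}\left[\left(\frac{m+dm}{\theta}\right)^{n-1} - \left(\frac{m}{\theta}\right)^{n-1}\right]}{\left(\frac{m+dm}{\theta}\right)^{n} - \left(\frac{m}{\theta}\right)^{n}}
\;\xrightarrow[dm \to 0]{}\; \frac{x\,(n-1)\,m^{n-2}}{n\,m^{n-1}} = \frac{x}{m}\cdot\frac{n-1}{n}.
\end{aligned}
$$

The limit follows by dividing numerator and denominator by $dm$ and recognizing the two difference quotients as derivatives of $(m/\theta)^{n-1}$ and $(m/\theta)^{n}$.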

2 Answers

Suppose you're asking for $P(X_1 \leq a \mid \max X_i=b)$. Consider $n=2$. Then the relevant set is the intersection of the half-plane $X_1 \leq a$ with the pair of line segments from $(0,b)$ to $(b,b)$ and from $(b,0)$ to $(b,b)$. Assuming of course that $0<a<b$, you're looking at the ratio of the length of the segment from $(0,b)$ to $(a,b)$ to the length of the whole pair of segments. So that's $\frac{a}{2b}$. The results for $a \leq 0$ and $a \geq b$ are clearly $0$ and $1$ respectively.

For general $n$, what happens? Well, for $0<a<b$ again, you're missing one "face" which is the face where $X_1$ is actually the biggest one; this face has probability $1/n$. Other than that, the fraction of the "area" on each face where $X_1 \leq a$ is $a/b$. So you're looking at a general result of $\frac{a}{b} \left ( 1 - \frac{1}{n} \right )$. Note that this can be interpreted as $P(X_1 \leq a \mid X_1 \leq b) P(X_1<\max X_i)$.

Of course this approach is geometric, but you can use the general formulae for conditional CDFs/PDFs to do it analytically.
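
The geometric result can be checked numerically. Below is a minimal Monte Carlo sketch, not part of the original argument: it approximates the null event $\max X_i = b$ by the narrow window $b \leq \max X_i < b + \varepsilon$, and the function name `simulate` and the values $n=3$, $\theta=1$, $a=0.3$, $b=0.8$, $\varepsilon=0.01$ are arbitrary choices made for illustration.

```python
import random


def simulate(n=3, theta=1.0, a=0.3, b=0.8, eps=0.01, trials=300_000, seed=0):
    """Estimate P(X_1 <= a | M ~ b) and P(X_1 = M | M ~ b) by simulation.

    "M ~ b" means b <= M < b + eps, a stand-in (an assumption of this
    sketch) for conditioning on the null event M = b.
    """
    rng = random.Random(seed)
    hits = 0      # samples whose maximum lands in the window
    below = 0     # ... of which X_1 <= a
    is_max = 0    # ... of which X_1 is itself the maximum
    for _ in range(trials):
        xs = [rng.uniform(0.0, theta) for _ in range(n)]
        m = max(xs)
        if b <= m < b + eps:
            hits += 1
            if xs[0] <= a:
                below += 1
            if xs[0] == m:
                is_max += 1
    return below / hits, is_max / hits, hits


if __name__ == "__main__":
    p_below, p_max, hits = simulate()
    print("samples in window:", hits)
    print("estimated P(X_1 <= a | M ~ b):", p_below)
    print("formula (a/b)(1 - 1/n):       ", (0.3 / 0.8) * (1 - 1 / 3))
    print("estimated P(X_1 = M | M ~ b): ", p_max)
```

With a small window the first estimate should land near $\frac{a}{b}\left(1-\frac{1}{n}\right)$ and the second near $\frac{1}{n}$, up to sampling noise and a small bias from the finite window width.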

Ian

We want the CDF of $X_1\mid M=m$. One knows that:

\begin{align} F_{X_1|M}(x|m)=\int^x_{-\infty} \frac{f_{X_1,M}(u,m)}{f_M(m)}\,du \end{align}

It is easy to find

\begin{align} F_{X_1,M}(x,m)=P(X_1<x, M<m)= \begin{cases} \left(\frac{m}{\theta}\right)^n & \text{if } 0\leq m \leq x \leq \theta\\ \frac{x}{\theta}\left(\frac{m}{\theta}\right)^{n-1} & \text{if } 0\leq x < m \leq \theta \end{cases} \end{align}

Note that $F_{X_1,M}$ is not differentiable everywhere. But differentiating where it is differentiable yields:

\begin{align} f_{X_1,M}(x,m)=\frac{(n-1)m^{n-2}}{\theta^n}\mathbf{1}_{0\leq x<m\leq\theta} \end{align}

We also know that $f_M(m)=\frac{nm^{n-1}}{\theta^n}$; just differentiate the CDF you already have. So we get:

\begin{align} F_{X_1|M}(x|m) = \begin{cases} 0 & \text{if } x<0\\ \frac{x(n-1)}{mn} & \text{if } 0\leq x<m\\ 1 & \text{if } m\leq x \end{cases} \end{align}
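
As a sanity check on the final formula (a sketch added for illustration, assuming $n \geq 2$ and $0 < m \leq \theta$), compare the left limit at $x = m$ with the value at $x = m$:

$$ \lim_{x \uparrow m} F_{X_1|M}(x|m) = \frac{n-1}{n}, \qquad F_{X_1|M}(m|m) = 1, $$

so the CDF jumps by $1/n$ at $x=m$. By symmetry each $X_i$ is the maximum with probability $1/n$, so this jump corresponds to the atom $P(X_1 = m \mid M = m) = 1/n$. The conditional distribution is therefore a mixture: a point mass at $m$ with weight $1/n$ and a $U(0,m)$ component with weight $(n-1)/n$.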

Shashi
  • Hi Shashi, thank you for your response. This method makes sense. I should have thought to use the definition earlier! (facepalm)

    One question. For $0 \leq m \leq x \leq \theta$, did you compute $P(X_1 < x , M < m)$ as $P(X_1 < x \mid M < m) P(M < m)$ where $P(M < m) = (m/\theta)^n$ and $P(X_1 < x \mid M < m) = 1$ since $m \leq x$? Or is there a more intuitive method to compute this using the joint?

    – misogrumpy Jan 06 '18 at 16:54
  • @JohnPortin there is no need for all that, \begin{align} P(X_1<x,M<m)&=P(X_1<x, X_1<m,...,X_n<m)\\&=P(X_1<\min\{x,m\},X_2<m,...,X_n<m)\end{align} Using independence you can factor them all. The only interesting factor is $P(X_1<\min\{x,m\})$, and that makes the difference between the cases $m\leq x$ and $x<m$. You got it now? – Shashi Jan 06 '18 at 17:06
  • Indeed. Thank you for spelling that out for me! – misogrumpy Jan 06 '18 at 20:40
  • Your joint pdf for $(X_1, M)$ should be 0 if $x < 0$. As it is written, you have $0 < x$. – misogrumpy Jan 07 '18 at 18:51
  • @misogrumpy now good? – Shashi Jan 07 '18 at 18:54
  • Joint pdf of $(X_1,M)$ does not exist because $P(X_1=M)>0$. So the starting definition of $F_{X_1\mid M}$ is not valid. – StubbornAtom Sep 17 '19 at 20:23
  • @StubbornAtom wow, never thought about this! Thanks for pointing that out. At this moment I don't have time to correct it and I don't even know how so I'll delete it. If it turns out to need a small fix you can edit it! – Shashi Sep 17 '19 at 22:06