
My textbook, Introduction to Probability, by Blitzstein and Hwang, gives the following example:

Example 3

For $X, Y \stackrel{i.i.d.}{\sim}\text{Expo}(\lambda)$, find $\mathbb{E}[\max(X, Y) | \min(X, Y)]$.

Let $M = \max(X, Y)$ and $L = \min(X, Y)$.

By the memoryless property, $M - L$ is independent of $L$, and $M - L \sim \text{Expo}(\lambda)$.

$\dots$

The full example isn't important, because my question pertains only to the two facts above:

  1. How does the memoryless property imply that $M - L$ is independent of $L$?

  2. How is it that $M - L \sim \text{Expo}(\lambda)$? In other words, how is it that the difference of two $\text{Expo}(\lambda)$ random variables has the same parameter?
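For what it's worth, a quick simulation seems consistent with both facts (a rough sketch, assuming numpy; $\lambda = 1$, the seed, and the sample size are arbitrary choices), but I'd like to understand why they hold:

```python
import numpy as np

# Rough check of the two facts, with lambda = 1 (arbitrary).
rng = np.random.default_rng(0)
lam = 1.0
n = 1_000_000
x = rng.exponential(1 / lam, size=n)   # X ~ Expo(lam); numpy's parameter is the scale 1/lam
y = rng.exponential(1 / lam, size=n)   # Y ~ Expo(lam)
L = np.minimum(x, y)                   # min(X, Y)
M = np.maximum(x, y)                   # max(X, Y)
Z = M - L

# Fact 2: if Z ~ Expo(lam), then E[Z] should be about 1/lam = 1.
print("E[M - L] ~", Z.mean())

# Fact 1: if Z is independent of L, conditioning on L shouldn't change E[Z].
print("E[Z | L < 0.2] ~", Z[L < 0.2].mean())
print("E[Z | L > 1.0] ~", Z[L > 1.0].mean())
```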

I would greatly appreciate it if people could please take the time to clarify this.

The Pointer
  • 4,182

1 Answer


How does the memoryless property imply that $M−L$ is independent of $L$?

How is it that $M−L\sim Expo(\lambda)$?

For convenience I will write $Z = M-L$.

I think the authors are appealing to a "proof by obviousness". If you think of $X, Y$ as the typical exponential waiting times for two different buses, then once the first bus has arrived, $Z = M - L$ is the additional time until the second bus arrives; but since the second bus's waiting time is memoryless, it "doesn't care" that time $L$ has already elapsed. Hence $Z \sim Expo(\lambda)$.
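This intuition is easy to check numerically (a simulation only checks the claim, it doesn't explain it). Here is a minimal Monte Carlo sketch, assuming numpy is available; the rate $\lambda = 0.5$, the seed, and the sample size are arbitrary choices:

```python
import numpy as np

# Monte Carlo sanity check: does Z = max(X, Y) - min(X, Y) look Expo(lambda)?
rng = np.random.default_rng(1)
lam = 0.5                                # arbitrary rate, just for illustration
n = 500_000
x = rng.exponential(1 / lam, size=n)     # X ~ Expo(lam)  (numpy takes the scale 1/lam)
y = rng.exponential(1 / lam, size=n)     # Y ~ Expo(lam)
z = np.abs(x - y)                        # max(X, Y) - min(X, Y)

for t in [0.5, 1.0, 2.0, 4.0]:
    print(f"P(Z > {t}): empirical {np.mean(z > t):.4f} vs exp(-lam*t) {np.exp(-lam * t):.4f}")
```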

Now the obvious problem with a "proof by obviousness" is that some people find it non-obvious. :) So if you don't buy the hand-wavy argument above, we can try to prove it. For any $z, l > 0$:

$$
\begin{aligned}
P(Z > z \mid L = l,\ X > Y) &= P(X - Y > z \mid X > Y = l) \\
&= P(X > z + l \mid X > l,\ Y = l) \\
&= P(X > z + l \mid X > l) && \text{...because $X, Y$ independent} \\
&= e^{-\lambda z} && \text{...because $X$ is memoryless} \\
P(Z > z \mid L = l,\ Y > X) &= e^{-\lambda z} && \text{...similarly} \\
P(Z > z \mid L = l) &= P(X > Y)\, P(Z > z \mid L = l,\ X > Y) \\
&\quad + P(Y > X)\, P(Z > z \mid L = l,\ Y > X) \\
&= e^{-\lambda z} \\
P(Z > z) &= \int_0^\infty P(Z > z \mid L = l)\, f_L(l)\, dl = e^{-\lambda z}
\end{aligned}
$$

I think the above is watertight, but even if not, you get the idea.

Combining the last two equations, we have:

$$\forall l>0: P(Z > z \mid L = l) = e^{-\lambda z} = P(Z>z)$$

which directly shows $Z,L$ independent, and $Z \sim Expo(\lambda)$.
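If you want to see this conclusion in numbers, here is a rough simulation sketch (assuming numpy; it conditions on narrow bins of $L$ as a stand-in for the event $L = l$, and uses $\lambda = 1$ arbitrarily):

```python
import numpy as np

# Check that P(Z > z | L near l) is about exp(-lam*z), regardless of l.
rng = np.random.default_rng(2)
lam = 1.0
n = 2_000_000
x = rng.exponential(1 / lam, size=n)
y = rng.exponential(1 / lam, size=n)
L = np.minimum(x, y)
Z = np.abs(x - y)

z = 1.0
print("exp(-lam*z) =", np.exp(-lam * z))
for lo, hi in [(0.0, 0.1), (0.5, 0.6), (1.5, 1.6)]:
    in_bin = (L > lo) & (L < hi)         # proxy for conditioning on L = l
    print(f"P(Z > {z} | {lo} < L < {hi}) ~ {np.mean(Z[in_bin] > z):.4f}")
```

All three conditional estimates should come out near $e^{-\lambda z} \approx 0.368$, which is just the displayed conclusion in numerical form.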

how is it that the difference of two $Expo(\lambda)$ random variables has the same parameter?

This question has two different wrong ideas hidden behind it. $Z$ is indeed the difference of two random variables, i.e. $Z=M-L$, but neither $M$ nor $L$ is $\sim Expo(\lambda)$. Instead we have:

  • $X, Y \sim Expo(\lambda)$, given

  • $Z=M-L=\max(X,Y) - \min(X,Y) \sim Expo(\lambda)$, shown above

  • $L = \min(X,Y) \sim Expo(2\lambda)$, e.g. see here

  • $M = \max(X,Y) $ is not exponential at all, because its CDF is not in the required form:

$$P(M < a) = P(X<a, Y<a) = P(X<a)P(Y<a) = (1-e^{-\lambda a})^2$$
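For a numerical illustration of the last two bullets, here is another small sketch (again assuming numpy, with $\lambda = 1$ chosen arbitrarily):

```python
import numpy as np

# L = min(X, Y) should behave like Expo(2*lam); M = max(X, Y) should have
# CDF (1 - exp(-lam*a))**2, which is not of exponential form.
rng = np.random.default_rng(3)
lam = 1.0
n = 1_000_000
x = rng.exponential(1 / lam, size=n)
y = rng.exponential(1 / lam, size=n)
L = np.minimum(x, y)
M = np.maximum(x, y)

print("E[L] ~", L.mean(), " vs 1/(2*lam) =", 1 / (2 * lam))
for a in [0.5, 1.0, 2.0]:
    print(f"P(M < {a}): empirical {np.mean(M < a):.4f}"
          f" vs (1 - exp(-lam*a))^2 = {(1 - np.exp(-lam * a)) ** 2:.4f}")
```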

antkam
  • 15,363
  • Thanks for the answer. With your proof, it's worth mentioning what the definition of continuous memorylessness is: Suppose $X$ is a continuous random variable whose values lie in the non-negative real numbers $[0, \infty)$. The probability distribution of $X$ is memoryless precisely if for any non-negative real numbers $t$ and $s$, we have ${\displaystyle \Pr(X>t+s\mid X>t)=\Pr(X>s)}$ (from Wikipedia: https://en.wikipedia.org/wiki/Memorylessness#Continuous_memorylessness). – The Pointer Oct 22 '19 at 03:18
  • Although, I'm struggling to understand what $P(Z > z \mid L = l, X > Y)$ is, since it looks very dissimilar to the continuous memorylessness equation I posted above? What is the hypothesis here that the proof is starting from? – The Pointer Oct 22 '19 at 03:23
  • Also, how did you get that $P(X > z + l | X > l) = e^{-\lambda z}$? – The Pointer Oct 22 '19 at 03:24
  • $P(X > z + l \mid X > l) = P(X > z)$ because $X$ is memoryless. and $P(X > z) = e^{-\lambda z}$ because $X \sim Expo(\lambda)$. – antkam Oct 22 '19 at 05:03
  • Re: $P(Z > z \mid L = l, X > Y)$... My GOAL is to prove the conclusion that happened later, i.e. $\forall l > 0: P(Z > z \mid L = l) = P(Z > z) = e^{-\lambda z}$, where the first $=$ shows independence of $Z, L$ and the second $=$ shows $Z \sim Expo(\lambda)$. However, in order to reach that conclusion, I started with the component terms by further conditioning on $X>Y$ (and also $Y > X$). Conditioning on $X>Y$ allows me to identify $L=Y, Z=X-Y=X-l$ etc. and ultimately allows me to show $P(Z > z \mid L = l, X > Y)= e^{-\lambda z}$. Then I just needed to combine some terms to prove my GOAL. – antkam Oct 22 '19 at 05:09
  • This answer should be accepted. It is air-tight and clear. – Jake Mirra Oct 23 '19 at 18:49
  • @antkam Thanks for the clarification. How did you get $P(Z > z \mid L = l) = P(X>Y)\,P(Z > z \mid L = l, X > Y) + P(Y>X)\, P(Z > z \mid L = l, Y > X)$? It looks, perhaps, like some application of the law of total probability? – The Pointer Oct 27 '19 at 04:08
  • Yes. The first term should technically be $P(X>Y\color{red}{\mid L = l}) P(Z > z \mid L = l, X > Y)$, and similarly for the second term, but since $X>Y$ is independent of the value of $L$, I skipped a step and simplified it right away via $P(X>Y\color{red}{\mid L = l}) = P(X>Y)$. – antkam Oct 27 '19 at 14:19
  • Can you please clarify what $$P(Z > z) = \int_0^\infty P(Z > z \mid L = l) f_L(l) \,dl$$ is? It resembles a CDF, but my understanding is that $\int_0^\infty P(Z > z \mid L = l) f_L(l) \,dl$ is not how you write the CDF $P(Z > z)$? – The Pointer Nov 08 '19 at 15:47
  • That equation is just the law of total probability, continuous version. See e.g. here. I was using it to formally prove an obvious fact: if $P(Z> z \mid L=l) = e^{-\lambda z}$ for any $l$, then the unconditioned $P(Z>z)$ is the same value $e^{-\lambda z}$. – antkam Nov 08 '19 at 15:55
  • @antkam Hmm, my textbook has the continuous version of the LOTP as $$f_X (x) = \int_{-\infty}^\infty f_{X | Y} (x | y) f_Y (y) \ dy,$$ which, it seems, disagrees with the formula in your link? And the other thing is that this is the PDF -- not the CDF (as I understand it, $P(Z > z)$ is a CDF). What do you think? – The Pointer Nov 08 '19 at 16:16
  • Your textbook version is for two densities. I'm using a version for an event $A=(Z>z)$ and a density. See the link in my previous comment. – antkam Nov 08 '19 at 16:27
  • @antkam Ahh, you're right; I found the following problem in my textbook, so the author must have left it as an exercise: Show that the following version of LOTP follows from Adam’s law: for any event $A$ and continuous r.v. $X$ with PDF $f_X$, $$P(A) = \int_{-\infty}^\infty P(A | X = x) f_X (x) \ dx.$$ – The Pointer Nov 08 '19 at 16:37