
My problem involves four random variables $X_1, Y_1, X_2, Y_2 \sim U(0,1)$ in the expression $Z = X_1 + X_2 + \sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}$. From what I understand so far, I need to find the PDF of the overall distribution to be able to find $P(Z>z)$.

Attempts

I first thought that computing the expected value of $\sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}$ by integrating over all four variables from 0 to 1 would greatly simplify the problem, which led me to try geometric probability. Then I realized that the $X_1$ and $X_2$ terms outside the square root complicate things.

I have come across a lot of posts addressing the simple combination of uniformly distributed variables, such as $X + Y$ and $X - Y$. Again, the expression this problem involves is much more complicated, which is why I am asking this question.

I have also used simulation to plot what the distribution looks like, which assures me there is an answer (whether it is closed-form, I do not know) but gives me no clue how to get there.

Photo of Simulated Distribution
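For readers who want to reproduce the simulation, here is a minimal sketch in Python. It assumes the four variables are independent (not stated explicitly above), and the function names, seed, and sample size are illustrative choices rather than the code used for the plot.

```python
import math
import random

def sample_z(rng):
    """Draw one value of Z = X1 + X2 + sqrt((X2 - X1)^2 + (Y2 - Y1)^2).

    Assumes X1, Y1, X2, Y2 are independent U(0,1) draws.
    """
    x1, y1, x2, y2 = (rng.random() for _ in range(4))
    return x1 + x2 + math.hypot(x2 - x1, y2 - y1)

def empirical_tail(samples, z):
    """Fraction of samples strictly greater than z, an estimate of P(Z > z)."""
    return sum(s > z for s in samples) / len(samples)

rng = random.Random(0)  # fixed, arbitrary seed for reproducibility
samples = [sample_z(rng) for _ in range(200_000)]
print(empirical_tail(samples, 2.0))
```

A histogram of `samples` gives the shape of the distribution, but an estimate like this only approximates $P(Z>2)$; it cannot supply the closed form.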

So - how do I go about finding the PDF of $Z = X_1 + X_2 + \sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}$?

I need the exact distribution because I am looking for the closed-form probability that the expression is greater than 2.

Thanks in advance!

Amir
Luke
  • Why are you looking for the exact distribution, which can be complex? You may use the empirical cdf, obtained from a large-size sample. – Amir Feb 03 '24 at 22:57
  • I have to find the closed form probability that the expression is greater than 2. Therefore any estimation techniques would be sub-optimal. – Luke Feb 04 '24 at 00:54
  • Please don't use "Edit:". Instead, revise the question so it reads well for someone who encounters it for the first time. See https://cs.meta.stackexchange.com/q/657/755 – D.W. Feb 04 '24 at 04:10
  • Considering your comment, check the answer please. – Amir Feb 04 '24 at 08:40
  • In view of the similarity with this question, please let us know the context that led you to consider this problem. – joriki Feb 08 '24 at 11:17

2 Answers


The exact probability that the expression is greater than 2 is

$$\color{blue}{\dfrac{12\ln\left(2\right)+13}{144}}\approx 0.1480400428244399,$$

obtained from this integral:

$$2\int_0^1 \frac{x^2}{4} \cdot\left(1-\ln\left(\frac{x^2}{4}\right)\right)\left(1-x\right) \text{d}x.$$

Let $X_1, Y_1, X_2, Y_2 \sim U(0,1)$ be independent. The expression

$$Z = X_1+X_2 +\sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2},$$

is given and we want to determine the probability $\mathbb P (Z>2)$.

By defining $W_1=1-X_1, W_2=1-X_2 \sim U(0,1)$, we have the following key equivalence:

$$\color{blue}{\mathbb P (Z>2)=\mathbb P \left ( W_1W_2\le \frac{1}{4}(Y_1-Y_2)^2 \right)}=2\int_{0}^1 F_{W_1W_2}\left(\frac{x^2}{4}\right)\left(1-x\right) \text{d}x=2\int_0^1 \frac{x^2}{4} \left(1-\ln\left(\frac{x^2}{4}\right)\right)\left(1-x\right) \text{d}x.$$
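The key equivalence can be checked in a few lines; this is a sketch of the omitted step, using only the definitions above. Since $2-X_1-X_2=W_1+W_2\ge 0$ and $X_2-X_1=W_1-W_2$, both sides of the first inequality below are nonnegative and can be squared:

$$\begin{aligned} Z>2 &\iff \sqrt{(W_1-W_2)^2+(Y_1-Y_2)^2} > W_1+W_2 \\ &\iff (Y_1-Y_2)^2 > (W_1+W_2)^2-(W_1-W_2)^2 = 4W_1W_2 \\ &\iff W_1W_2 < \tfrac{1}{4}(Y_1-Y_2)^2. \end{aligned}$$

Strict and non-strict inequalities give the same probability because the variables are continuous. Conditioning on $|Y_1-Y_2|=x$, which is independent of $W_1W_2$ (assuming the four variables are independent), then turns the probability into an integral of $F_{W_1W_2}(x^2/4)$ against the density of $|Y_1-Y_2|$.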

Above, we used the following two results:

$$F_{W_1W_2}(t)=t(1-\ln t), \, t \in [0,1]$$

and

$$f_{|Y_1-Y_2|}(t)=2(1-t), \, t \in [0,1];$$

see here and here for more details.
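As a sanity check, the integral can be evaluated numerically and compared with the closed form. The sketch below uses a plain composite midpoint rule from the standard library; the helper names and the number of subintervals are illustrative choices.

```python
import math

def integrand(x):
    """2 * F_{W1 W2}(x^2 / 4) * (1 - x), with F(t) = t * (1 - ln t)."""
    t = x * x / 4.0
    return 2.0 * t * (1.0 - math.log(t)) * (1.0 - x)

def midpoint_rule(f, a, b, n):
    """Composite midpoint rule; it never evaluates f at the endpoints,
    so the logarithm is never taken at x = 0."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

numeric = midpoint_rule(integrand, 0.0, 1.0, 100_000)
closed_form = (12.0 * math.log(2.0) + 13.0) / 144.0
print(numeric, closed_form)
```

The two printed values should agree to many decimal places, consistent with the value $\approx 0.14804$ quoted above.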


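For the general case $a \ge 0$, we have

$$\mathbb P (Z>2+a)=\mathbb P \left ( (2W_1+a)(2W_2+a)\le (Y_1-Y_2)^2 \right).$$

So you only need to find the cdf of $(2W_1+a)(2W_2+a)$ and follow the same method used above for $a=0$. For $-2 \le a < 0$, note that $Z>2+a$ is equivalent to $\sqrt{(W_1-W_2)^2+(Y_1-Y_2)^2} > W_1+W_2+a$, which holds automatically whenever $W_1+W_2+a<0$, so that event must be included separately:

$$\mathbb P (Z>2+a)=\mathbb P \left ( W_1+W_2+a<0 \ \text{ or } \ (2W_1+a)(2W_2+a)\le (Y_1-Y_2)^2 \right).$$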
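For a nonnegative shift, the identity can be checked by simulation before investing in the cdf computation. The sketch below assumes independent uniforms; the value $a = 0.25$, the seed, and the function name are arbitrary illustrative choices.

```python
import math
import random

def tail_both_ways(a, n, seed=0):
    """Estimate P(Z > 2 + a) directly and via the product inequality
    (2*W1 + a) * (2*W2 + a) <= (Y1 - Y2)^2, using the same samples.

    Intended for a >= 0, where the two events coincide.
    """
    rng = random.Random(seed)
    direct = 0
    transformed = 0
    for _ in range(n):
        x1, y1, x2, y2 = (rng.random() for _ in range(4))
        z = x1 + x2 + math.hypot(x2 - x1, y2 - y1)
        w1, w2 = 1.0 - x1, 1.0 - x2
        direct += int(z > 2.0 + a)
        transformed += int((2.0 * w1 + a) * (2.0 * w2 + a) <= (y1 - y2) ** 2)
    return direct / n, transformed / n

p_direct, p_transformed = tail_both_ways(a=0.25, n=200_000)
print(p_direct, p_transformed)
```

Because both estimates use the same draws, they should coincide up to floating-point ties, which makes a disagreement easy to spot.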

Amir

I'm afraid this answer is not going to be very satisfactory, but with Mathematica, the command

1 - CDF[TransformedDistribution[x1 + x2 + Sqrt[(x1 - x2)^2 + (y1 - y2)^2],
  {Distributed[x1, UniformDistribution[{0, 1}]], 
   Distributed[x2, UniformDistribution[{0, 1}]], 
   Distributed[y1, UniformDistribution[{0, 1}]], 
   Distributed[y2, UniformDistribution[{0, 1}]]}], 2]

yields the result $$\Pr[Z > 2] = \frac{13 + 12 \log 2}{144} \approx 0.14804.$$ This is corroborated via simulation, again in Mathematica for $n = 10^7$ simulations:

Length[Select[ParallelTable[
   #[[1]] + #[[2]] + Sqrt[(#[[1]] - #[[2]])^2 + (#[[3]] - #[[4]])^2] &
   [RandomReal[{0, 1}, 4]], 10^7], # > 2 &]] / 10^7 // N

which yielded $0.1480701$ for instance. The general PDF seems to be quite complicated, as it has four cases:

$$f_Z(z) = \begin{cases} \frac{(8-3z)z^2}{12}, & 0 \le z \le 1 \\ \frac{z}{2} - \frac{1}{12z}, & 1 < z \le 2 \\ \frac{z^3}{4} + \frac{2z^2}{3} - \frac{25z}{6} + \frac{14}{3} - \frac{1}{12z} - \frac{4(z-1)\sqrt{z(z-2)}}{3}, & 2 < z \le 1 + \sqrt{2} \\ \frac{(3-z)^3(3z-5)}{12(z-2)}, & 1 + \sqrt{2} < z \le 3. \end{cases}$$

This suggests that the PDF can be calculated via elementary methods, but the full computation would be very tedious. A plot of $f$ is shown below:

[Plot of $f_Z(z)$]
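The piecewise formula can be sanity-checked numerically: it should integrate to 1 over $[0,3]$, and its mass above $z=2$ should match $\Pr[Z>2]$. The sketch below is a Python check, not part of the Mathematica computation; it integrates each piece separately with a midpoint rule, and the function names and step counts are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)
BREAKS = [0.0, 1.0, 2.0, 1.0 + SQRT2, 3.0]

def pdf_z(z):
    """Piecewise density of Z as given in the four-case formula above."""
    if z < 0.0 or z > 3.0:
        return 0.0
    if z <= 1.0:
        return (8.0 - 3.0 * z) * z * z / 12.0
    if z <= 2.0:
        return z / 2.0 - 1.0 / (12.0 * z)
    if z <= 1.0 + SQRT2:
        return (z ** 3 / 4.0 + 2.0 * z * z / 3.0 - 25.0 * z / 6.0 + 14.0 / 3.0
                - 1.0 / (12.0 * z)
                - 4.0 * (z - 1.0) * math.sqrt(z * (z - 2.0)) / 3.0)
    return (3.0 - z) ** 3 * (3.0 * z - 5.0) / (12.0 * (z - 2.0))

def midpoint_rule(f, a, b, n=20_000):
    """Composite midpoint rule on [a, b]; endpoints are never evaluated."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Integrate piece by piece so no subinterval straddles a breakpoint.
pieces = [midpoint_rule(pdf_z, lo, hi) for lo, hi in zip(BREAKS, BREAKS[1:])]
total_mass = sum(pieces)
tail_mass = pieces[2] + pieces[3]  # mass on (2, 3]
print(total_mass, tail_mass)
```

The total should be close to 1 and the tail close to $(13 + 12\log 2)/144 \approx 0.14804$.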

heropup
  • Wow! What an organized, concise answer. The first part helped me realize that Mathematica can actually evaluate a closed form probability (which for some reason was not working for me before). Thank you so much! – Luke Feb 04 '24 at 03:09