Distribution of a combination of four uniformly distributed variables: $ X_1+X_2 +\sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}$

Question

My problem involves four random variables $X_1, Y_1, X_2, Y_2 \sim U(0,1)$ in the expression $Z = X_1 + X_2 + \sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}$. From what I understand so far, I need to find the PDF of the overall distribution to be able to find $P(Z>z)$.

Attempts

I first thought that computing the expected value of $\sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}$ with a quadruple integral over $[0,1]^4$ would greatly simplify the problem. This led to attempts using geometric probability. Then I realized that the presence of $X_1$ and $X_2$ outside the square root complicates things.

I have come across a lot of posts addressing the simple combination of uniformly distributed variables, such as $X + Y$ and $X - Y$. Again, the expression this problem involves is much more complicated, which is why I am asking this question.

I have also used simulation to plot what the distribution looks like, which assures me there is an answer (whether it is closed-form, I do not know) but gives me no clue how to get there.

Photo of Simulated Distribution

So - how do I go about finding the PDF of $Z = X_1 + X_2 + \sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}$?

I need the exact distribution because I am looking for the closed-form probability that the expression is greater than 2.

Thanks in advance!

Why are you looking for the exact distribution, which can be complex? You may use the empirical cdf, obtained from a large-size sample. — Amir, Feb 03 '24 at 22:57
I have to find the closed form probability that the expression is greater than 2. Therefore any estimation techniques would be sub-optimal. — Luke, Feb 04 '24 at 00:54
Please don't use "Edit:". Instead, revise the question so it reads well for someone who encounters it for the first time. See https://cs.meta.stackexchange.com/q/657/755 — D.W., Feb 04 '24 at 04:10
In view of the similarity with this question, please let us know the context that led you to consider this problem. — joriki, Feb 08 '24 at 11:17

Amir · Accepted Answer · 2024-02-04T11:42:12.960

The exact probability that the expression is greater than 2 is

$$\color{blue}{\dfrac{12\ln\left(2\right)+13}{144}}\approx 0.1480400428244399,$$

obtained from this integral:

$$2\int_0^1 \frac{x^2}{4} \cdot\left(1-\ln\left(\frac{x^2}{4}\right)\right)\left(1-x\right) \text{d}x.$$
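The closed form can be cross-checked numerically. The following is a minimal Python sketch, not part of the original answer: the helper names `integrand` and `simpson` and the number of subintervals are illustrative choices. It evaluates the integral above with composite Simpson's rule and compares it with $(12\ln 2+13)/144$.

```python
import math

def integrand(x: float) -> float:
    """x^2/4 * (1 - ln(x^2/4)) * (1 - x); the limit as x -> 0 is 0."""
    if x == 0.0:
        return 0.0
    t = x * x / 4.0
    return t * (1.0 - math.log(t)) * (1.0 - x)

def simpson(f, a: float, b: float, n: int = 20_000) -> float:
    """Composite Simpson's rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

numeric = 2.0 * simpson(integrand, 0.0, 1.0)
closed_form = (12.0 * math.log(2.0) + 13.0) / 144.0
print(numeric, closed_form)
```

The two printed numbers should agree to many decimal places; the logarithm makes the integrand non-smooth at $x=0$, which is why a fairly fine grid is used.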

Let $X_1, Y_1, X_2, Y_2 \sim U(0,1)$ be independent, and consider

$$Z = X_1+X_2 +\sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}.$$

We want to determine the probability $\mathbb P (Z>2)$.

By defining $W_1=1-X_1, W_2=1-X_2 \sim U(0,1)$, we have the following key equivalence:

$$\color{blue}{\mathbb P (Z>2)=\mathbb P \left ( W_1W_2\le \frac{1}{4}(Y_1-Y_2)^2 \right)}=2\int_{0}^1 F_{W_1W_2}\left(\frac{x^2}{4}\right)\left(1-x\right) \text{d}x=2\int_0^1 \frac{x^2}{4} \left(1-\ln\left(\frac{x^2}{4}\right)\right)\left(1-x\right) \text{d}x.$$
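The first equality comes from moving $X_1+X_2$ to the right-hand side and squaring. A sketch of the algebra, using only the definitions above (strict versus non-strict inequalities do not matter, since ties have probability zero):

$$
\begin{aligned}
Z>2 &\iff \sqrt{(X_2-X_1)^2+(Y_2-Y_1)^2} > (1-X_1)+(1-X_2) = W_1+W_2 \\
&\iff (W_1-W_2)^2+(Y_1-Y_2)^2 > (W_1+W_2)^2 \qquad \text{(both sides are nonnegative and } X_2-X_1=W_1-W_2\text{)} \\
&\iff (Y_1-Y_2)^2 > (W_1+W_2)^2-(W_1-W_2)^2 = 4W_1W_2 .
\end{aligned}
$$

The second equality follows by conditioning on $|Y_1-Y_2|=x$, assuming the four variables are independent, as the computation implicitly does.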

Above, we used the following two results:

$$F_{W_1W_2}(t)=t(1-\ln t), \, t \in [0,1]$$

and

$$f_{|Y_1-Y_2|}(t)=2(1-t), \, t \in [0,1];$$

see here and here for more details.
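Both ingredient results are easy to spot-check by simulation. The sketch below is illustrative only: the function name `empirical_checks`, the test point $t=0.3$, the sample size and the seed are assumptions, not part of the answer. It compares empirical frequencies with $F_{W_1W_2}(t)=t(1-\ln t)$ and with the cdf $2t-t^2$ obtained by integrating the density $2(1-s)$ over $[0,t]$.

```python
import math
import random

def empirical_checks(n: int = 200_000, t: float = 0.3, seed: int = 0):
    """Return empirical P(W1*W2 <= t) and P(|Y1-Y2| <= t) for independent U(0,1) draws."""
    rng = random.Random(seed)
    prod_hits = 0
    diff_hits = 0
    for _ in range(n):
        w1, w2 = rng.random(), rng.random()
        y1, y2 = rng.random(), rng.random()
        prod_hits += (w1 * w2 <= t)
        diff_hits += (abs(y1 - y2) <= t)
    return prod_hits / n, diff_hits / n

t = 0.3
emp_prod, emp_diff = empirical_checks(t=t)
cdf_prod = t * (1.0 - math.log(t))  # claimed cdf of W1*W2 at t
cdf_diff = 2.0 * t - t * t          # integral of 2(1 - s) from 0 to t
print(emp_prod, cdf_prod)
print(emp_diff, cdf_diff)
```

With this sample size the empirical and theoretical values should differ only by Monte Carlo noise, on the order of $10^{-3}$.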

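For a general threshold $2+a$ with $a \ge 0$, the same squaring argument gives

$$\mathbb P (Z>2+a)=\mathbb P \left ( (2W_1+a)(2W_2+a)\le (Y_1-Y_2)^2 \right).$$

So you only need to find the cdf of $(2W_1+a)(2W_2+a)$ and follow the same method used above for $a=0$. For $-2 \le a < 0$ the quantity $W_1+W_2+a$ can be negative, in which case $Z>2+a$ holds automatically; that part of the probability, $\mathbb P(W_1+W_2<-a)$, has to be added separately, and the product condition applies only on the event $W_1+W_2 \ge -a$.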
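The reduction for a general threshold can be spot-checked by simulation. The following Python sketch draws the four uniforms, evaluates both events on the same samples, and reports the two relative frequencies; the function name `compare_events`, the sample size and the seed are illustrative choices, not from the answer. It is restricted to $a \ge 0$, where $W_1+W_2+a$ is nonnegative and squaring both sides is safe.

```python
import math
import random

def compare_events(a: float, n: int = 200_000, seed: int = 1):
    """Relative frequencies of {Z > 2 + a} and {(2W1+a)(2W2+a) <= (Y1-Y2)^2}."""
    rng = random.Random(seed)
    direct = 0
    product_form = 0
    for _ in range(n):
        x1, x2, y1, y2 = (rng.random() for _ in range(4))
        w1, w2 = 1.0 - x1, 1.0 - x2
        z = x1 + x2 + math.hypot(x2 - x1, y2 - y1)
        direct += (z > 2.0 + a)
        product_form += ((2.0 * w1 + a) * (2.0 * w2 + a) <= (y1 - y2) ** 2)
    return direct / n, product_form / n

for a in (0.0, 0.2):
    p_direct, p_product = compare_events(a)
    print(a, p_direct, p_product)
```

Because both events are evaluated on the same draws, the two frequencies should coincide up to floating-point ties, and for $a=0$ both should be close to $0.148$.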

score 0 · Answer 2 · answered Feb 04 '24 at 02:10

I'm afraid this answer is not going to be very satisfactory, but with Mathematica, the command

1 - CDF[TransformedDistribution[x1 + x2 + Sqrt[(x1 - x2)^2 + (y1 - y2)^2],
  {Distributed[x1, UniformDistribution[{0, 1}]], 
   Distributed[x2, UniformDistribution[{0, 1}]], 
   Distributed[y1, UniformDistribution[{0, 1}]], 
   Distributed[y2, UniformDistribution[{0, 1}]]}], 2]

yields the result $$\Pr[Z > 2] = \frac{13 + 12 \log 2}{144} \approx 0.14804.$$ This is corroborated via simulation, again in Mathematica for $n = 10^7$ simulations:

Length[Select[ParallelTable[
   #[[1]] + #[[2]] + Sqrt[(#[[1]] - #[[2]])^2 + (#[[3]] - #[[4]])^2] &
   [RandomReal[{0, 1}, 4]], 10^7], # > 2 &]] / 10^7 // N

which yielded $0.1480701$ for instance. The general PDF seems to be quite complicated, as it has four cases:

$$f_Z(z) = \begin{cases} \frac{(8-3z)z^2}{12}, & 0 \le z \le 1 \\ \frac{z}{2} - \frac{1}{12z}, & 1 < z \le 2 \\ \frac{z^3}{4} + \frac{2z^2}{3} - \frac{25z}{6} + \frac{14}{3} - \frac{1}{12z} - \frac{4(z-1)\sqrt{z(z-2)}}{3}, & 2 < z \le 1 + \sqrt{2} \\ \frac{(3-z)^3(3z-5)}{12(z-2)}, & 1 + \sqrt{2} < z \le 3. \end{cases}$$

This suggests that the PDF can be calculated via elementary methods, but the full computation would be very tedious. A plot of $f$ is shown below:
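The four-piece formula can also be sanity-checked numerically. The sketch below is a hedged illustration: the names `pdf_z` and `midpoint` and the grid size are assumptions, not part of the answer. It transcribes the density as written above, integrates it with a midpoint rule, and checks that the total mass is $1$ and that the mass on $(2,3]$ matches $(13+12\log 2)/144$.

```python
import math

SQRT2 = math.sqrt(2.0)

def pdf_z(z: float) -> float:
    """Piecewise density of Z, transcribed from the four cases in the answer."""
    if 0.0 <= z <= 1.0:
        return (8.0 - 3.0 * z) * z * z / 12.0
    if 1.0 < z <= 2.0:
        return z / 2.0 - 1.0 / (12.0 * z)
    if 2.0 < z <= 1.0 + SQRT2:
        return (z ** 3 / 4.0 + 2.0 * z * z / 3.0 - 25.0 * z / 6.0 + 14.0 / 3.0
                - 1.0 / (12.0 * z)
                - 4.0 * (z - 1.0) * math.sqrt(z * (z - 2.0)) / 3.0)
    if 1.0 + SQRT2 < z <= 3.0:
        return (3.0 - z) ** 3 * (3.0 * z - 5.0) / (12.0 * (z - 2.0))
    return 0.0

def midpoint(f, a: float, b: float, n: int = 100_000) -> float:
    """Midpoint rule on [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

total_mass = midpoint(pdf_z, 0.0, 3.0)
tail_mass = midpoint(pdf_z, 2.0, 3.0)
closed_form = (13.0 + 12.0 * math.log(2.0)) / 144.0
print(total_mass, tail_mass, closed_form)
```

The midpoint rule never evaluates the density at the breakpoints themselves, so the square-root term at $z=2$ causes no trouble.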

Wow! What an organized, concise answer. The first part helped me realize that Mathematica can actually evaluate a closed form probability (which for some reason was not working for me before). Thank you so much! — Luke, Feb 04 '24 at 03:09
