
I currently have the following piece of code in Python

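from random import random, uniform

def f(x, y, tx, ty):
    return something  # placeholder for the function being averaged

def simulate():
    x, y = random(), random()                  # x, y ~ Unif(0, 1)
    tx, ty = uniform(0, 1-x), uniform(0, 1-y)  # tx ~ Unif(0, 1-x), ty ~ Unif(0, 1-y)
    return f(x, y, tx, ty)

numTrials = 10000  # a large number
print(sum(simulate() for i in range(numTrials))/numTrials)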

This picks two numbers $x$ and $y$ uniformly at random $(0 \le x, y < 1)$. It then picks two more numbers $t_x$ and $t_y$ uniformly at random $(0 \le t_x < 1-x,\ 0 \le t_y < 1-y)$ and evaluates $f(x, y, t_x, t_y)$ on them.

Then in the main part of the code, I am trying to find the average value of simulate (not the average value of $f(x, y, t_x, t_y)$). I am trying to convert the problem of finding an average value to something that can be worked on without a simulation. I tried using $$\int_0^1\int_0^1\int_0^{1-x}\int_0^{1-y} f(x, y, t_x, t_y) dt_ydt_xdxdy$$, but this doesn't seem to work.

For example, if I have $f = x+y+t_x+t_y$, my code outputs a value of around $1.5$. The integral on the other hand would be $$\int_0^1\int_0^1\int_0^{1-x}\int_0^{1-y} (x+y+t_x+t_y) dt_ydt_xdxdy = \frac{1}{3}$$

My questions:

  1. What is the reason for this discrepancy?

  2. What would the correct setup of the integral be?

Edit: I just used $f(x, y, t_x, t_y) = x + y + t_x + t_y$ as an example. My true question is for $f$ being any function.

  • The integral is not the average value yet. You have to divide by the "volume" of the region of integration, i.e. $\int_0^1 \int_0^1 \int_0^{1-x} \int_0^{1-y} 1 \, dt_y \, dt_x \, dx \, dy$ – Ninad Munshi Jan 08 '20 at 06:31
  • As a sanity check, this evaluates to $\frac{1}{4}$. Taking your integral and dividing by this value gives $\frac{4}{3}$ which could be close enough to $1.5$ depending on how you are evaluating the code. – Ninad Munshi Jan 08 '20 at 06:38
  • It was very close to $1.5$. I'm pretty sure the actual value was $3/2$, so I don't think the modification is just dividing by $1/4$. – Varun Vejalla Jan 08 '20 at 07:11
  • That doesn't change the fact that mathematically the operation you have is not the average and the suggestion I gave is. What that means is that your code is not really returning the average value of the function. – Ninad Munshi Jan 08 '20 at 07:17
  • I don't think I explained this clearly - I am not trying to find the average value of $f(x, y, t_x, t_y)$. I am trying to find the average value of simulate. – Varun Vejalla Jan 08 '20 at 07:33
  • The only possible problem I could see is if you are using the wrong probability distribution. The formula for an expected value is $E[f(X)] = \int f(x) p_x(x) dx$ where $p_x(x)$ is the probability. You're assuming that the probability is uniform, i.e. all $x$, $y$, $t_x$, and $t_y$ are equally likely to be chosen when that may not be true. You'll have to find this probability distribution first. – Ninad Munshi Jan 08 '20 at 12:10

1 Answer

For your example we'd have by linearity of expectation that

$$\Bbb E(X+Y+T_X+T_Y) = \Bbb E(X) + \Bbb E(Y) + \Bbb E(T_X) + \Bbb E(T_Y).$$

Because $\Bbb E(X) = \Bbb E(Y) = 1/2$ and $\Bbb E(T_X) = \Bbb E(T_Y)$, the result would hence be $1 + 2\,\Bbb E(T_X)$. Let's calculate $\Bbb E(T_X)$. We use the law of total expectation.

\begin{align} \Bbb E(T_X) &= \Bbb E(\Bbb E(T_X|X)) \\&= \int_{0}^1 \Bbb E(T_X|X = t)\, dt \\&= \int_{0}^1 \Bbb E\big(\text{Unif}(0,1-t)\big)\, dt \\&= \int_{0}^1 \frac {1-t}2\, dt = \frac14, \end{align}

so the result is $1 + 2 \cdot \frac14 = 1.5$, which agrees with your simulated result.
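The value $\Bbb E(T_X) = 1/4$ can also be checked numerically. The following is a minimal sketch that mirrors the sampling in `simulate()`; the helper names `sample_tx` and `estimate_mean_tx` and the fixed seed are assumptions introduced here for illustration, not part of the original code.

```python
import random

def sample_tx(rng):
    """Draw X ~ Unif(0, 1), then T_X ~ Unif(0, 1 - X), as simulate() does."""
    x = rng.random()
    return rng.uniform(0, 1 - x)

def estimate_mean_tx(num_trials=200_000, seed=0):
    """Monte Carlo estimate of E(T_X); the seed is fixed only for reproducibility."""
    rng = random.Random(seed)
    return sum(sample_tx(rng) for _ in range(num_trials)) / num_trials

if __name__ == "__main__":
    mean_tx = estimate_mean_tx()
    print(mean_tx)          # should be close to 1/4
    print(1 + 2 * mean_tx)  # should be close to 1.5
```

With this many trials the estimate should typically land within a few thousandths of $1/4$.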


In the general case, we can apply the law of total expectation repeatedly. Let $T_X$ denote the random variable for $t_x$, and let $T_x \sim \text{Unif}(0,1-x)$ denote $T_X$ conditioned on $X=x$; define $T_Y$ and $T_y$ analogously.

Let's see:

\begin{align} \Bbb E\Big(f(X,Y,T_X,T_Y)\Big) &= \Bbb E\Big(\Bbb E\big(f(X,Y,T_X,T_Y)|X\big)\Big) \\&= \int_{0}^1 \Bbb E\big(f(X,Y,T_X,T_Y)|X = x\big)\, dx \\&= \int_{0}^1 \Bbb E\big(f(x,Y,T_x,T_Y)\big)\, dx \\&= \int_{0}^1 \Bbb E\Big( \Bbb E\big(f(x,Y,T_x,T_Y)|Y\big)\Big)\, dx \\&= \int_0^1\int_{0}^1 \Bbb E\big(f(x,Y,T_x,T_Y)|Y = y\big)\, dydx \\&= \int_0^1\int_{0}^1 \Bbb E\big(f(x,y,T_x,T_y)\big)\, dydx \\&= \int_0^1\int_{0}^1 \Bbb E\Big( \Bbb E\big(f(x,y,T_x,T_y)|T_x\big)\Big)\, dydx \end{align}

Now recall that the probability density function of $\text{Unif}(0,1-x)$ is $1/(1-x)$ on $(0, 1-x)$. We continue the derivation.

\begin{align} \Bbb E\Big(f(X,Y,T_X,T_Y)\Big) &= \int_0^1\int_{0}^1\int_0^{1-x} \Bbb E\big(f(x,y,T_x,T_y)|T_x = t_x\big)\frac1{1-x}\, dt_xdydx \\&= \int_0^1\int_{0}^1\int_0^{1-x} \Bbb E\big(f(x,y,t_x,T_y)\big)\frac1{1-x}\, dt_xdydx \\&= \int_0^1\int_{0}^1\int_0^{1-x}\int_0^{1-y} f(x,y,t_x,t_y)\frac1{1-x}\frac1{1-y}\, dt_ydt_xdydx. \end{align}

Hence:

$$ \bbox[5px,border:2px solid red]{\Bbb E\Big(f(X,Y,T_X,T_Y)\Big) = \int_0^1\int_{0}^1\int_0^{1-x}\int_0^{1-y} \frac{f(x,y,t_x,t_y)}{(1-x)(1-y)}\, dt_ydt_xdydx} $$

As a sanity check, we have that

$$\int_0^1\int_{0}^1\int_0^{1-x}\int_0^{1-y} \frac{x+y+t_x+t_y}{(1-x)(1-y)}\, dt_ydt_xdydx = 1.5$$
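This kind of check can be reproduced without a simulation. The sketch below approximates the weighted integral with a midpoint rule. It relies on the substitution $t_x = u(1-x)$, $t_y = v(1-y)$ with $u, v \in (0,1)$, which cancels the factor $\frac{1}{(1-x)(1-y)}$ and turns the integral into a plain average over the unit hypercube. The function name `expected_value` and the grid size `n` are assumptions made for illustration.

```python
from itertools import product

def expected_value(f, n=20):
    """Midpoint-rule estimate of E(f(X, Y, T_X, T_Y)).

    After substituting t_x = u*(1-x) and t_y = v*(1-y), the weighted
    integral becomes the integral of f(x, y, u*(1-x), v*(1-y)) over
    the unit hypercube [0,1]^4, which is averaged here over n**4 midpoints.
    """
    mids = [(i + 0.5) / n for i in range(n)]
    total = 0.0
    for x, y, u, v in product(mids, repeat=4):
        total += f(x, y, u * (1 - x), v * (1 - y))
    return total / n**4

if __name__ == "__main__":
    # The example from the question; the midpoint rule is exact for linear f.
    print(expected_value(lambda x, y, tx, ty: x + y + tx + ty))
    # A product example; the exact value is (1/12)**2 = 1/144.
    print(expected_value(lambda x, y, tx, ty: x * y * tx * ty))
```

For smooth $f$ a modest grid is usually enough; for the product example the exact value follows from $\Bbb E(X\,T_X) = \Bbb E\big(X(1-X)/2\big) = 1/12$ and independence of the $x$ and $y$ parts.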

as you had found.


As a bonus, I calculated the probability density function for $T_X$, but I ended up not using it in the approach above. Here it is; we start with the cumulative distribution function $F_{T_X}(t)$.

\begin{align} \Bbb P(T_X\leqslant t) &= \int_0^1 \Bbb P(T_X\leqslant t|X=x)\, dx\tag{$*$} \\&= \int_0^1 \Bbb P(\text{Unif}(0,1-x)\leqslant t)\, dx \\&= \int_0^{1-t} \Bbb P(\text{Unif}(0,1-x)\leqslant t)\, dx + \int_{1-t}^1 \underbrace{\Bbb P(\text{Unif}(0,1-x)\leqslant t)}_1\, dx \\&= \int_0^{1-t}\frac t{1-x} \,dx + t = t -t\log(t), \end{align}

where in $(*)$ we used the law of total probability for continuous distributions. It follows that the probability density function is $f_{T_X}(t) = \frac d {dt}F_{T_X}(t) = -\log(t)$ for $0 < t < 1$.
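The closed form $F_{T_X}(t) = t - t\log(t)$ can be compared against an empirical estimate. This is a rough sketch; the function names `empirical_cdf_tx` and `cdf_tx` and the fixed seed are assumptions for illustration.

```python
import math
import random

def empirical_cdf_tx(t, num_trials=200_000, seed=0):
    """Fraction of samples with T_X <= t, where X ~ Unif(0,1) and T_X ~ Unif(0, 1-X)."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    hits = 0
    for _ in range(num_trials):
        x = rng.random()
        if rng.uniform(0, 1 - x) <= t:
            hits += 1
    return hits / num_trials

def cdf_tx(t):
    """Closed-form CDF derived above, valid for 0 < t < 1."""
    return t - t * math.log(t)

if __name__ == "__main__":
    for t in (0.1, 0.3, 0.7):
        print(t, empirical_cdf_tx(t), cdf_tx(t))
```

The two columns should agree to within sampling noise for each $t$.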

Fimpellizzeri
  • How can this generalize to other functions? I am mainly interested in the setup of the integral. For example, what if $f(x, y, t_x, t_y) = x \cdot y \cdot t_x \cdot t_y$? – Varun Vejalla Jan 08 '20 at 19:07
  • See my edit. It's just the law of total expectation all the way down. – Fimpellizzeri Jan 08 '20 at 20:23
  • I've checked the formula against your simulation with various $f$s and at this point I'm pretty confident the result is right. – Fimpellizzeri Jan 09 '20 at 12:50