
Let $\mathbf{x}$ be an $N$-dimensional random vector with independent Gaussian entries, i.e., $\mathbf{x} \sim \mathcal{N}(0, \mathbf{I}_{N})$. Furthermore, let $\mathbf{a}_{1} \in \mathbb{R}^{N}$ and $\mathbf{a}_{2} \in \mathbb{R}^{N}$ be two given vectors. I'd like to derive an expression for \begin{align} \mathbb{E}[\mathrm{sgn}(\mathbf{a}_{1}^{T} \mathbf{x}) \mathrm{sgn}(\mathbf{a}_{2}^{T} \mathbf{x})] & = \mathbb{P}[\mathbf{a}_{1}^{T} \mathbf{x} > 0 \land \mathbf{a}_{2}^{T} \mathbf{x} > 0] \\ & \ \ \ \ + \mathbb{P}[\mathbf{a}_{1}^{T} \mathbf{x} < 0 \land \mathbf{a}_{2}^{T} \mathbf{x} < 0] \\ & \ \ \ \ - \mathbb{P}[\mathbf{a}_{1}^{T} \mathbf{x} > 0 \land \mathbf{a}_{2}^{T} \mathbf{x} < 0] \\ & \ \ \ \ - \mathbb{P}[\mathbf{a}_{1}^{T} \mathbf{x} < 0 \land \mathbf{a}_{2}^{T} \mathbf{x} > 0]. \end{align}

Edit: I found the answer to be

$$\mathbb{E}[\mathrm{sgn}(\mathbf{a}_{1}^{T} \mathbf{x}) \mathrm{sgn}(\mathbf{a}_{2}^{T} \mathbf{x})] = \frac{2}{\pi} \arcsin \bigg( \frac{\mathbf{a}_{1}^{T} \mathbf{a}_{2}}{\|\mathbf{a}_{1}\| \, \|\mathbf{a}_{2}\|} \bigg)$$

but I cannot understand the reasoning behind this formula. Furthermore, I'd like to understand how to obtain the individual joint probability terms, e.g., $\mathbb{P}[\mathbf{a}_{1}^{T} \mathbf{x} > 0 \land \mathbf{a}_{2}^{T} \mathbf{x} >0]$. A proof or rigorous explanation will be most welcome.
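
Edit 2: The formula is easy to sanity-check numerically. Below is a minimal Monte Carlo sketch (NumPy; the example vectors and sample size are arbitrary choices, and this is only an illustration, not a derivation):

```python
import numpy as np

rng = np.random.default_rng(0)
a1 = np.array([1.0, 2.0, -0.5])  # arbitrary example vectors (N = 3)
a2 = np.array([0.3, -1.0, 1.5])

# Monte Carlo estimate of E[sgn(a1^T x) sgn(a2^T x)] with x ~ N(0, I_N)
x = rng.standard_normal((1_000_000, 3))
mc = np.mean(np.sign(x @ a1) * np.sign(x @ a2))

# Claimed closed form: (2/pi) * arcsin(a1^T a2 / (||a1|| ||a2||))
rho = a1 @ a2 / (np.linalg.norm(a1) * np.linalg.norm(a2))
closed = (2 / np.pi) * np.arcsin(rho)

print(mc, closed)  # the two agree to about three decimal places
```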

  • Thank you for your reply. Why do you say exactly $\frac{1}{4}$? Btw, I somehow found the answer but I'd like to understand the meaning. I'll post it now. – TheDon May 06 '20 at 16:34
  • I see, but $X$ and $Y$ are not independent, as they are linear transformations of the same random vector $\mathbf{x}$. – TheDon May 06 '20 at 16:54
  • Yes, sorry, I somehow miscalculated their covariance. Since $(a_1^Tx,a_2^Tx)$ is jointly normal with correlation $\rho=\frac{a_1^Ta_2}{\sqrt{a_1^Ta_1}\sqrt{a_2^Ta_2}}$, you have $P(a_1^Tx>0,a_2^Tx>0)=\frac14+\frac1{2\pi}\sin^{-1}\rho$ by this result. – StubbornAtom May 06 '20 at 17:13
  • As you can see, in effect you only need this one probability. Because $(a_1^Tx,a_2^Tx)$ has the same distribution as $(-a_1^Tx,-a_2^Tx)$, and $(a_1^Tx,-a_2^Tx)$ has the same distribution as $(-a_1^Tx,a_2^Tx)$, you have

    $$E[\operatorname{sgn}(a_1^Tx)\operatorname{sgn}(a_2^Tx)]=2P(a_1^Tx>0,a_2^Tx>0)-2P(a_1^Tx>0,-a_2^Tx>0)\,.$$

    Since $(a_1^Tx,-a_2^Tx)$ is jointly normal with correlation $-\rho$, the entire expression simplifies to $\frac2{\pi}\sin^{-1}\rho$. This is also answered at https://math.stackexchange.com/q/3058888/321264. – StubbornAtom May 06 '20 at 17:45
  • These were very useful, thank you. If you care to write an answer, I'll accept it. Also, how would you extend this to more than 2 correlated RVs? For instance, I'd like to derive a similar expression for $\mathbb{P}[\mathbf{a}_{1}^{T} \mathbf{x} > 0 \land \mathbf{a}_{2}^{T} \mathbf{x} >0 \land \mathbf{a}_{3}^{T} \mathbf{x} >0]$. – TheDon May 08 '20 at 13:22

1 Answer


Suppose $X\sim N_n(0,I_n)$ and $U=a_1^TX$, $V=a_2^TX$, $W=a_3^TX$ for vectors $a_1,a_2,a_3\in \mathbb R^n$.

Then $(U,V)$ is bivariate normal with mean vector $0$, $\operatorname{Var}(U)=\lVert a_1 \rVert^2$, $\operatorname{Var}(V)=\lVert a_2 \rVert^2$ and $\operatorname{Cov}(U,V)=a_1^Ta_2$, i.e., with correlation $\rho_{U,V}=\frac{a_1^Ta_2}{\lVert a_1 \rVert \lVert a_2 \rVert}$.
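
(The covariance computation is one line, using $\mathbb E[XX^T]=I_n$ and the zero means:

$$\operatorname{Cov}(U,V)=\mathbb E[UV]=\mathbb E\left[a_1^T X X^T a_2\right]=a_1^T\,\mathbb E[XX^T]\,a_2=a_1^T a_2\,.)$$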

The main result here is the orthant probability of a centered bivariate normal (Sheppard's formula), which says

$$\mathbb P\left[U>0,V>0\right]=\mathbb P\left[\frac{U}{\lVert a_1 \rVert}>0,\frac{V}{\lVert a_2 \rVert}>0\right]=\frac14+\frac1{2\pi}\arcsin(\rho_{U,V})\tag{1}$$
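
For completeness, here is a sketch of why $(1)$ holds (a standard geometric argument, not spelled out at the link). Write $U/\lVert a_1 \rVert = Z_1$ and $V/\lVert a_2 \rVert = \rho_{U,V} Z_1 + \sqrt{1-\rho_{U,V}^2}\,Z_2$ with $Z_1, Z_2$ i.i.d. standard normal, which reproduces the correlation $\rho_{U,V}$. In polar coordinates $(Z_1,Z_2)=R(\cos\Theta,\sin\Theta)$, the angle $\Theta$ is uniform on $(-\pi,\pi]$, and with $\varphi=\arcsin(\rho_{U,V})\in[-\frac{\pi}{2},\frac{\pi}{2}]$ the second coordinate becomes $\sin\varphi\cos\Theta+\cos\varphi\sin\Theta=\sin(\Theta+\varphi)$, so

$$\mathbb P[U>0,V>0]=\mathbb P\left[\cos\Theta>0,\ \sin(\Theta+\varphi)>0\right]=\mathbb P\left[-\varphi<\Theta<\tfrac{\pi}{2}\right]=\frac{\pi/2+\varphi}{2\pi}=\frac14+\frac1{2\pi}\arcsin(\rho_{U,V})\,.$$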

By symmetry, $(U,V)\stackrel{d}{=} (-U,-V)$ and $(-U,V)\stackrel{d}{=}(U,-V)$.

Now $\mathbb E\left[\operatorname{sgn}(U)\operatorname{sgn}(V)\right]$ equals

$$\mathbb P[U>0,V>0]+\mathbb P[-U>0,-V>0]-\mathbb P[-U>0,V>0]-\mathbb P[U>0,-V>0]\,,$$

which reduces to $$\mathbb E\left[\operatorname{sgn}(U)\operatorname{sgn}(V)\right]=2\mathbb P\left[U>0,V>0\right]-2\mathbb P\left[-U>0,V>0\right]\,.$$

As $(-U,V)$ is bivariate normal with correlation $-\rho_{U,V}$, we only need $(1)$ to conclude

$$\mathbb E\left[\operatorname{sgn}(U)\operatorname{sgn}(V)\right]=\frac2{\pi}\arcsin(\rho_{U,V})=\frac2{\pi}\arcsin\left(\frac{a_1^Ta_2}{\lVert a_1 \rVert \lVert a_2 \rVert}\right)\,.$$
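
Explicitly, applying $(1)$ with correlations $\rho_{U,V}$ and $-\rho_{U,V}$ and using that $\arcsin$ is odd,

$$2\left(\frac14+\frac{\arcsin(\rho_{U,V})}{2\pi}\right)-2\left(\frac14+\frac{\arcsin(-\rho_{U,V})}{2\pi}\right)=\frac{2}{\pi}\arcsin(\rho_{U,V})\,.$$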


Similarly, $(U,V,W)$ is trivariate normal, so using the three-dimensional extension of this orthant probability formula we have

\begin{align} \mathbb P\left[U>0,V>0,W>0\right]&=\mathbb P\left[\frac{U}{\lVert a_1 \rVert}>0,\frac{V}{\lVert a_2 \rVert}>0,\frac{W}{\lVert a_3 \rVert}>0\right] \\&=\frac18+\frac1{4\pi}\left(\arcsin(\rho_{U,V})+\arcsin(\rho_{V,W})+\arcsin(\rho_{U,W})\right)\,. \end{align}
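
As a numerical cross-check, here is a minimal Monte Carlo sketch (again with arbitrary example vectors; it only illustrates the formula, it is not part of the derivation):

```python
import numpy as np

rng = np.random.default_rng(1)
# Rows are a_1, a_2, a_3 (arbitrary example vectors in R^3)
A = np.array([[1.0, 2.0, -0.5],
              [0.3, -1.0, 1.5],
              [0.7, 0.2, 0.9]])

# Monte Carlo estimate of P[U > 0, V > 0, W > 0]
X = rng.standard_normal((1_000_000, 3))
mc = np.mean((X @ A.T > 0).all(axis=1))

# Closed form: 1/8 + (arcsin(rho_UV) + arcsin(rho_VW) + arcsin(rho_UW)) / (4 pi)
norms = np.linalg.norm(A, axis=1)
R = (A @ A.T) / np.outer(norms, norms)  # correlation matrix of (U, V, W)
closed = 1 / 8 + (np.arcsin(R[0, 1]) + np.arcsin(R[1, 2]) + np.arcsin(R[0, 2])) / (4 * np.pi)

print(mc, closed)  # the two agree to about three decimal places
```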

StubbornAtom
  • Thank you for your accurate answer! – TheDon May 11 '20 at 14:24
  • I'm now trying to extend this to 4 variables, but the approach in your second link doesn't seem to work. In fact, if we let $p=P[X_{1}>0,X_{2}>0,X_{3}>0,X_{4}>0]=P[X_{1}<0,X_{2}<0,X_{3}<0,X_{4}<0]$, we have $1-p=P[\{X_{1}>0\} \cup \{X_{2}>0\} \cup \{X_{3}>0\} \cup \{X_{4}>0\}]=P[X_{1}>0]+P[X_{2}>0]+P[X_{3}>0]+P[X_{4}>0]-P[X_{1}>0,X_{2}>0]-P[X_{1}>0,X_{3}>0]-P[X_{1}>0,X_{4}>0]-P[X_{2}>0,X_{3}>0]-P[X_{2}>0,X_{4}>0]-P[X_{3}>0,X_{4}>0]+P[X_{1}>0,X_{2}>0,X_{3}>0]+P[X_{1}>0,X_{2}>0,X_{4}>0]+P[X_{1}>0,X_{3}>0,X_{4}>0]+P[X_{2}>0,X_{3}>0,X_{4}>0]-p$ and, clearly, $p$ disappears from both sides... – TheDon May 11 '20 at 14:25
  • There is no general formula in higher dimensions except in special cases, as far as I am aware. These are called orthant probabilities. For details you can refer to these books: https://link.springer.com/book/10.1007/978-1-4613-9655-0, https://onlinelibrary.wiley.com/doi/book/10.1002/0471722065. – StubbornAtom May 11 '20 at 16:04
  • I see, thanks for the details. Do you think there's any hope to obtain $\mathbb{E}[\mathrm{sgn}(X_{1}) \mathrm{sgn}(X_{2}) \mathrm{sgn}(X_{3}) \mathrm{sgn}(X_{4})]$ without the individual orthant probabilities? – TheDon May 12 '20 at 13:48
  • I'm still dealing with the case of 4 variables, i.e., I need to derive a solution for $\mathbb{P}[X_1>0,X_2>0,X_3>0,X_4>0]$. Although there's no general expression, I have a particular case where each variable is correlated with only two other variables, i.e., $\rho_{13}=\rho_{24}=0$. I have the feeling that a solution can be found in this case by exploiting your expression above for 3 variables. I would be very grateful if you could help me. – TheDon Jun 11 '20 at 19:46
  • You should ask this as a separate question. Maybe others could help. By the way, the 3D case was also mentioned here: https://math.stackexchange.com/q/379378/321264. – StubbornAtom Jun 11 '20 at 19:59
  • Thank you for the additional reference. I posted a new question here: https://math.stackexchange.com/q/3716441/53733 – TheDon Jun 12 '20 at 07:20