12

Edit: I shall try to reformulate my question in order to make it (hopefully) clearer.

Let $X$ be a random variable that follows the $n$-dimensional Gaussian distribution. The probability density of $X$ is given by the following function:

$$ f_X(\mathbf{x}; \boldsymbol{\mu}, \Sigma) = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (\mathbf{x}-\boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu}) \right\}, $$

where $\mathbf{x},\boldsymbol{\mu}\in\Re^n$ and $\Sigma \in S_{++}^{n}$, the set of symmetric positive definite $n\times n$ real matrices.

In addition, let $\mathcal{H}: \mathbf{x}^T\mathbf{w}+b=0$ be a hyperplane in the $n$-dimensional Euclidean space $\Re^n$, where $\mathbf{w}\in\Re^n$ and $b\in\Re$. The hyperplane $\mathcal{H}$ defines two half-spaces:

$$ \Omega_{+} = \{\mathbf{x} \in \Re^n \mid \mathbf{x}^T\mathbf{w}+b \geq 0 \}, \qquad \Omega_{-} = \{\mathbf{x} \in \Re^n \mid \mathbf{x}^T\mathbf{w}+b < 0 \}. $$

What I would like to find out is what happens when I integrate the above Gaussian pdf over the region $\Omega_{+}$ (or $\Omega_{-}$). It is well known that, by definition, integrating over the whole of $\Re^n$ gives 1; this is precisely the reason for the normalisation term $\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}}$.

Please correct me if I am wrong about anything above.

Next, we can observe that if $b=0$, i.e., the hyperplane crosses the origin, then, due to the symmetry of the Gaussian pdf (at least when the mean $\boldsymbol{\mu}$ lies on the hyperplane, e.g. $\boldsymbol{\mu}=\mathbf{0}$), the integral should be equal to $1/2$. But how can I compute the value of the integral when $b\neq0$? Is there some clever trick to compute how the Gaussian integral changes with respect to the bias term $b$?

Thanks in advance, and apologies for being a bit of a "newbie" or not rigorous enough! Every correction or helpful comment (either on my question or my notation) would be greatly appreciated!

nullgeppetto
  • 3,006
  • 2
    Rotate it so that you get an integral over a half-space $\{ x_n \geqslant c\}$. Then you have $n-1$ full Gaussian integrals, and one $\int_c^\infty e^{-\xi^2/2}\,d\xi$. – Daniel Fischer Nov 08 '13 at 15:50
  • @DanielFischer Thank you very much for your instant answer. I will try it and I will return if I have other questions. It seems right, but I'm afraid that something is gonna go wrong! Thank you anyway! – nullgeppetto Nov 08 '13 at 15:58
  • @DanielFischer, I didn't find the time to look at your approach yet, but because I tried something similar, I remember that I got stuck at the $n$-dimensional rotation matrix... Could you please clarify what exactly you mean by "rotate it"? Rotate around what? What is the rotation matrix in $n$ dimensions? Apologies for being a newbie... – nullgeppetto Nov 10 '13 at 17:47
  • I made some edits that I think clarify my original question. This question can be considered along with the following question: http://math.stackexchange.com/questions/562332/rotating-an-n-dimensional-hyperplane -- Thanks again! – nullgeppetto Nov 11 '13 at 12:37
  • What is $S^n_{++}$? At first, I thought it was the sphere in $\mathbb{R}^n$ with positive coordinates, but then you treat $\Sigma^{-1}$ as a matrix in your formula, where $\Sigma\in S^n_{++}$. Is it the space of positive-definite $n$-dimensional matrices? – robjohn Nov 22 '13 at 15:54
  • @robjohn, $S_{++}^{n}$ denotes all the symmetric, positive definite $n\times n$ matrices with real elements. – nullgeppetto Nov 22 '13 at 15:59
  • 1
    What is $b$? I imagine that instead of $0$ in the definitions of $\Omega_+$ and $\Omega_-$, you may have meant to have $b$. – robjohn Nov 22 '13 at 16:05
  • @robjohn, oh sorry, you're right, I'll fix it.. – nullgeppetto Nov 22 '13 at 16:06

2 Answers

16

To compute the integral

$$P = \frac{1}{(2\pi)^{n/2}\sqrt{\det \Sigma}} \int_{\Omega^+} \exp \left(-\frac12 (\mathbf{x}-\mu)^T \Sigma^{-1} (\mathbf{x}-\mu)\right)\,d\mathbf{x},$$

we will need several coordinate transforms. We start with a translation to get rid of the mean, $\mathbf{y} = \mathbf{x}-\mu$. Then $\Omega^+ = \{\mathbf{x} : \mathbf{x}^T\mathbf{w}+b \geqslant 0\}$ corresponds to $\Omega_1 = \{\mathbf{y} : \mathbf{y}^T\mathbf{w} + (\mu^T\mathbf{w} + b)\geqslant 0\}$, and we have

$$P = \frac{1}{(2\pi)^{n/2}\sqrt{\det \Sigma}} \int_{\Omega_1} \exp \left(-\frac12 \mathbf{y}^T \Sigma^{-1} \mathbf{y}\right)\,d\mathbf{y}.$$

Next, since $\Sigma$ is positive definite symmetric, there is an orthogonal matrix $U$ and a diagonal matrix $D$ with positive diagonal elements such that $\Sigma = U^TDU$. Then you have $\Sigma^{-1} = (U^TDU)^{-1} = U^{-1}D^{-1}(U^T)^{-1} = U^TD^{-1}U$ since $U^T = U^{-1}$. Then let $\mathbf{z} = U\mathbf{y}$, and $\mathbf{w}_1 = U\mathbf{w}$. With $\Omega_2 = \{\mathbf{z} : \mathbf{z}^T\mathbf{w}_1 + (\mu^T\mathbf{w}+b)\geqslant 0\}$, we obtain

$$P = \frac{1}{(2\pi)^{n/2}\sqrt{\det\Sigma}} \int_{\Omega_2} \exp \left(-\frac12 \mathbf{z}^T D^{-1} \mathbf{z} \right)\,d\mathbf{z},$$

since $\lvert \det U\rvert = 1$.

Now we do a bit of rescaling. Let $\mathbf{a} = \sqrt{D^{-1}}\mathbf{z}$, or $\mathbf{z} = \sqrt{D}\mathbf{a}$, $\mathbf{w}_2 = \sqrt{D}\mathbf{w}_1$, and $\Omega_3 = \{ \mathbf{a} : \mathbf{a}^T\mathbf{w}_2 + (\mu^T\mathbf{w}+b)\geqslant 0\}$. Since $\det \sqrt{D} = \sqrt{\det\Sigma}$, that yields

$$P = \frac{1}{(2\pi)^{n/2}} \int_{\Omega_3} \exp \left(-\frac12 \mathbf{a}^T\mathbf{a}\right)\,d\mathbf{a}.$$
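As an aside, the combined substitutions so far (translation, rotation, rescaling) amount to whitening $\mathbf{x}$. A minimal numpy sketch, with an example matrix of my own, checking that the map $\mathbf{x}\mapsto \sqrt{D^{-1}}\,U(\mathbf{x}-\mu)$ indeed yields identity covariance:

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical symmetric positive definite Sigma (example only)
n = 3
A = rng.normal(size=(n, n))
Sigma = A @ A.T + n * np.eye(n)

# Sigma = U^T D U with U orthogonal and D diagonal with positive entries
eigvals, eigvecs = np.linalg.eigh(Sigma)   # Sigma = eigvecs @ diag(eigvals) @ eigvecs.T
U = eigvecs.T                              # so that Sigma = U.T @ diag(eigvals) @ U
T = np.diag(eigvals ** -0.5) @ U           # the map a = sqrt(D^{-1}) U (x - mu)

# covariance of a is T Sigma T^T, which should be the identity
print(np.allclose(T @ Sigma @ T.T, np.eye(n)))   # True
```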

Now find an orthogonal matrix $B$ such that $B\mathbf{w}_2 = \lVert\mathbf{w}_2\rVert\cdot e_n$, and let $\mathbf{b} = B\mathbf{a}$, $\Omega_4 = \{\mathbf{b} : \mathbf{b}^T(\lVert \mathbf{w}_2\rVert\cdot e_n) + (\mu^T\mathbf{w}+b)\geqslant 0\} = \{\mathbf{b} : \lVert\mathbf{w}_2\rVert b_n + (\mu^T\mathbf{w}+b)\geqslant 0\}$. That yields

$$P = \frac{1}{(2\pi)^{n/2}} \int_{\Omega_4} \exp\left(-\frac12 \mathbf{b}^T\mathbf{b}\right)\,d\mathbf{b}.$$
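For concreteness, one way (certainly not the only one) to obtain such an orthogonal matrix $B$ is a Householder reflection; a minimal numpy sketch, with an illustrative function name of my own:

```python
import numpy as np

def reflect_to_last_axis(w2):
    """Householder reflection B (orthogonal, B = B^T) with B @ w2 = ||w2|| * e_n."""
    n = len(w2)
    e_n = np.zeros(n)
    e_n[-1] = 1.0
    u = w2 - np.linalg.norm(w2) * e_n
    if np.allclose(u, 0.0):                # w2 already points along e_n
        return np.eye(n)
    u /= np.linalg.norm(u)
    return np.eye(n) - 2.0 * np.outer(u, u)

# quick check on an example vector
w2 = np.array([3.0, -1.0, 2.0])
B = reflect_to_last_axis(w2)
print(np.allclose(B @ w2, np.linalg.norm(w2) * np.array([0.0, 0.0, 1.0])))  # True
print(np.allclose(B @ B.T, np.eye(3)))                                      # True (orthogonal)
```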

Now $\Omega_4 = \mathbf{R}^{n-1}\times [c,\infty)$, with $c = -\dfrac{(\mu^T\mathbf{w}+b)}{\lVert \mathbf{w}_2\rVert}$, and hence

$$P = \frac{1}{\sqrt{2\pi}} \int_c^\infty \exp\left(-\frac12 x^2\right)\,dx,$$

which can be evaluated in terms of the error function.
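Equivalently, $P = 1-\Phi(c)$, where $\Phi$ is the standard normal cdf and, using $\lVert\mathbf{w}_2\rVert=\sqrt{\mathbf{w}^T\Sigma\mathbf{w}}$ (see the comments below), $c = -\dfrac{\mu^T\mathbf{w}+b}{\sqrt{\mathbf{w}^T\Sigma\mathbf{w}}}$. If a numerical sanity check is helpful, here is a small Monte Carlo sketch (all example values are my own):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

# hypothetical example data
n = 3
mu = np.array([1.0, -0.5, 2.0])
A = rng.normal(size=(n, n))
Sigma = A @ A.T + n * np.eye(n)            # symmetric positive definite
w = np.array([0.3, -1.2, 0.7])
b = 0.8

# closed form: P = 1 - Phi(c), with c = -(mu^T w + b) / sqrt(w^T Sigma w)
c = -(mu @ w + b) / np.sqrt(w @ Sigma @ w)
P_closed = 1.0 - norm.cdf(c)

# Monte Carlo estimate of P(x^T w + b >= 0) for x ~ N(mu, Sigma)
x = rng.multivariate_normal(mu, Sigma, size=1_000_000)
P_mc = np.mean(x @ w + b >= 0.0)

print(P_closed, P_mc)   # the two numbers should agree to roughly 3 decimal places
```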

Daniel Fischer
  • 206,697
  • What can I say?! Thanks a lot! Great answer! Could you tell me if my approach below (using the circulant matrices) is good? You get what I wanted to do with the change of variables, don't you? Thanks again! – nullgeppetto Nov 25 '13 at 21:40
  • I was afraid you'd ask. To be honest, I have no experience with circulant matrices, I can't tell you off-hand if the approach works. I have to read it very carefully to understand what's going on. Don't expect an answer to that soon. (Certainly not today, maybe tomorrow.) – Daniel Fischer Nov 25 '13 at 21:47
  • Whenever you can, of course! I'd just like to know if it's gonna work! If you have something, sometime, please let me know! Thanks again! – nullgeppetto Nov 25 '13 at 21:51
  • It would be great, also, if we could verify the same result! – nullgeppetto Nov 25 '13 at 21:54
  • I have some questions on your solution. The first is the following. During the "rescaling step", why is $\det\sqrt{D}=\sqrt{\det\Sigma}$, and why does $\sqrt{\det\Sigma}$ disappear from the fraction in front of the integral? Thanks! – nullgeppetto Nov 26 '13 at 20:05
  • 1
    We have $\det\sqrt{D}=\sqrt{\det\Sigma}$ since $\Sigma=U^TDU$, so $\det\Sigma=\det(U^T)\cdot\det D\cdot\det U=(\det U)^2\cdot\det D=\det D$, as $U$ is orthogonal, hence $\det U=\pm1$. Now $D=(\sqrt{D})^2$, whence $\det D=(\det\sqrt{D})^2$, and since everything is positive, $\det\sqrt{D}=\sqrt{\det D}=\sqrt{\det\Sigma}$. Then with the substitution $\mathbf{a}=(\sqrt{D})^{-1}\mathbf{z}$, the transformation formula in the short-cut version says $d\mathbf{z}=\lvert\det\sqrt{D}\rvert\cdot d\mathbf{a}$, and with the above, that cancels the $\sqrt{\det\Sigma}$. – Daniel Fischer Nov 26 '13 at 20:16
  • Dear @DanielFischer, just another question on this. I want to express the above integral solely in terms of $\mathbf{w}$ and $b$, so the norm $\lVert\mathbf{w}_2\rVert$ does not suit me. But unless I am wrong, it holds that $\mathbf{w}_2=D^{1/2}U\mathbf{w}$, so what does that imply for the norm of $\mathbf{w}_2$? Is it something like $\sqrt{\det D}\,\lVert\mathbf{w}\rVert$? Thanks a lot, once again! – nullgeppetto Nov 26 '13 at 22:56
  • 1
    Yes, we have $\mathbf{w}_2 = \sqrt{D}U\mathbf{w}$, but unless $\Sigma$ is a multiple of the identity, you can't unfortunately express $\lVert\mathbf{w}_2\rVert$ that easily in terms of $\mathbf{w}$. However, $\lVert\mathbf{w}_2\rVert^2 = (\sqrt{D}U\mathbf{w})^T(\sqrt{D}U\mathbf{w}) = \mathbf{w}^TU^T\sqrt{D}\sqrt{D}U\mathbf{w} = \mathbf{w}^T\Sigma\mathbf{w}$, so you can write the norm as $\sqrt{\mathbf{w}^T\Sigma\mathbf{w}}$. – Daniel Fischer Nov 26 '13 at 23:04
  • ah ok, that's convenient too (I assume)! Thanks a lot! – nullgeppetto Nov 26 '13 at 23:08
  • @DanielFischer Can we get an analytic expression of the above integral if all the non-diagonal terms of the variance-covariance matrix are set to 0 and the product of diagonal entries is equal to 1? – Akshay Bansal Nov 20 '16 at 00:40
  • @akshay I'm not sure what you count as "an analytic expression of the above integral". We'll always have something including the error function. If you mean whether we can have an explicit expression for the integral limit $c$ in terms of the components of $\mathbf{w}$ and $\mu$, $b$, and the diagonal elements of $\Sigma$, then yes. We have $c = -\dfrac{\mu^T\mathbf{w} + b}{\sqrt{\mathbf{w}^T\Sigma\mathbf{w}}}$. If $\Sigma$ is already diagonal, the denominator is particularly easy. – Daniel Fischer Nov 20 '16 at 10:31
  • nice answer. anyone have a reference for this? – tea_pea Feb 19 '21 at 13:10
0

Here is something that could help... though I would like some help with the notation, or whatever else is needed!

Let $\mathbf{x}=R^T\mathbf{y}$. Then the hyperplane can be rewritten as follows:

$$ \mathcal{H}: \mathbf{w}^TR^T\mathbf{y}+b=0. $$

We define $R$ to be the circulant matrix generated by a vector $\mathbf{r}=(r_1,\dots,r_n)^T$. Then, I think that the following holds:

$$ \mathbf{w}^TR^T = (R\mathbf{w})^T = (W\mathbf{r})^T = \mathbf{r}^T W^T, $$

where $W$ is the circulant matrix generated by the vector $\mathbf{w}$. Consequently, if we demand

$$ \mathbf{r}^T W^T = \mathbf{v}^T, $$

then $\mathbf{r}^T = \mathbf{v}^T(W^T)^{-1}=\mathbf{v}^T(W^{-1})^T$. Thus,

$$ \mathbf{r} = W^{-1}\mathbf{v}. $$

If we set the vector $\mathbf{v}$ to be, for instance, $(1,0,\dots,0)^T$, then the change of variables we made leads to a hyperplane of the following form:

$$ \mathcal{H'}: y_1=c. $$

Am I right so far?

In order to find the matrix $R$, which defines the change of variables ($\mathbf{x}=R^T\mathbf{y}$), we need to find the vector $\mathbf{r}$, which generates the circulant matrix $R$. The vector $\mathbf{r}$ can be found from the following:

$$ \mathbf{r} = W^{-1}\mathbf{v}, $$

where $W$ is known and is generated as a circulant matrix from $\mathbf{w}$.

But $W$ is a circulant matrix, so it can be expressed as follows:

$$ W = Q\Lambda Q^{-1}, $$

where $Q$ is the matrix of the eigenvectors of $W$, while $\Lambda$ is the diagonal matrix of the corresponding eigenvalues of $W$. For a circulant matrix

$$ W= \begin{bmatrix} w_0 & w_{n-1} & \dots & w_{2} & w_{1} \\ w_{1} & w_0 & w_{n-1} & & w_{2} \\ \vdots & w_{1} & w_0 & \ddots & \vdots \\ w_{n-2} & & \ddots & \ddots & w_{n-1} \\ w_{n-1} & w_{n-2} & \dots & w_{1} & w_0 \\ \end{bmatrix}, $$

the $j$-th eigenvector is given by

$$ \mathbf{q}_j = (1,\omega_j, \omega_j^2, \dots, \omega_j^{n-1})^T, \quad j=0,\dots,n-1, $$

where $\omega_j = \exp\left(\frac{2\pi i j}{n}\right)$ are the $n$-th roots of unity, and $i$ is the imaginary unit. Moreover, the corresponding $j$-th eigenvalue is given by $\lambda_j = w_0 + w_{n-1}\omega_j + \dots + w_1\omega_j^{n-1}$, $j=0,\dots,n-1$.

As a result we can find $\mathbf{r}$ as follows:

$$ \mathbf{r} = W^{-1}\mathbf{v} = Q\Lambda^{-1}Q^{-1}\mathbf{v}. $$

Thus, having the vector $\mathbf{r}$ we can construct the circulant matrix $R$, and using the change of variables $\mathbf{x}=R^T\mathbf{y}$ we can pass to a hyperplane of the form $y_1=c$, where the integration is simpler.
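If it helps, the step $\mathbf{r}=W^{-1}\mathbf{v}$ can be checked numerically by exploiting the fact that circulant matrices are diagonalised by the DFT (the matrix $Q$ above). A minimal numpy/scipy sketch, with example values of my own; note that it only verifies the linear-algebra step of recovering $\mathbf{r}$:

```python
import numpy as np
from scipy.linalg import circulant

rng = np.random.default_rng(0)

# hypothetical example: a generating vector w and the target v = (1, 0, ..., 0)^T
n = 5
w = rng.normal(size=n)
v = np.zeros(n)
v[0] = 1.0

W = circulant(w)                 # circulant matrix with first column w, as in the display above

# solve W r = v via the DFT diagonalisation of circulant matrices
r = np.fft.ifft(np.fft.fft(v) / np.fft.fft(w)).real

print(np.allclose(W @ r, v))     # True: r = W^{-1} v
```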

Am I right so far? Please give me some feedback! Thanks in advance!

nullgeppetto
  • 3,006