Variable Selection for Bayesian Linear Model

Question

Consider the Bayesian linear model $y_i\sim N(x_i\beta,\sigma^2)$, $i=1,\ldots,n$, where $$\sum_{i=1}^n x_i=0, \quad \sum_{i=1}^n x_i^2 =1, \quad \sum_{i=1}^n x_i y_i=\gamma $$ The prior for $\beta$ and the dummy variable $z$ is given by $$\pi (\beta \mid z)=(1-z)\delta_0(\beta ) + zN(\beta \mid 0,\tau ^{2}) $$ $$\pi (z)=q^z(1-q)^{1-z}$$ Suppose $\sigma , \tau , q$ are all known and $\delta_0(\beta )$ is the indicator function which is $1$ when $\beta=0$ and is $0$ otherwise. So, $\beta=0$ if $z=0$ and $\beta \sim N(0,\tau^2)$ if $z=1$.

(1) Find $p(y_1,\ldots,y_n\mid z=0)$

(2) By integrating out $\beta$, find $p(y_{1},\ldots,y_{n}\mid z=1)$

(3) Hence, find $P(z=1\mid y_{1},\ldots,y_{n})$

(4) What is $P(z=1\mid y_{1},\ldots,y_{n})$ when $\gamma =0$ and $n$ is large?

(5) Under the condition in (4), give an intuitive explanation why $P(z=1\mid y_{1},\ldots,y_{n})$ takes this value when $n \to \infty$.

=======================================================================================

For (1), my answer is $N(0,\sigma^2)(1-q)$

For (2), my answer is $N(0,\tau^2)q$, but I am not sure about this one.

I am having trouble getting the answers for (3) to (5). Please help me solve these problems. Any input is welcome, and I really appreciate your time and help!

Writing \sigma ^{^{2}} instead of \sigma^2 in MathJax code is very strange (and causes $\sigma ^{^{2}}$ to appear instead of $\sigma^2$). I suspect you're using one of those software packages that writes the code for you. Those often produce code that looks like something written by a lunatic. — Michael Hardy, Jun 30 '21 at 17:31

Thomas · Answer 1 · 2021-06-30T19:45:52.707

The exercise requires some calculations.

Try using these hints and see/tell us where you find issues:

(1) When $z=0$, $\beta=0$ almost surely, so that:

$p(y_1,\ldots,y_n\mid z=0)=p(y_1,\ldots,y_n\mid \beta=0)=\prod_i N(0,\sigma^2)(y_i)$

so the factor $1-q$ that you propose does not appear. Including it would also leave the density unnormalized. The notation $N(\mu,\sigma^2)(x)$ denotes the Gaussian density evaluated at $x$.

(2) $p(y_1,\ldots,y_n\mid z=1)=\int p(y_1,\ldots,y_n\mid \beta)\, p(\beta\mid z=1)\, d\beta$,

where you have to compute the Gaussian integral analytically; both densities in the integrand on the right are known.
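For reference, one way the integral in (2) can be carried out is sketched here. It uses only the constraint $\sum_i x_i^2=1$ and the definition $\gamma=\sum_i x_i y_i$ from the question, and is worth verifying by hand rather than taking as given:

$$\begin{aligned}
p(y_1,\ldots,y_n\mid z=1) &= \int \prod_{i} N(x_i\beta,\sigma^2)(y_i)\, N(0,\tau^2)(\beta)\, d\beta \\
&= p(y_1,\ldots,y_n\mid z=0)\,\frac{1}{\sqrt{2\pi\tau^2}}\int \exp\left(\frac{\gamma\beta}{\sigma^2}-\frac{\beta^2}{2}\left(\frac{1}{\sigma^2}+\frac{1}{\tau^2}\right)\right) d\beta \\
&= p(y_1,\ldots,y_n\mid z=0)\,\sqrt{\frac{\sigma^2}{\sigma^2+\tau^2}}\,\exp\left(\frac{\gamma^2\tau^2}{2\sigma^2(\sigma^2+\tau^2)}\right)
\end{aligned}$$

The second line comes from expanding $\sum_i (y_i-x_i\beta)^2=\sum_i y_i^2-2\gamma\beta+\beta^2$, and the third from completing the square in $\beta$.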

(3) Using Bayes:

$$p(z=1\mid y_1,\ldots,y_n)=\frac{p(y_1,\ldots,y_n\mid z=1)\,p(z=1)}{p(y_1,\ldots,y_n\mid z=1)\,p(z=1)+p(y_1,\ldots,y_n\mid z=0)\,p(z=0)}$$

and now you can plug in results of previous points into this expression.
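If you want to check your algebra numerically, a minimal Python sketch along these lines may help. It assumes that the Gaussian integral in (2) works out to $p(y\mid z=0)\sqrt{\sigma^2/(\sigma^2+\tau^2)}\exp\left(\gamma^2\tau^2/(2\sigma^2(\sigma^2+\tau^2))\right)$ and compares that ratio with a brute-force trapezoidal integration over $\beta$. The function names, the simulated data, and the parameter values are all made up for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Gaussian density N(mean, var) evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def bayes_factor_closed(gamma, sigma2, tau2):
    """Assumed closed form of p(y | z=1) / p(y | z=0); relies on sum x_i^2 = 1."""
    shrink = sigma2 / (sigma2 + tau2)
    return math.sqrt(shrink) * math.exp(gamma ** 2 * tau2 / (2.0 * sigma2 * (sigma2 + tau2)))


def bayes_factor_numeric(x, y, sigma2, tau2, half_width=12.0, steps=4001):
    """Same ratio by trapezoidal integration over beta (brute-force check)."""
    tau = math.sqrt(tau2)
    lo, hi = -half_width * tau, half_width * tau
    h = (hi - lo) / (steps - 1)
    total = 0.0
    for k in range(steps):
        beta = lo + k * h
        # log of prod_i N(y_i | x_i * beta, sigma2) / N(y_i | 0, sigma2)
        log_ratio = sum(yi ** 2 - (yi - xi * beta) ** 2 for xi, yi in zip(x, y)) / (2.0 * sigma2)
        weight = 0.5 if k in (0, steps - 1) else 1.0
        total += weight * math.exp(log_ratio) * normal_pdf(beta, 0.0, tau2)
    return total * h


def posterior_z1(gamma, sigma2, tau2, q):
    """Bayes' rule for P(z=1 | y), written in terms of the Bayes factor."""
    b = bayes_factor_closed(gamma, sigma2, tau2)
    return q * b / (q * b + (1.0 - q))


# Illustrative data: x is centred and scaled so that sum x_i = 0 and sum x_i^2 = 1.
random.seed(0)
n, sigma2, tau2, q = 50, 1.0, 4.0, 0.3
raw = [random.gauss(0.0, 1.0) for _ in range(n)]
mean_raw = sum(raw) / n
centred = [r - mean_raw for r in raw]
norm = math.sqrt(sum(c ** 2 for c in centred))
x = [c / norm for c in centred]
y = [2.0 * xi + random.gauss(0.0, math.sqrt(sigma2)) for xi in x]
gamma = sum(xi * yi for xi, yi in zip(x, y))

bf_closed = bayes_factor_closed(gamma, sigma2, tau2)
bf_numeric = bayes_factor_numeric(x, y, sigma2, tau2)
print(bf_closed, bf_numeric, posterior_z1(gamma, sigma2, tau2, q))
```

Under this assumed closed form, setting `gamma = 0` makes the ratio equal to $\sigma/\sqrt{\sigma^2+\tau^2}<1$, so `posterior_z1` returns a value below `q`, and $n$ does not enter the expression at all because of the normalization $\sum_i x_i^2=1$.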

Once you arrive here with the calculations, I guess that then you can tackle points 4 and 5.

score 0 · Answer 2 · edited Jul 01 '21 at 07:20

COMMENT: $N(0,\tau^2)$ is a function of $\beta$ (can you write it explicitly in exponential form? Remember that $0$ and $\tau^2$ are the parameters of the distribution, and $\beta$ is the point at which you evaluate it). Think of it in terms of the first equation you wrote: it is the conditional distribution of $\beta$ when $z=1$, and therefore depends on $\beta$. That is why you cannot take it out of the integral, but have to perform the integral explicitly.

Since $y\sim N(\beta ,\sigma ^2)$, is $\int p(y\mid\beta )\,d\beta=1$? If so, is the answer for (2) $\prod _{i}N(0,\tau^2)$?

COMMENT: The normalization does not work like that. The correct normalization condition is $\int p(y\mid\beta )\,dy=1$, although it is not very useful here. Further, as noted in the previous comment, the density is evaluated at $\beta$, so your answer has no dependence on $y$; it therefore cannot be right, even from a qualitative point of view.

I added some comments in bold in your answer, I hope it is ok for you. If you find my comments useful and that the post requires a fix, remove my comments and apply the fix. Otherwise you can also remove them directly ;) — Thomas, Jul 01 '21 at 07:11
