I am stuck on a small issue (small compared to the others I am still stuck on) in generalizing the confidence interval (CI) for a well-behaved (i.e., normally distributed) sampling distribution.
If my normal approximation is described by $x$ (the sample mean $\overline{X}$, with standard error $\frac{\sigma}{\sqrt{n}}$), I have
$$ Pr(x-1.96\dfrac{\sigma}{\sqrt{n}} \leq \mu \leq x + 1.96\dfrac{\sigma}{\sqrt{n}}) = 0.95 \tag{1} $$
Now, a left-tail probability area of 0.025 in the standard normal distribution $Z$ corresponds to $Z = -1.96$.
So $$ \begin{aligned} Z_{0.025} &= -1.96 \\ Z_{\frac{0.05}{2}} &= -1.96 \end{aligned} $$
With $\alpha$ as the significance level, $1-\alpha = 0.95$ gives $\alpha = 0.05$, so
$$ Z_{\frac{\alpha}{2}} = -1.96 \tag{2} $$
Substituting $(2)$ into $(1)$, we get
$$ Pr(x+Z_{\frac{\alpha}{2}}\dfrac{\sigma}{\sqrt{n}} \leq \mu \leq x - Z_{\frac{\alpha}{2}}\dfrac{\sigma}{\sqrt{n}}) = 0.95 \tag{3} $$
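As a quick numerical sanity check of $(3)$ under the left-tail reading, here is a minimal Python sketch. The values of `x_bar`, `sigma`, and `n` are made up purely for illustration; only `statistics.NormalDist` from the standard library is used.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical numbers, chosen only for illustration.
x_bar = 50.0   # observed sample mean (the x in the equations)
sigma = 10.0   # population standard deviation
n = 25         # sample size
alpha = 0.05

# Left-tail quantile: P(Z <= z_left) = alpha / 2, so z_left is negative.
z_left = NormalDist().inv_cdf(alpha / 2)

se = sigma / sqrt(n)          # standard error of the mean
lower = x_bar + z_left * se   # lower limit as written in (3)
upper = x_bar - z_left * se   # upper limit as written in (3)

print(round(z_left, 2))                    # -1.96
print(round(lower, 2), round(upper, 2))    # 46.08 53.92
```

With a negative $Z_{\frac{\alpha}{2}}$, the limits in $(3)$ come out in the right order and match $x \mp 1.96\frac{\sigma}{\sqrt{n}}$ from $(1)$.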
But books define it the other way. For example, "Probability and Statistical Inference" by Hogg et al. gives the following (page 310):
$$ Pr(x-Z_{\frac{\alpha}{2}}\dfrac{\sigma}{\sqrt{n}} \leq \mu \leq x + Z_{\frac{\alpha}{2}}\dfrac{\sigma}{\sqrt{n}}) = 0.95 \tag{4} $$
which implies
$$ Z_{\frac{\alpha}{2}} = 1.96 \tag{5} $$
which is not true under the left-tail reading. The cumulative area to the left of $Z = 1.96$ is $0.975$. That is,
$$ Z_{1-\frac{\alpha}{2}} = 1.96 \tag{6} $$
So what am I missing?
My take for now:
It seems the book authors (and many others) use the alternative convention, in which the subscript denotes the right-tail (upper-tail) area under the normal curve. Then $Z_{0.025} = 1.96$ and everything falls into place, I suppose. So far I have been used to reading the subscript as a left-tail probability, which I thought was the convention.
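To compare the two readings side by side, here is a small sketch using Python's standard-library `statistics.NormalDist`. The variable names `z_lower` and `z_upper` are my own labels for the left-tail and right-tail conventions, not notation from the book.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1
alpha = 0.05

# Left-tail (lower) convention: z_p satisfies P(Z <= z_p) = p.
z_lower = std_normal.inv_cdf(alpha / 2)

# Right-tail (upper) convention: z_p satisfies P(Z >= z_p) = p,
# which is the left-tail quantile at 1 - p.
z_upper = std_normal.inv_cdf(1 - alpha / 2)

print(round(z_lower, 2), round(z_upper, 2))   # -1.96 1.96
print(round(std_normal.cdf(1.96), 3))         # 0.975
```

By symmetry of the normal curve the two values differ only in sign, so $(3)$ with the left-tail value and $(4)$ with the right-tail value would describe the same interval.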