According to this page in the "Encyclopedia of Mathematics", Borel's strong law of large numbers can be stated as follows.
"Consider independent random variables $X_1,\dots,X_n,\dots$ which are identically distributed and assume one of two values 0 and 1 with probability of 1/2 each; the expression $S_n=\sum_{k=1}^n{X_k}$ will then give the number of successful trials in a Bernoulli scheme in which the probability of success is 1/2. Borel [B] showed that
$ \frac{S_n}{n} \to \frac{1}{2} $
with probability one as $n \to \infty$. "
Question 1:
Does the last sentence in the above statement mean (precisely) that the sequence $\frac{S_n}{n}$ converges "almost surely" (with probability 1) to $\frac{1}{2}$, in the following sense?
$P( \lim\limits_{n \to \infty}{ \frac{S_n}{n} } = \frac{1}{2} ) = 1 $
namely,
$ P\left( \forall \epsilon \gt 0, \ \exists \, n_0 \in \mathbb{N} \ \text{ s.t. } \ \left| \frac{S_n}{n} - \frac{1}{2} \right| \lt \epsilon, \ \forall n \ge n_0 \right) = 1 $
where $P$ denotes the probability.
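To make the statement above concrete, here is a minimal Python sketch that simulates one path of fair coin flips and reports $\sup_{n \ge n_0} | \frac{S_n}{n} - \frac{1}{2} |$ over a finite horizon. The function names `running_means` and `tail_sup_deviation`, the seed, and the horizon of 100,000 flips are my own illustrative choices, not anything from the encyclopedia page. A finite simulation of a single path can only suggest the almost sure statement, not prove it.

```python
import random


def running_means(n_trials, seed=0):
    """Return [S_1/1, S_2/2, ..., S_n/n] for one simulated path of fair coin flips."""
    rng = random.Random(seed)  # fixed seed so the path is reproducible
    s = 0
    means = []
    for n in range(1, n_trials + 1):
        s += rng.randint(0, 1)  # X_n is 0 or 1, each with probability 1/2
        means.append(s / n)
    return means


def tail_sup_deviation(means, n0):
    """Largest |S_n/n - 1/2| over n0 <= n <= len(means) (finite horizon only)."""
    return max(abs(m - 0.5) for m in means[n0 - 1:])


if __name__ == "__main__":
    means = running_means(100_000, seed=0)
    for n0 in (10, 100, 1_000, 10_000):
        # For a fixed path, this tail supremum can only shrink as n0 grows.
        print(n0, tail_sup_deviation(means, n0))
```

For a fixed path, the printed tail supremum is non-increasing in $n_0$; the almost sure statement says that, on a set of paths of probability 1, it eventually drops below any given $\epsilon$.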
Question 2:
On the other hand, the definition of "almost sure convergence" doesn't make sense to me, because for an event $E$, the notation "$P(E) = 1$" means that the event $E$ is certain. So we should be able to drop "$P() = 1$", and therefore
$P(E) = 1 \Leftrightarrow E$
which is confusing to me. I can see that if we had the notation "$\lim\limits_{n \to \infty}{P(E(n)) = 1}$", then we wouldn't be able to drop the "$P() = 1$" inside the limit, since this would be the definition of an asymptotic probability.
I would appreciate any help in understanding these basic concepts.
EDIT NOTE: I have edited and split the text into 2 questions.
==========================
RE-EDIT: adding question 3.
Question 3:
If my understanding in question 1 is accurate (see above), why is the following result by Khinchin the stronger one? From the same page in the "Encyclopedia of Mathematics",
$\dots$ after which (1922) the stronger result:
$ P\left( \limsup\limits_{n \to \infty} \frac{ \left| S_n - \frac{n}{2} \right| }{ \sqrt{n \log\log n} } = \frac{1}{\sqrt{2}} \right) = 1 $
was proved by A.Ya. Khinchin.
My reasoning is as follows. Since both theorems state a limit with probability 1 (almost sure convergence), I could drop the notation "$P()$" and compare only the statements inside the "$P()$". Borel's theorem states that every $\epsilon \gt 0$ is a bound on $| \frac{S_n}{n} - \frac{1}{2} |$ for all $n \ge n_0$, for some $n_0$ that exists. But the divergent function $f(n) \equiv n \log\log n$ in Khinchin's theorem can be much bigger than the given $\epsilon$ when $n \ge n_0$. So the limit above, with the divergent function $f(n) \equiv n \log\log n$, doesn't seem to be the stronger result at all.
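To compare the two quantities numerically, here is a rough Python sketch that tracks both along one simulated path. It assumes the Khinchin normalisation is $| S_n - \frac{n}{2} | / \sqrt{n \log\log n}$, which is how the law of the iterated logarithm is usually written for this Bernoulli scheme. The function name `deviations`, the checkpoints, and the seed are hypothetical choices for illustration only. Under that assumption, dividing by $n$ gives the purely algebraic identity $| \frac{S_n}{n} - \frac{1}{2} | = \frac{ | S_n - n/2 | }{ \sqrt{n \log\log n} } \cdot \sqrt{ \frac{\log\log n}{n} }$, which the code checks at each checkpoint.

```python
import math
import random


def deviations(n_trials, checkpoints, seed=0):
    """Return (n, borel, khinchin) at each checkpoint n for one simulated path.

    borel    = |S_n/n - 1/2|
    khinchin = |S_n - n/2| / sqrt(n log log n)   (assumed normalisation)
    Checkpoints must be >= 3 so that log(log(n)) is positive.
    """
    rng = random.Random(seed)  # fixed seed so the path is reproducible
    wanted = set(checkpoints)
    s = 0
    rows = []
    for n in range(1, n_trials + 1):
        s += rng.randint(0, 1)  # fair coin flip: 0 or 1
        if n in wanted:
            borel = abs(s / n - 0.5)
            khinchin = abs(s - n / 2) / math.sqrt(n * math.log(math.log(n)))
            rows.append((n, borel, khinchin))
    return rows


if __name__ == "__main__":
    for n, borel, khinchin in deviations(100_000, [10, 100, 1_000, 10_000, 100_000]):
        # Same deviation, rescaled: borel == khinchin * sqrt(log log n / n)
        rescaled = khinchin * math.sqrt(math.log(math.log(n)) / n)
        print(n, borel, khinchin, rescaled)
```

A single finite path says nothing rigorous about the $\limsup$; the sketch only shows how the two normalisations of the same deviation relate to each other.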