I have found an elegant proof of the monotone convergence theorem here. I reproduce the proof below. It's so simple that I'm afraid I'm missing some subtle detail. Could you please check whether my understanding is correct?
Let $( X_{n} )$ be a sequence of non-negative random variables such that $X_n \nearrow X$ a.s. Then $\mathbb{E}\left(X_{n}\right) \nearrow \mathbb{E}\left( X \right)$.
Proof 1: For all $n$, there exists a sequence $(X_{n,k})_k$ of non-negative simple random variables such that $X_{n, k} \nearrow X_n$ a.s. as $k \to \infty$. Let $$Y_{k} = \max_{n \le k} X_{n, k}, \quad k \in \mathbb N.$$
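(For concreteness, one standard choice, although the proof only uses the existence of such a sequence, is the dyadic approximation $$X_{n,k} = \min\bigl(2^{-k} \lfloor 2^k X_n \rfloor,\ k\bigr),$$ which is simple, non-negative, non-decreasing in $k$, and converges to $X_n$ pointwise.)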
Clearly, we have
$X_{n,k} \le Y_k \le X_k \le X$ a.s. for all $n \le k$ $(\star)$.
$\mathbb E (Y_k) \le \mathbb E (X_k) \le \mathbb E (X)$ for all $k$ $(\star \star)$.
$(Y_k)$ is a non-decreasing sequence of non-negative simple random variables. (Quick verifications of these three points follow the list.)
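For completeness, here is why: for $n \le k$, $X_{n,k} \le Y_k$ by the definition of the maximum, and $Y_k \le X_k$ a.s. because $X_{m,k} \le X_m \le X_k$ a.s. for every $m \le k$; moreover, $X_k \le X$ a.s. since $X_k \nearrow X$ a.s. This gives $(\star)$, and $(\star\star)$ follows by monotonicity of the expectation. Finally, $$Y_{k+1} = \max_{n \le k+1} X_{n, k+1} \ge \max_{n \le k} X_{n, k+1} \ge \max_{n \le k} X_{n, k} = Y_k,$$ where the last inequality uses that each $(X_{n,k})_k$ is non-decreasing in $k$.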
Let $Y=\lim_{k \to \infty} Y_k$. Then $\mathbb E (Y) := \lim_{k \to \infty} \mathbb E (Y_k)$, by the definition of the expectation of a non-negative random variable as the limit of expectations along a non-decreasing sequence of simple random variables converging to it.
Taking the limit $k \to \infty$ in $(\star)$, we get $$\lim_{k \to \infty} X_{n, k} \le \lim_{k \to \infty} Y_k \le \lim_{k \to \infty} X_k \le X \text{ a.s.}, \quad n \in \mathbb N,$$ and consequently $$X_{n} \le Y \le X \text{ a.s.}, \quad n \in \mathbb N \quad (\star \star \star).$$
Taking the limit $n \to \infty$ in $(\star\star\star)$, we get $X=Y$ a.s. and thus $\mathbb E(X) = \mathbb E(Y)$.
Taking the limit $k \to \infty$ in $(\star\star)$, we get $\mathbb E(Y) \le \lim_{k \to \infty} \mathbb E (X_k) \le \mathbb E (X)$. Combined with $\mathbb E(X) = \mathbb E(Y)$ from the previous step, this yields $\lim_{k \to \infty} \mathbb E(X_k) = \mathbb E(X)$. This completes the proof.
Proof 2: Clearly, $\mathbb{E}\left(X_{n}\right) \le \mathbb{E}(X)$ for all $n$ and thus $$\lim_n \mathbb{E}\left(X_{n}\right) \le \mathbb{E}(X).$$
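(Here and in the final step, I use the standard characterization of the expectation of a non-negative random variable as a supremum over dominated simple random variables, $$\mathbb{E}(Z) = \sup\bigl\{ \mathbb{E}(Y) : Y \text{ simple},\ 0 \le Y \le Z \bigr\};$$ if one instead takes the limit definition used in Proof 1, this is an equivalent characterization.)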
Take $\lambda >1$ and a non-negative simple random variable $Y$ such that $Y \le X$. Define $A_n = \{\omega \mid \lambda X_n(\omega) \ge Y(\omega)\}$. Since $(X_n)$ is non-decreasing, $A_{n} \nearrow A:= \bigcup_n A_n$.
Let us show that $\mathbb P (A) = 1$. Fix $\omega$ such that $X_n (\omega) \to X (\omega)$ as $n \to \infty$ (which is the case for a.e. $\omega$); we distinguish two cases:
If $X(\omega) = 0$, then $X_n (\omega) = Y (\omega) =0$ and thus $\omega \in A_n$ for all $n$.
If $X(\omega) > 0$, then $\varepsilon := \lambda X (\omega) - Y (\omega) \ge (\lambda - 1) X(\omega) > 0$ and there exists $m$ such that $X (\omega) - X_m (\omega)\le \varepsilon / \lambda$. Then $\lambda X_m(\omega) \ge \lambda X(\omega) - \varepsilon = Y (\omega)$, so $\omega \in A_m$. In either case $\omega \in A$, and since a.e. $\omega$ satisfies the premise, $\mathbb P(A) = 1$.
We also have $\lambda X_n \ge Y 1_{A_n}$ (by the definition of $A_n$ on $A_n$, and trivially on $A_n^c$) and thus $\lambda \mathbb E(X_n) \ge \mathbb E (Y 1_{A_n})$ for all $n$. As such, $$\lambda \lim_n \mathbb E(X_n) \ge \lim_n \mathbb E (Y 1_{A_n}) \overset{(\dagger)}{=} \mathbb E (Y \lim_n 1_{A_n}) = \mathbb E(Y 1_A) \overset{(\dagger\dagger)}{=} \mathbb E(Y),$$
where
$(\dagger)$ is because $Y$, $1_{A_n}$, and hence $Y 1_{A_n}$ are simple; the computation is spelled out after this list.
$(\dagger\dagger)$ is because $\mathbb P (A)=1$.
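Explicitly, writing $Y = \sum_{i=1}^m y_i 1_{B_i}$ with $y_i \ge 0$ and $B_i$ measurable, $(\dagger)$ follows from the continuity from below of $\mathbb P$ applied to $B_i \cap A_n \nearrow B_i \cap A$: $$\mathbb E(Y 1_{A_n}) = \sum_{i=1}^m y_i \, \mathbb P(B_i \cap A_n) \xrightarrow[n \to \infty]{} \sum_{i=1}^m y_i \, \mathbb P(B_i \cap A) = \mathbb E(Y 1_A).$$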
Taking the limit $\lambda \searrow 1$, we obtain $$\lim_n \mathbb E(X_n) \ge \mathbb E(Y).$$
This inequality holds for every non-negative simple random variable $Y$ such that $Y \le X$. Taking the supremum over all such $Y$ and using the characterization of $\mathbb E(X)$ recalled above, we get $$\lim_n \mathbb{E}\left(X_{n}\right) \ge \mathbb{E} (X).$$ This completes the proof.
Update: After a while, I have found that these two proofs carry over to the more general setting of measure theory: we just need to replace the expectation $\mathbb{E}(X)$ by the integral $\int f \, \mathrm{d} \mu$ and "a.s." by "$\mu$-a.e.".
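Explicitly, the general statement reads: if $(f_n)$ is a sequence of non-negative measurable functions on a measure space $(E, \mathcal{E}, \mu)$ such that $f_n \nearrow f$ $\mu$-a.e., then $\int f_n \, \mathrm{d}\mu \nearrow \int f \, \mathrm{d}\mu$. In Proof 2, the condition $\mathbb{P}(A) = 1$ becomes $\mu(A^c) = 0$, since $\mu$ need not be finite; the continuity from below used in $(\dagger)$ holds for every measure.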