I have a question regarding the following definition of an ergodic map from my lecture notes.
Let $(X,\mathcal{M},\mu)$ be a probability space and let $\varphi\colon X\to X$ be a measure-preserving map. A function $f\in L^p(X,\mu)$ is said to be invariant if $f(x)=f(\varphi(x))$ for $\mu$-almost every $x\in X.$ We note that the following conditions are equivalent.
1. $f\in L^1(X,\mu)$ invariant $\Longrightarrow$ $f$ constant $\mu$-a.e.
2. $f\in L^2(X,\mu)$ invariant $\Longrightarrow$ $f$ constant $\mu$-a.e.
3. $S\in \mathcal{M}$ invariant $\Longrightarrow$ $\mu(S)=0\text{ or }\mu(S)=1.$
Here we say that $S\in \mathcal{M}$ is invariant iff $$\mu(\varphi^{-1}(S)\Delta S)=0$$ where $A\Delta B:=(A\setminus B)\cup (B\setminus A).$
To see the equivalence, note that if $f\in L^1(X,\mu)$ is invariant, then all the sets $S_{\lambda}:=\{x\in X\mid f(x)>\lambda\}$ are invariant, so $(3)\Longrightarrow (1).$ It is clear that $(1)\Longrightarrow (2)\Longrightarrow (3).$ A measure-preserving map satisfying the above equivalent conditions is said to be ergodic.
My question: I can see why $(1)\Longrightarrow (2)\Longrightarrow (3),$ but I am a bit confused about the argument for $(3)\Longrightarrow (1).$ In particular, how does one show that the sets $S_{\lambda}:=\{x\in X\mid f(x)>\lambda\}$ are invariant, and why does their invariance give $(3)\Longrightarrow (1)?$
I wasn't able to come up with an explanation that convinces me, so any help would be appreciated.
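
A sketch of how the step in question might be filled in, offered as one possible route rather than the argument intended in the notes. It assumes $f$ is real-valued (a complex-valued $f$ could presumably be handled through its real and imaginary parts). The null set $N$ and the constant $c$ are notation introduced here, not taken from the notes.

Invariance of $S_{\lambda}$: let $N:=\{x\in X\mid f(x)\neq f(\varphi(x))\},$ so that $\mu(N)=0$ because $f$ is invariant. Then

$$\begin{aligned}
\varphi^{-1}(S_{\lambda}) &= \{x\in X\mid \varphi(x)\in S_{\lambda}\} = \{x\in X\mid f(\varphi(x))>\lambda\},\\
x\in \varphi^{-1}(S_{\lambda})\Delta S_{\lambda} &\Longrightarrow \text{exactly one of } f(\varphi(x))>\lambda,\ f(x)>\lambda \text{ holds} \Longrightarrow x\in N,\\
\mu(\varphi^{-1}(S_{\lambda})\Delta S_{\lambda}) &\leq \mu(N)=0.
\end{aligned}$$

From invariant sets to a constant function: by $(3),$ $\mu(S_{\lambda})\in\{0,1\}$ for every $\lambda\in\mathbb{R}.$ The map $\lambda\mapsto\mu(S_{\lambda})$ is non-increasing, and since $f$ is finite $\mu$-a.e. it tends to $1$ as $\lambda\to-\infty$ and to $0$ as $\lambda\to+\infty.$ Hence $c:=\sup\{\lambda\in\mathbb{R}\mid \mu(S_{\lambda})=1\}$ is a finite real number, and

$$\begin{aligned}
\mu\big(\{f>c-\tfrac{1}{n}\}\big)=1 \text{ for all } n\geq 1 &\Longrightarrow f\geq c \quad \mu\text{-a.e.},\\
\mu\big(\{f>c+\tfrac{1}{n}\}\big)=0 \text{ for all } n\geq 1 &\Longrightarrow f\leq c \quad \mu\text{-a.e.},
\end{aligned}$$

where each implication uses that a countable intersection of sets of full measure has full measure. Together these give $f=c$ $\mu$-a.e., which is $(1).$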