Understanding measurable functions and their definition based on pre-images?

Question

I have recently started learning a bit more about measure theory, but I've been stuck on the definition of measurable functions. I'm comfortable with the formal definition that says a function $f:X\to Y$ is measurable if the pre-image of any measurable set is measurable. What I don't understand is why this definition has been chosen, i.e. the "intuition" on the meaning of being measurable.

I haven't learned about $\sigma$-algebras due to the book I am using, but I'm aware that measurable functions preserve the structure of measure spaces. Given that, I would like to know why pre-images do the trick and not images. If I wanted to check that $f$ preserves the structure, my first idea would be to make sure that measurable sets are mapped to measurable sets, not to look at pre-images.

Continuity has almost the same definition. However, that definition comes from generalizing the $\epsilon$-$\delta$ definition of continuity from analysis/metric spaces. Therefore, I don't think the same rationale can be used to explain why we use pre-images to define measurable functions.

I have read through a fair number of StackExchange answers on the topic, and some responses clarified why this definition is useful. For one, if $Y$ does not have a measure and $X$ has $\mu$, then we can push $\mu$ forward along $f$ to get the measure $\mu\circ f^{-1}$ on $Y$. However, this issue doesn't arise when both spaces already carry a measure. The second thread that helped explained that measurability is necessary for defining the Lebesgue integral.

Taken together, is that all there is to it? Is this defined so that we can pull back real functions to properly define Lebesgue integration? Any insights or alternative perspectives would be welcome.

Maybe it is because we used to have only the Riemann integral, which only handles bounded functions that are continuous almost everywhere. When developing measure theory, we wanted to enlarge the class of integrable functions beyond those. Defining measurability via pre-images makes continuous functions measurable, and it resembles the definition of continuity all the same. But this is not a complete answer — Norse, Jan 08 '20 at 00:45
If you haven't learned about $\sigma$-algebras, then that is your first step towards understanding the definition of measurable functions. Folland's Real Analysis gives a good treatment of this material. — Math1000, Jan 08 '20 at 01:22

score 2 · Accepted Answer · answered Jan 08 '20 at 03:29

The best intuition might come from the applications of measure theory to probability. In probability theory, you take a measure space $(\Omega, \mathcal{A}, P)$ such that $P(\Omega) = 1$. You can think of $\Omega$ as the set of all possible worlds. $P$ is a probability measure that specifies the probability of any measurable subset of possible worlds.

A random variable is then defined as a measurable function $X : \Omega \rightarrow \mathbb{R}$. That is: as an argument, it takes whatever possible world is the case, and tells us one number about the world.

For simplicity, think of it as a coin flip. So, there's some set of possible worlds $A \in \mathcal{A}$ such that $X(\omega) = 1$ for all $\omega \in A$; these are the possible worlds where the coin lands heads. Then $A^c$ is the set of all possible worlds where the coin lands tails, where, say, $X(\omega) = 0$.
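
To make the coin flip concrete, here is one way to write it out. This is only a sketch, and it assumes (the value is an arbitrary choice) that tails is encoded as $0$:

$$X(\omega) = \begin{cases} 1, & \omega \in A \\ 0, & \omega \in A^c \end{cases}$$

Under that assumption, the pre-image of any set $B \subseteq \mathbb{R}$ depends only on which of the two values $B$ contains:

$$X^{-1}(B) = \begin{cases} \emptyset, & 0 \notin B,\ 1 \notin B \\ A, & 1 \in B,\ 0 \notin B \\ A^c, & 0 \in B,\ 1 \notin B \\ \Omega, & 0 \in B,\ 1 \in B \end{cases}$$

All four of these sets lie in $\mathcal{A}$, so this $X$ is measurable, and $P$ can be evaluated on every pre-image.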

Now, we want to talk about the probability that this coin lands heads. However, in our construction of probability, we only really have a probability measure on $\Omega$. How do we state the probability that the coin landed heads? We look at $P(X^{-1}(\{1\}))$, the probability of the set of worlds that $X$ sends to $1$, which is exactly $P(A)$.

This is why you'd want the inverse images to be measurable: you want to define probability distributions of random variables, and you do so based on the probability measure on this underlying probability space $\Omega$.
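
A small finite example may help here. The sketch below is purely illustrative: the two-flip sample space, the names `OMEGA`, `SIGMA_ALGEBRA`, `first_flip`, `second_flip`, and the helper functions are all assumptions made up for this example. The $\sigma$-algebra only "sees" the first flip, and $P$ is defined only on its four sets. A function of the first flip has pre-images inside the $\sigma$-algebra, so its distribution can be computed; a function of the second flip does not, so $P$ has nothing to say about it.

```python
from fractions import Fraction

# Hypothetical sample space: outcomes of two coin flips.
OMEGA = frozenset({"HH", "HT", "TH", "TT"})

# A sigma-algebra that only distinguishes the result of the first flip.
FIRST_HEADS = frozenset({"HH", "HT"})
FIRST_TAILS = OMEGA - FIRST_HEADS
SIGMA_ALGEBRA = {frozenset(), FIRST_HEADS, FIRST_TAILS, OMEGA}

# The probability measure is defined ONLY on sets in the sigma-algebra.
P = {
    frozenset(): Fraction(0),
    FIRST_HEADS: Fraction(1, 2),
    FIRST_TAILS: Fraction(1, 2),
    OMEGA: Fraction(1),
}


def preimage(f, target):
    """Return the set of outcomes that f sends into `target`."""
    return frozenset(w for w in OMEGA if f(w) in target)


def is_measurable(f, values):
    """Check measurability of f, where `values` lists every value f can take.

    With finitely many values it is enough to check singletons, because
    pre-images commute with unions and a sigma-algebra is closed under them.
    """
    return all(preimage(f, {v}) in SIGMA_ALGEBRA for v in values)


def law(f, values):
    """Distribution of f: v -> P(f^{-1}({v})). Fails if f is not measurable."""
    return {v: P[preimage(f, {v})] for v in values}


def first_flip(w):
    return 1 if w[0] == "H" else 0


def second_flip(w):
    return 1 if w[1] == "H" else 0


print(is_measurable(first_flip, [0, 1]))   # pre-images are FIRST_HEADS / FIRST_TAILS
print(is_measurable(second_flip, [0, 1]))  # {"HH", "TH"} is not in SIGMA_ALGEBRA
print(law(first_flip, [0, 1]))
```

Calling `law(second_flip, [0, 1])` raises a `KeyError` in this sketch, which mirrors the point above: without measurable pre-images there is no set on which to evaluate $P$.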

Hopefully that provides some intuition!

That makes a lot of sense, thank you! Connecting the definition back to probability was really useful for me. The book I'm using is for functional analysis, but I've taken probability so your explanation made sense to me. I've used the fact that $P(X=a) = P(\{\omega\in \Omega \mid X(\omega) = a\})$ all the time, but it didn't click for me that an event is the pre-image of a set under a random variable. Of course we want pre-images under random variables to be measurable! The motivation behind this definition makes sense to me, thank you! — Noah M, Jan 08 '20 at 19:14
