I have recently started learning a bit more about measure theory, but I've been stuck on the definition of measurable functions. I'm comfortable with the formal definition, which says that a function $f:X\to Y$ is measurable if the pre-image of every measurable set is measurable. What I don't understand is why this definition was chosen, i.e. the "intuition" behind what being measurable means.
I haven't learned about $\sigma$-algebras due to the book I am using, but I'm aware that measurable functions preserve the structure of the measure spaces. In that case, I would like to know why pre-images do the trick and not images. If I wanted to know whether $f$ preserved the structure, my first idea would be to check that measurable sets are mapped to measurable sets, not to look at pre-images.
Continuity has almost the same definition. However, that definition comes from generalizing the $\epsilon$-$\delta$ definition of continuity from analysis/metric spaces. Therefore, I don't think the same rationale can be used to explain why we use pre-images to define measurable functions.
I have read through a fair number of StackExchange answers on the topic, and some responses clarified why this definition is useful. For one, if $Y$ does not have a measure and $X$ has $\mu$, then we can push $\mu$ forward along $f$ to get the measure $\mu\circ f^{-1}$ on $Y$. However, this issue doesn't arise when both spaces already carry a measure. A second helpful thread explained that measurability is necessary to define the Lebesgue integral.
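To make the first point concrete, here is how I understand that construction. This is only a sketch: the notation $f_*\mu$ is my own choice and does not come from the threads I read. For a measurable set $B \subseteq Y$, define

$$(f_*\mu)(B) := \mu\big(f^{-1}(B)\big),$$

which only makes sense if $f^{-1}(B)$ is measurable in $X$, i.e. if $f$ is measurable. Countable additivity then seems to follow because pre-images commute with unions and preserve disjointness: for pairwise disjoint measurable sets $B_1, B_2, \dots \subseteq Y$,

$$(f_*\mu)\Big(\bigcup_{n} B_n\Big) = \mu\Big(\bigcup_{n} f^{-1}(B_n)\Big) = \sum_{n} \mu\big(f^{-1}(B_n)\big) = \sum_{n} (f_*\mu)(B_n).$$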
Taken together, is that all there is to it? Is measurability defined this way just so that we can pull back sets along real-valued functions and properly define Lebesgue integration? Any insights or alternative perspectives would be welcome.