I'm sure there is a historically accurate answer for why the pioneers of the theory developed things the way they did, but I will give you my subjective answer, together with some intuition about differential geometry in general.
The whole idea behind manifolds is that we want to pretend, at least locally, that we are working in $\mathbb{R}^n$. Working in general abstract spaces is hard; working in $\mathbb{R}^n$ is easier. Everything else is an attempt to make this rigorous and easy to work with. The chart $(U,\phi)$ is just a way of saying "take that (local) piece of your manifold, and deform it (homeomorphically, diffeomorphically, in any way that preserves the structure that you are interested in) into $\mathbb{R}^n$ or some local piece of it". This allows you to say "I don't know how to work in a general weird space, but at least now I can turn it into $\mathbb{R}^n$, work my magic in $\mathbb{R}^n$, and then go back." This is EXACTLY analogous to a choice of basis in linear algebra, except sometimes we can only do it locally, not globally.
Now, to answer your question about transition maps, those play EXACTLY the same role in differential geometry as change of basis matrices do in linear algebra. Recall we said that a chart $(U,\phi)$ is simply a (structure-preserving) way of turning the piece $U$ of my manifold into $\mathbb{R}^n$. Now, for whatever reason, you might be interested in mathematical objects that locally "look the same". By that I mean, if I do some math on $U$, I would like to be able to translate it into math on another chart domain $V$ that overlaps $U$, exactly as, having chosen a basis $B$ for my vector space, I would like to translate my result to another choice of basis $B'$. The transition maps tell you exactly how to achieve this. In fact, the same trick is not limited to a single manifold: a map between two distinct manifolds can be read off in charts in the same way. Why do we care? Because it turns out that it is important for some applications, the same as why we care about change of basis matrices. For other applications, for example some parts of algebraic geometry, we don't care about translating our math from one local piece to another, so we don't enforce transition maps.
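To make the analogy concrete, here is how it is usually written. The second chart $(V,\psi)$, the vector space $W$, and the coordinate maps $[\cdot]_B$ are my own notation for illustration, not something taken from your question. Given two charts $(U,\phi)$ and $(V,\psi)$ with $U\cap V\neq\emptyset$, the transition map is

$$\psi\circ\phi^{-1} : \phi(U\cap V)\to\psi(U\cap V), \qquad x\mapsto\psi\big(\phi^{-1}(x)\big).$$

It goes from one piece of $\mathbb{R}^n$ to another: undo the first set of coordinates, then apply the second. In linear algebra, a basis $B$ of an $n$-dimensional space $W$ gives a coordinate map $[\cdot]_B : W\to\mathbb{R}^n$, and the same composition with a second basis $B'$ is

$$[\cdot]_{B'}\circ[\cdot]_B^{-1} : \mathbb{R}^n\to\mathbb{R}^n, \qquad [w]_{B'} = P\,[w]_B,$$

where $P$ is the change of basis matrix. The only differences are that $P$ is linear and defined on all of $\mathbb{R}^n$, while a transition map is in general nonlinear and only defined on the image of the overlap.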
Here is an example of an application. Let $M$ be the manifold of $2\times 2$ real rotation matrices. Clearly, as a manifold, $M$ is just the circle: to every rotation matrix we can associate the point on the circle that has the same angle, and vice versa. Thus, if I have lots of calculations to make, for example if I am designing a graphics-heavy game with tons of rotations to compute, it is clearly redundant and inefficient to do them in matrix form. Instead, I compute transition functions, turn whatever matrix I'm working on for this calculation into a single real number, do very easy and efficient math on the real number, and transition back. This example is about computational applications, but there are tons of other analytical reasons too, and you will get more familiar with them as you read more.
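A minimal sketch of the rotation example in Python, in case it helps to see it run. All the function names here (`matrix_to_angle`, `angle_to_matrix`, `compose_angles`, `matmul2`) are hypothetical helpers I made up for illustration, and I am assuming the usual convention that rotation by $t$ is the matrix with rows $(\cos t, -\sin t)$ and $(\sin t, \cos t)$. It composes two rotations once by multiplying matrices and once by adding angles, so you can check that both routes agree.

```python
import math

def matrix_to_angle(m):
    """Send a 2x2 rotation matrix [[c, -s], [s, c]] to an angle in [-pi, pi]."""
    return math.atan2(m[1][0], m[0][0])

def angle_to_matrix(t):
    """Go back: send an angle t to the matrix of rotation by t."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s], [s, c]]

def compose_angles(*angles):
    """Compose rotations in coordinates: add the angles, then wrap into [-pi, pi]."""
    total = math.fsum(angles)
    return math.atan2(math.sin(total), math.cos(total))

def matmul2(a, b):
    """Plain 2x2 matrix product, used only to compare against the angle route."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Compose the same two rotations both ways.
a, b = angle_to_matrix(0.3), angle_to_matrix(1.1)
via_matrices = matmul2(a, b)
via_angles = angle_to_matrix(compose_angles(matrix_to_angle(a), matrix_to_angle(b)))
```

Up to floating-point error, `via_matrices` and `via_angles` have the same entries. Note that converting back and forth has its own cost (`atan2`, `cos`, `sin`), so any gain comes from doing many compositions on the angles between conversions, not from a single one.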