Imagine you have a few gears put together in some mechanism. You have a knob on the first and the last one, so you can turn those individually. If you think of a gear as a function, what a gear does is take 'input' from the gear before it and rotate according to its own size. Something like $$\text{twist of the (n-1)th gear }\stackrel{\text{nth gear}}\mapsto \text{twist of the nth gear }.$$
So, let's say I twist the first gear by some angle $\mathrm d \theta_1$. How is this related to the angle the last gear will twist? Something like this:
$$\mathrm d\theta_n = \prod_{i=2}^n \text{(individual change of the i-th gear)}\times \mathrm d \theta_1$$
And this is the chain rule, in an example that works because each gear's rate of change is just the ratio of the previous gear's size to its own. This, again, encodes the fact that you can 'cancel out' the infinitesimals to get
$$\frac{df}{dx}=\frac{d f}{dy}\frac{dy}{dx}$$
in a dumb but intuitive way.
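
If you want to see the cancellation with actual numbers, here is a minimal Python sketch. The tooth counts and the names `gear_ratios` and `last_gear_twist` are made up for illustration, and I'm assuming each gear's 'individual change' is the tooth count of the previous gear divided by its own, ignoring the fact that meshed gears also flip the direction of rotation.

```python
from math import prod

def gear_ratios(teeth):
    """Return d(theta_i)/d(theta_{i-1}) for each meshed pair.

    Assumption: the ratio is (teeth of previous gear) / (teeth of this gear),
    and the sign flip between meshed gears is ignored.
    """
    return [prev / cur for prev, cur in zip(teeth, teeth[1:])]

def last_gear_twist(teeth, d_theta_1):
    """Multiply all the individual ratios, chain-rule style."""
    return prod(gear_ratios(teeth)) * d_theta_1

# Hypothetical three-gear train: 20, 40 and 10 teeth.
teeth = [20, 40, 10]
d_theta_1 = 0.1  # small twist of the first gear, in radians

d_theta_n = last_gear_twist(teeth, d_theta_1)

# The intermediate gear 'cancels out': only the first and last sizes matter.
shortcut = teeth[0] / teeth[-1] * d_theta_1
```

Here `d_theta_n` and `shortcut` agree, because the 40 in the middle appears once in a numerator and once in a denominator, which is the same cancellation as in $\frac{df}{dy}\frac{dy}{dx}$.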