The key intuition, first of all, is that the product of two tiny differences is negligible. You can intuit this just by doing computations:
$$3.000001 \cdot 2.0001 = 6.0003020001$$
If we were rounding hand computations at all, we'd likely round away that $0.0000000001$ part. If you were doing computations to eight significant digits, a value $v$ is really a value somewhere in the range $v\left(1 \pm 10^{-8}\right)$, and the error when you multiply $v_1$ by $v_2$ comes almost entirely from the first-order terms, at most about $2\cdot 10^{-8}|v_1v_2|$. The other part of the error, of order $10^{-16}|v_1v_2|$, is so tiny you'd probably ignore it.
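As a quick check of the computation above, here is a minimal Python sketch. It uses exact fractions (my choice, so that floating-point rounding does not hide the tiny term) and the variable names are purely illustrative.

```python
from fractions import Fraction

# Exact arithmetic, so rounding cannot swallow the tiny cross term.
a, da = Fraction(3), Fraction(1, 10**6)   # 3 + 0.000001
b, db = Fraction(2), Fraction(1, 10**4)   # 2 + 0.0001

product = (a + da) * (b + db)             # the full product
first_order = a * b + a * db + b * da     # everything except da*db
cross_term = da * db                      # product of the two tiny differences

print(float(product))
print(float(first_order))
print(float(cross_term))
```

The full product and the first-order part differ by exactly `cross_term`, which is the $0.0000000001$ we would round away.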
Case: $f(x)=x^2$
Now, consider a square with corners $(0,0), (0,x), (x,0), (x,x)$. Grow $x$ a little bit, and you see the area grows by an amount proportional to the length of two of the edges, plus a tiny little square. That tiny square is negligible.
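The square picture can be sketched numerically. The helper `area_growth` below is a hypothetical name of mine; it splits the growth into the two edge strips and the corner square, and the last printed column shows how quickly the corner becomes irrelevant.

```python
def area_growth(x, dx):
    """Split the growth of an x-by-x square into two strips and a corner."""
    strips = 2 * x * dx   # two x-by-dx rectangles along two of the edges
    corner = dx * dx      # the tiny dx-by-dx square
    return strips, corner

x = 3.0
for dx in (0.1, 0.01, 0.001):
    strips, corner = area_growth(x, dx)
    total = (x + dx) ** 2 - x ** 2        # actual change in area
    print(dx, total, strips, corner, corner / strips)
```

The ratio `corner / strips` equals `dx / (2 * x)`, so it shrinks in proportion to `dx`.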
This is a little harder to visualize for $x^n$, but it actually works the same way when $n$ is a positive integer, by considering an $n$-dimensional hypercube.
This geometric reason is also why the circumference of a circle is equal to the derivative of its area – if you increase the radius a little, the area is increased by approximately that "little" times the circumference. So the derivative of $\pi r^2$ is the circumference of the circle, $2\pi r$.
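The circle statement can be checked the same way. This is only a sketch, with an arbitrarily chosen radius and increment: it compares the area of a thin ring, per unit of added radius, with the circumference.

```python
import math

def circle_area(r):
    return math.pi * r * r

r, dr = 2.0, 1e-6
ring_area = circle_area(r + dr) - circle_area(r)   # thin ring added by growing r
rate = ring_area / dr                              # area gained per unit of radius
circumference = 2 * math.pi * r

print(rate, circumference)
```

The two printed numbers differ only by a term of size about `math.pi * dr`, which plays the role of the tiny square.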
It's also a way to understand the product rule. (Or, indeed, FOIL.)
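To spell out the product-rule remark (this is the standard expansion, written here with generic functions $f$ and $g$ of $x$ and their small changes $\Delta f$ and $\Delta g$):

$$(f+\Delta f)(g+\Delta g) - fg = f\,\Delta g + g\,\Delta f + \Delta f\,\Delta g$$

The first two terms are the strips and the last term is the tiny rectangle in the corner. Dividing by $\Delta x$ and discarding the corner term gives $(fg)' = fg' + f'g$.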
Case: The chain rule
The chain rule is better seen by considering an odd-shaped tub. Let's say that when the volume of the water in the tub is $v$, the tub is filled to depth $h(v)$. Then assume that we have a hose that, between time $0$ and time $t$, has delivered a volume $v(t)$ of water.
At time $t$, what is the rate that the height of the water is increasing?
Well, we know that when the current volume is $v$, then the rate at which the height is increasing is $h'(v)$ times the rate the volume is increasing. And the rate the volume is increasing is $v'(t)$. So the rate the height is increasing is $h'(v(t)) \cdot v'(t)$.
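Here is a small numerical sketch of the tub argument. The particular tub shape `h` and hose schedule `v` are made-up examples, not anything implied by the argument, and `derivative` is a plain central-difference estimate.

```python
import math

def derivative(f, x, eps=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def h(v):
    # Hypothetical tub: depth as a function of water volume.
    return math.sqrt(v)

def v(t):
    # Hypothetical hose: volume delivered by time t.
    return 2.0 * t + 0.5 * t * t

t = 3.0
direct = derivative(lambda s: h(v(s)), t)          # rate of depth change, measured directly
chained = derivative(h, v(t)) * derivative(v, t)   # h'(v(t)) * v'(t)

print(direct, chained)
```

Both numbers estimate the rate at which the depth is rising at time `t`, and they agree up to the error of the finite-difference estimates.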
Case: Inverse function
This is the one case where it is obvious from the graph. When you flip the coordinates of a Cartesian plane, a line of slope $m$ gets sent to a line of slope $1/m$. So if $f$ and $g$ are inverse functions, then the slope of $f$ at $(x,f(x))$ is the reciprocal of the slope of $g$ at $(f(x),x)=(f(x),g(f(x)))$. So $g'(f(x))=1/f'(x)$.
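As an illustration, the reciprocal-slope relation can be checked for one concrete inverse pair. The choice of `exp` and `log` is mine, and the slopes are central-difference estimates rather than exact derivatives.

```python
import math

def derivative(f, x, eps=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

f, g = math.exp, math.log   # an example pair of inverse functions

x = 1.5
slope_f = derivative(f, x)       # slope of f at (x, f(x))
slope_g = derivative(g, f(x))    # slope of g at (f(x), x)

print(slope_g, 1 / slope_f)
```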
$x^2$ revisited
Another way of dealing with $f(x)=x^2$ is thinking again of area, but thinking of it in terms of units. If we have a square whose side is $x$ centimeters, and we change the side by a small amount, $\Delta x$ centimeters, then the area is $x^2\,\mathrm{cm}^2$ and it changes by approximately $f(x+\Delta x)-f(x)\approx f'(x)\Delta x$ square centimeters.
On the other hand, if we measure the square in meters, it has side length $x/100$ meters and area $(x/100)^2$. The change in the side length is $(\Delta x)/100$ meters. So the expected area change is $f'(x/100)\cdot (\Delta x)/100$ square meters. But this difference should be the same, so
$$f'(x)\Delta x = f'(x/100)\cdot\frac{\Delta x}{100}\cdot \left(100^2\, \text{cm}^2/\text{m}^2\right) = 100 f'(x/100)\Delta x$$
More generally, then, we see that $f'(ax)=af'(x)$ when $f(x)=x^2$ by changing units from centimeters to a unit that is $1/a$ centimeters.
So we see that $f'(x)$ is linear, although it doesn't explain why $f'(1)=2$.
If you do the same for $f(x)=x^n$, with a unit $\mu$ and another unit $\rho$ where $a\rho = \mu$, then you get that the change in volume when the side changes by $\Delta x\,\mu$ is $f'(x)\Delta x\,\mu^n$. It is also $f'(ax)\cdot a(\Delta x)\,\rho^n$. Since $\mu/\rho = a$, we have $\mu^n = a^n\rho^n$, and this means $f'(ax) =a^{n-1}f'(x)$.
Again, we still don't know why $f'(1)=n$, but we know $f'(x)=f'(1)x^{n-1}$.
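The scaling relation can be sanity-checked numerically. This is only a sketch: the values of `n`, `a` and `x` are arbitrary, and `derivative` is a central-difference estimate, so the comparison is approximate.

```python
def derivative(f, x, eps=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

n, a, x = 5, 3.0, 1.2   # arbitrary example values

def f(x):
    return x ** n

lhs = derivative(f, a * x)               # f'(ax)
rhs = a ** (n - 1) * derivative(f, x)    # a^(n-1) * f'(x)

# Second form of the same statement: f'(x) = f'(1) * x^(n-1)
from_unit = derivative(f, 1.0) * x ** (n - 1)

print(lhs, rhs)
print(derivative(f, x), from_unit)
```

Note that the check confirms the scaling law but, like the units argument itself, says nothing about why $f'(1)=n$.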