3

I looked up a proof of the chain rule here https://web.williams.edu/Mathematics/lg5/A37W12/Chain.pdf

which made sense from a computational standpoint. However, this proof does not give me any intuition for what the chain rule is actually doing or why it works.

  • Relevant thread: https://math.stackexchange.com/questions/725951/intuition-behind-chain-rule?rq=1 – littleO Sep 20 '17 at 22:35
  • Essentially you have to intuitively understand that $$\frac{{\rm d}y}{{\rm d}x} = \frac{ {\rm d}y/{\rm d}t }{{\rm d}x/{\rm d}t} = \frac{\dot{y}}{\dot{x}}$$ – John Alexiou Sep 20 '17 at 23:20
  • Please, if you are ok, you can accept the answer and set it as solved. Thanks! – user Feb 03 '18 at 23:52

4 Answers

5

Let $h(x) = f(g(x))$. Intuitively, $h'(x)$ is a measure of how much $h$ changes per unit $x$ for a small change in $x$:

A small change $\delta x$ in $x$ causes a change of approximately $g'(x) \delta x$ in the value of $g(x)$. The resulting change in the value of $h(x) = f(g(x))$ is then approximately $f'(g(x))$ multiplied by the change in $g(x)$, which was $g'(x) \delta x$. So the total change in $h$ is approximately $f'(g(x))g'(x) \delta x$.

Divide this change in $h$ by $\delta x$ and take the limit $\delta x \to 0$ (where the approximations become exact), and you have the chain rule: $h'(x) = f'(g(x)) g'(x)$.
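To see the limit at work numerically, here is a minimal sketch. The choice $g(x) = x^2$, $f(u) = \sin u$, the point `x0 = 1.3`, and the helper name `difference_quotient` are all assumptions made for illustration; they are not part of the argument above. The difference quotient of $h$ should approach $f'(g(x)) g'(x)$ as $\delta x$ shrinks.

```python
import math

# Illustrative choice (an assumption, not from the answer): g(x) = x^2, f(u) = sin(u)
def g(x):
    return x * x

def f(u):
    return math.sin(u)

def h(x):
    return f(g(x))

# Derivatives of the two pieces, worked out by hand
def g_prime(x):
    return 2 * x

def f_prime(u):
    return math.cos(u)

def difference_quotient(func, x, dx):
    """Change in func per unit change in x, for a finite step dx."""
    return (func(x + dx) - func(x)) / dx

x0 = 1.3
chain_rule_value = f_prime(g(x0)) * g_prime(x0)

# The quotient gets closer to the chain rule value as dx shrinks
for dx in (1e-1, 1e-3, 1e-5):
    print(dx, difference_quotient(h, x0, dx), chain_rule_value)
```

The gap between the two printed columns shrinks roughly in proportion to `dx`, which is the "approximations become exact" step in concrete form.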

stochastic
3

If Alice can run twice as fast as Bob, then $dA/dB = 2$. If Bob can run 3 times as fast as Carl, then $dB/dC = 3$. How much faster can Alice run than Carl?

$$\frac{dA}{dC} = \frac{dA}{dB}\frac{dB}{dC} = 2\cdot 3 = 6,$$

as your intuition says.

2

Here is a non-rigorous but, I hope, intuitive explanation of the chain rule.

Let

$$h(x) = f(g(x))$$

then

$$\Delta h\approx f'(g(x))\Delta g$$

$$\Delta g\approx g'(x)\Delta x$$

thus

$$\Delta h\approx f'(g(x))g'(x)\Delta x\implies\frac {\Delta h}{\Delta x}\approx f'(g(x))g'(x)\implies \frac{d h}{dx}=f'(g(x))g'(x)=\frac{df}{dg}\frac{dg}{dx}$$

For functions of several variables it works in almost the same way, but with gradients and Jacobians in place of the derivatives.
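As a rough sketch of the several-variable case, the example below multiplies a gradient by a Jacobian and compares the result with a finite-difference estimate. The functions $g(x, y) = (xy,\ x + y)$ and $f(u, v) = u^2 + 3v$, the evaluation point, and all helper names are hypothetical choices for illustration only.

```python
# Illustrative choice (an assumption): g maps R^2 -> R^2, f maps R^2 -> R
def g(x, y):
    return (x * y, x + y)

def f(u, v):
    return u * u + 3 * v

def h(x, y):
    return f(*g(x, y))

def jacobian_g(x, y):
    # Rows: components of g; columns: partial derivatives in x and y
    return [[y, x], [1.0, 1.0]]

def gradient_f(u, v):
    return [2 * u, 3.0]

def chain_rule_gradient(x, y):
    """Gradient of h as (gradient of f at g(x, y)) times (Jacobian of g)."""
    u, v = g(x, y)
    grad_f = gradient_f(u, v)
    jac = jacobian_g(x, y)
    # Sum over the intermediate variables u and v
    return [sum(grad_f[k] * jac[k][j] for k in range(2)) for j in range(2)]

def numerical_gradient(x, y, step=1e-6):
    """Central-difference estimate of the gradient of h."""
    dh_dx = (h(x + step, y) - h(x - step, y)) / (2 * step)
    dh_dy = (h(x, y + step) - h(x, y - step)) / (2 * step)
    return [dh_dx, dh_dy]

x0, y0 = 1.5, -0.5
print(chain_rule_gradient(x0, y0))
print(numerical_gradient(x0, y0))
```

The product of derivatives from the one-variable case becomes a sum of products, one term for each intermediate variable.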

user
0

Consider two magnifiers. The first one magnifies an object $5$ times and the second one magnifies an object $3$ times. Now take an object and arrange the two magnifiers so that you see the image of the image of the object through both of them. The object will be magnified $15$ times; that is, the rate of change of the composite magnification is the product of the individual rates of change. That is what mathematicians call the chain rule:

$$\frac {dw}{dx} = \frac {dw}{du}\frac {du}{dx}$$
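One way to write the magnifier picture in symbols, assuming each magnifier acts as a simple linear scaling (an idealization added here for illustration): let $x$ be the size of the object, $u$ the size of the image after the first magnifier, and $w$ the size after the second. Then

$$u = 5x, \qquad w = 3u = 3(5x) = 15x,$$

so that

$$\frac{dw}{du} = 3, \qquad \frac{du}{dx} = 5, \qquad \frac{dw}{dx} = 15 = 3 \cdot 5 = \frac{dw}{du}\frac{du}{dx}.$$

For a general differentiable function the magnification factor varies from point to point, but near any one point the function behaves approximately like such a linear scaling, with the derivative as the local factor.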