Baby Rudin Theorem 5.5 (Chain rule): What's wrong with the obvious proof?

Question

In Theorem 5.5, Rudin proves the chain rule, but does so in a somewhat different fashion than expected. It seems we can prove the chain rule more easily.

Theorem: Suppose $f:[a,b]\to\mathbb{R}$ is continuous on $[a,b]$, $f'(x)$ exists at some point $x\in[a,b]$, and $g:I\to\mathbb{R}$, where $I$ is an interval that contains the range of $f$, is differentiable at the point $f(x)$. If we define $h(t)=g(f(t))$ for $t\in[a,b]$, then $h$ is differentiable at $x$ and $h'(x)=g'(f(x))f'(x)$.

Proof: If we write down $g'(f(x))f'(x)$, it looks like we're done if we can just express $g'$ in terms of the same limits as $f'$, i.e., $\lim_{t\to x} g(f(t))=\lim_{y\to f(x)} g(y)$. Showing this seems equivalent to Theorem 4.7, that $h$ is continuous at $x$, but I'll write it for the sake of completeness:

Since $g$ is continuous at $f(x)$, $\lim_{y\to f(x)}g(y)=g(f(x))$, so for every $\epsilon>0$ there exists $\delta>0$ such that for all $y\in I$ with $d(y,f(x))<\delta$, we have $d(g(y),g(f(x)))<\epsilon$.

Since $f$ is continuous at $x$, there exists $\eta>0$ such that for all $t\in[a,b]$ with $d(t,x)<\eta$, we have $d(f(t),f(x))<\delta$, which means $d(g(f(t)),g(f(x)))<\epsilon$.

So then $\lim_{t\to x} g(f(t)) = \lim_{y\to f(x)}g(y)$.

Then we have
\begin{align} g'(f(x))\cdot f'(x) &= \lim_{y\to f(x)}\frac{g(y)-g(f(x))}{y-f(x)}\cdot\lim_{t\to x}\frac{f(t)-f(x)}{t-x} \\ &=\lim_{t\to x}\frac{g(f(t))-g(f(x))}{f(t)-f(x)}\cdot\lim_{t\to x}\frac{f(t)-f(x)}{t-x} \\ &=\lim_{t\to x}\frac{g(f(t))-g(f(x))}{t-x} = h'(x). \end{align}

Is there an issue with this approach that I'm not seeing?

The issue of potential division by zero has arisen numerous times elsewhere on site, e.g., Trouble understanding chain rule proof; is that the question? — Andrew D. Hwang, Jun 19 '22 at 22:47
Yes, this is a more direct and thorough treatment of the problem. — Leland Stirner, Jun 20 '22 at 17:59

peek-a-boo · Accepted Answer · 2022-12-18T03:28:33.963

Right, the issue is division by zero. To get around it, you can prove it as Rudin does (which can easily be generalized to higher dimensions), or you can define the 'modified difference quotients' $\phi:[a,b]\to\Bbb{R}$ and $\gamma:I\to\Bbb{R}$ as
\begin{align} \phi(t):= \begin{cases} \frac{f(t)-f(x)}{t-x}&\text{if $t\neq x$}\\ f'(x)&\text{if $t=x$} \end{cases} \quad\text{and}\quad \gamma(s)&:= \begin{cases} \frac{g(s)-g(f(x))}{s-f(x)}&\text{if $s\neq f(x)$}\\ g'(f(x))& \text{if $s=f(x)$} \end{cases} \end{align}

The benefit now is that these guys are defined everywhere, and furthermore, the fact that $f$ is differentiable at $x$ and $g$ is differentiable at $f(x)$ (by hypothesis) implies that the functions $\phi,\gamma$ are continuous at $x$ and $f(x)$ respectively.

Now, for any $t\neq x$, we have
\begin{align} \frac{(g\circ f)(t)-(g\circ f)(x)}{t-x}&=\gamma(f(t))\cdot \phi(t). \end{align}

To prove this, we consider two cases. In the first case, $f(t)-f(x)=0$, and both sides are $0$. In the second case, $f(t)-f(x)\neq 0$, and you can multiply and divide by it.

Now, $\gamma,f,\phi$ are continuous at the correct points, so we can take the limit $t\to x$ to get that the limit is $\gamma(f(x))\cdot \phi(x)=g'(f(x))\cdot f'(x)$.
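
Spelled out, the two cases of the identity look roughly as follows. This is only a restatement of the argument above for $t\neq x$, using the same $\phi$ and $\gamma$; nothing beyond the stated hypotheses is assumed.

\begin{align} f(t)=f(x):&\quad \frac{g(f(t))-g(f(x))}{t-x}=\frac{0}{t-x}=0, \qquad \gamma(f(t))\cdot\phi(t)=g'(f(x))\cdot\frac{0}{t-x}=0,\\ f(t)\neq f(x):&\quad \gamma(f(t))\cdot\phi(t)=\frac{g(f(t))-g(f(x))}{f(t)-f(x)}\cdot\frac{f(t)-f(x)}{t-x}=\frac{g(f(t))-g(f(x))}{t-x}. \end{align}

The point of the first line is that $\gamma(f(t))$ takes the value $g'(f(x))$ exactly where the ordinary quotient would be $0/0$, so no division by $f(t)-f(x)$ is needed there.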

score 2 · Answer 2 · answered Jun 19 '22 at 22:21

You have to take care not to divide by zero. Think of functions like $f(x)=x^2\sin(1/x)$ (with $f(0)=0$), which are differentiable at $0$ and vanish at a sequence of points tending to $0$.
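
To make the example concrete, here is a short computation. The convention $f(0)=0$ and the points $t_n$ are chosen here for illustration and are not spelled out in the answer.

\begin{align} f'(0)&=\lim_{t\to 0}\frac{t^2\sin(1/t)-0}{t-0}=\lim_{t\to 0}t\sin(1/t)=0,\\ t_n&:=\frac{1}{n\pi}\to 0,\qquad f(t_n)=\frac{\sin(n\pi)}{n^2\pi^2}=0=f(0)\quad(n=1,2,3,\dots). \end{align}

So with $x=0$, the quotient $\frac{g(f(t))-g(f(x))}{f(t)-f(x)}$ has denominator $0$ at every $t_n$, which means it is undefined at points arbitrarily close to $0$ and its limit as $t\to x$ does not make sense as written.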

Yuval Peres
