In Theorem 5.5, Rudin proves the chain rule, but does so in a somewhat different fashion than expected. It seems we can prove the chain rule more easily.

Theorem: Suppose $f:[a,b]\to\mathbb{R}$ is continuous on $[a,b]$ and $f'(x)$ exists at some point $x\in[a,b]$. Suppose $g:I\to\mathbb{R}$, where $I$ is an interval that contains the range of $f$, and $g$ is differentiable at $f(x)$. If we define $h(t)=g(f(t))$ for $t\in[a,b]$, then $h$ is differentiable at $x$ and $h'(x)=g'(f(x))f'(x)$.

Proof: If we write down $g'(f(x))f'(x)$, it looks like we're done if we can just express $g'$ in terms of the same limits as $f'$, i.e., $\lim_{t\to x} g(f(t))=\lim_{y\to f(x)} g(y)$. Showing this seems equivalent to Theorem 4.7, that $h$ is continuous at $x$, but I'll write it for the sake of completeness:

$\lim_{y\to f(x)}g(y)=g(f(x)),$ so $\forall \epsilon>0 \exists\delta>0$ st $\forall y\in I$ st $d(y,f(x))<\delta, d(g(y),g(f(x)))<\epsilon$.

Since $f$ is continuous at $x$, $\exists \eta>0$ st $\forall t\in[a,b]$ st $d(t,x)<\eta$, $d(f(t),f(x))<\delta$, which means $d(g(f(t)),g(f(x)))<\epsilon$.

So then $\lim_{t\to x} g(f(t)) = \lim_{y\to f(x)}g(y)$.

Then we have \begin{align} g'(f(x))\cdot f'(x) &= \lim_{y\to f(x)}\frac{g(y)-g(f(x))}{y-f(x)}\cdot\lim_{t\to x}\frac{f(t)-f(x)}{t-x}\\ &=\lim_{t\to x}\frac{g(f(t))-g(f(x))}{f(t)-f(x)}\cdot\lim_{t\to x}\frac{f(t)-f(x)}{t-x}\\ &=\lim_{t\to x}\frac{g(f(t))-g(f(x))}{t-x} = h'(x). \end{align}

Is there an issue with this approach that I'm not seeing?

2 Answers

Right, the issue is division by zero. To get around it, you can prove it as Rudin does (which can easily be generalized to higher dimensions), or you can define the 'modified difference quotients' $\phi:[a,b]\to\Bbb{R}$ and $\gamma:I\to\Bbb{R}$ as \begin{align} \phi(t):= \begin{cases} \frac{f(t)-f(x)}{t-x}&\text{if $t\neq x$}\\ f'(x)&\text{if $t=x$} \end{cases} \quad\text{and}\quad \gamma(s)&:= \begin{cases} \frac{g(s)-g(f(x))}{s-f(x)}&\text{if $s\neq f(x)$}\\ g'(f(x))& \text{if $s=f(x)$} \end{cases} \end{align}

The benefit now is that these guys are defined everywhere, and furthermore, the fact that $f$ is differentiable at $x$ and $g$ is differentiable at $f(x)$ (by hypothesis) implies that the functions $\phi,\gamma$ are continuous at $x$ and $f(x)$ respectively. Now, for any $t\neq x$, we have \begin{align} \frac{(g\circ f)(t)-(g\circ f)(x)}{t-x}&=\gamma(f(t))\cdot \phi(t). \end{align}

To prove this, we consider two cases: the first case is $f(t)-f(x)=0$, in which both sides are $0$; in the second case, $f(t)-f(x)\neq 0$, and you can multiply and divide by it.

Now, $\gamma,f,\phi$ are continuous at the correct points, so we can take the limit $t\to x$ to get that the limit is $\gamma(f(x))\cdot \phi(x)=g'(f(x))\cdot f'(x)$.
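
For completeness, here is a sketch of the two-case check written out for a fixed $t\neq x$, using only $\phi$ and $\gamma$ as defined above: \begin{align} \text{if } f(t)\neq f(x):\quad \gamma(f(t))\cdot\phi(t)&=\frac{g(f(t))-g(f(x))}{f(t)-f(x)}\cdot\frac{f(t)-f(x)}{t-x}=\frac{g(f(t))-g(f(x))}{t-x},\\ \text{if } f(t)= f(x):\quad \gamma(f(t))\cdot\phi(t)&=g'(f(x))\cdot\frac{0}{t-x}=0=\frac{g(f(t))-g(f(x))}{t-x}. \end{align} The second line uses the branch $\gamma(f(x))=g'(f(x))$, which is exactly the situation where the unmodified quotient would have a zero denominator.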

peek-a-boo

You have to take care not to divide by zero. Think of functions like $f(x)=x^2\sin(1/x)$ that are differentiable at $0$ and vanish at a sequence tending to $0$.
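
To make that example concrete (a sketch, assuming the function is extended by $f(0)=0$, which the statement above leaves implicit): \begin{align} f'(0)&=\lim_{t\to 0}\frac{t^2\sin(1/t)-0}{t-0}=\lim_{t\to 0}t\sin(1/t)=0,\\ t_n&:=\frac{1}{n\pi}\to 0,\qquad f(t_n)=\frac{\sin(n\pi)}{n^2\pi^2}=0=f(0)\quad (n=1,2,\dots). \end{align} So with $x=0$, the quotient $\frac{g(f(t))-g(f(x))}{f(t)-f(x)}$ has a zero denominator at every $t_n$, and it is not defined on any punctured neighborhood of $0$, so its limit as $t\to 0$ cannot be taken directly.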

Yuval Peres