On the 'wrong proof' of the chain rule

Question

I am looking through the notes of an old analysis course and was pondering the proof of the chain rule (especially the notorious 'wrong proof' that one can give). I'd be happy if someone was willing to verify my reasoning below. I end with an actual question.

Let's start with the following nice result.

Let $f\colon \mathbb{R}\to \mathbb{R}$ be a continuous function which is differentiable on $\mathbb{R}_0$. Assume that $\lim_{x\to 0}f'(x)=L\in \mathbb{R}$. Then $f$ is differentiable in $0$.

Proof: For each $h\neq 0$, the mean value theorem yields a $c_h\in \mathbb{R}$ strictly between $0$ and $h$ such that $f'(c_h)=\frac{f(h)-f(0)}{h}$. Since $0<|c_h|<|h|$, we have $c_h\to 0$ as $h\to 0$, with every $c_h\neq 0$. Hence $$\lim_{h\to 0}\frac{f(h)-f(0)}{h}=\lim_{h\to 0}f'(c_h)=L.$$ $\square$

Great, let's apply this to the following function: $$\phi\colon \mathbb{R}\to \mathbb{R}:x\mapsto \begin{cases}x^3\sin(\frac{1}{x}) & \mbox{ if }x\neq 0,\\0 & \mbox{ if } x=0.\end{cases}$$ Clearly $\phi$ is differentiable on $\mathbb{R}_0$ and $$\phi'(x)=3x^2\sin(\frac{1}{x})-x^3\cos(\frac{1}{x})\frac{1}{x^2}=3x^2\sin(\frac{1}{x})-x\cos(\frac{1}{x})$$ for all $x\neq 0$. It is straightforward to see that $\lim_{x\to 0}\phi'(x)=0$ and thus the above result yields that $\phi'(0)=0$ (in particular $\phi$ is differentiable on the whole of $\mathbb{R}$).
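As a quick numerical sanity check of this example (a sketch only, not a proof), one can compare the difference quotient of $\phi$ at $0$ with the formula for $\phi'(x)$ at small $x\neq 0$. The function names below are my own, and floating-point values are only approximations; both columns should shrink towards $0$ as $h\to 0$.

```python
import math


def phi(x: float) -> float:
    """phi(x) = x^3 sin(1/x) for x != 0, and phi(0) = 0."""
    return x**3 * math.sin(1.0 / x) if x != 0.0 else 0.0


def phi_prime(x: float) -> float:
    """Derivative formula from the text; only valid for x != 0."""
    return 3 * x**2 * math.sin(1.0 / x) - x * math.cos(1.0 / x)


def difference_quotient_at_zero(h: float) -> float:
    """(phi(h) - phi(0)) / h, which equals h^2 sin(1/h) for h != 0."""
    return (phi(h) - phi(0.0)) / h


if __name__ == "__main__":
    for k in range(1, 7):
        h = 10.0 ** (-k)
        q = difference_quotient_at_zero(h)
        d = phi_prime(h)
        print(f"h={h:.0e}  quotient={q:+.3e}  phi'(h)={d:+.3e}")
```

The bounds $|h^2\sin(1/h)|\leq h^2$ and $|\phi'(x)|\leq 3x^2+|x|$ are what the printed values reflect.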

Now at this point, recall the chain rule.

Let $f,g\colon \mathbb{R}\to \mathbb{R}$ be functions. If $a\in\mathbb{R}$ is such that $f'(a)$ and $g'(f(a))$ both exist, then $(g\circ f)'(a)=g'(f(a))f'(a)$.

The obvious argument to try is the following 'wrong proof':

\begin{eqnarray} \lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{x-a} &=& \lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{f(x)-f(a)}\cdot \frac{f(x)-f(a)}{x-a}\\ &=& \lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{f(x)-f(a)}\cdot \lim_{x\to a}\frac{f(x)-f(a)}{x-a}\\ &=& g'(f(a))f'(a). \end{eqnarray} Here we used that $f$ is continuous in $a$ to see that $f(x)\to f(a)$ as $x\to a$. $\triangle$

However, there is an obvious error in the above reasoning. If for example $f$ is a constant function $f(x)=f(a)$ for all $x\in \mathbb{R}$, then $\lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{f(x)-f(a)}=\lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{0}$ is nonsensical!
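To make the failure concrete, here is a small sketch with a constant $f$ and an arbitrarily chosen $g$ (both are my own illustrative choices, as are the function names). The quotient of the composite is perfectly well defined and equals $0$, while the first factor of the 'wrong proof' is a division by zero.

```python
import math


def f(x: float) -> float:
    """A constant function, so f(x) = f(a) for every x."""
    return 2.0


def g(y: float) -> float:
    """Any differentiable g will do; sin is an arbitrary choice."""
    return math.sin(y)


def composite_quotient(x: float, a: float) -> float:
    """(g(f(x)) - g(f(a))) / (x - a): the quotient we actually care about."""
    return (g(f(x)) - g(f(a))) / (x - a)


def first_factor(x: float, a: float) -> float:
    """(g(f(x)) - g(f(a))) / (f(x) - f(a)): the factor the 'wrong proof' introduces."""
    return (g(f(x)) - g(f(a))) / (f(x) - f(a))


if __name__ == "__main__":
    a = 0.0
    for h in (1e-1, 1e-3, 1e-6):
        print("composite quotient:", composite_quotient(a + h, a))
        try:
            first_factor(a + h, a)
        except ZeroDivisionError:
            print("first factor: undefined (division by zero)")
```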

Having said that, it is also clear that the above proof does work for functions such that $\exists \delta>0:\forall x\in (a-\delta,a+\delta)\setminus \{a\}:f(x)\neq f(a)$. In that case, $f(x)$ does not equal $f(a)$ for $x$ near $a$ (and $x\neq a$). So the above proof only fails for a particular type of function, the easiest of which are constant functions. However, for a constant function $f$, one can calculate $(g\circ f)'(a)$ directly and show that it's $0$.

A natural question at this point is to wonder whether there exists a nonconstant function $f$ such that $f$ is differentiable in $a$ and $f(x)=f(a)$ infinitely often for $x$ near $a$. The answer is yes and the function $\phi$ given in the example above (with $a=0$) satisfies these properties. (Also, the Wikipedia page of the chain rule gives the function $f(x)=x^2\sin(\frac{1}{x})$ for $x\neq 0$ and $f(0)=0$ as an example, but this function is not differentiable in $0$. As far as I can tell, this is a worse example than just a constant function to pinpoint the failure of the 'wrong proof'. Perhaps this should be changed?)
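To see why $\phi$ takes the value $\phi(0)=0$ infinitely often near $0$: at $x_n=\frac{1}{n\pi}$ we have $\sin(1/x_n)=\sin(n\pi)=0$, and $x_n\to 0$. A rough numerical illustration follows; the helper names are mine, and in floating point the values come out as tiny numbers rather than exact zeros.

```python
import math


def phi(x: float) -> float:
    """phi(x) = x^3 sin(1/x) for x != 0, and phi(0) = 0."""
    return x**3 * math.sin(1.0 / x) if x != 0.0 else 0.0


def zero_of_phi(n: int) -> float:
    """x_n = 1/(n*pi); in exact arithmetic sin(1/x_n) = sin(n*pi) = 0."""
    return 1.0 / (n * math.pi)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        x_n = zero_of_phi(n)
        # phi(x_n) is zero up to rounding error, and x_n shrinks towards 0.
        print(f"n={n:5d}  x_n={x_n:.3e}  phi(x_n)={phi(x_n):.1e}")
```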

In general let $f$ be such a function (thus $\forall \delta>0:\exists x\neq a: |x-a|<\delta$ and $f(x)=f(a)$). If $\lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{x-a}$ exists, then we can compute this limit by choosing an appropriate sequence $x_n\to a$. For each $n\geq 1$, there exists an $x_n\neq a$ such that $|x_n-a|<\frac{1}{n}$ and $f(x_n)=f(a)$. It follows that \begin{eqnarray} \lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{x-a}&=&\lim_{n\to \infty}\frac{g\circ f(x_n)-g\circ f(a)}{x_n-a}\\ &=& \lim_{n\to \infty}\frac{g\circ f(a)-g\circ f(a)}{x_n-a}\\ &=& 0. \end{eqnarray}

This shows that if $f$ is a function for which the 'wrong proof' of the chain rule fails, then $(g\circ f)'(a)=0$. Of course, I was only able to show this under the assumption that $(g\circ f)'(a)$ actually exists (which of course is true, as one can actually prove the chain rule). Nonetheless, this raises the question whether there is a more direct way of showing that $(g\circ f)'(a)$ actually exists (and equals zero) if $f$ is a function for which the 'wrong proof' fails. If so, one can actually fix this 'wrong proof' by considering two cases.

$x^2 \sin(1/x)$ is differentiable at 0. You can't prove it using the theorem you started with, but you can prove it by just directly applying the definition of derivative. — Dan Velleman, Jul 27 '21 at 14:15
@DanVelleman : Ah yes you are right. I didn't check whether $x^2\sin(1/x)$ was differentiable at zero (and the wikipedia entry only mentioned continuity). Okay, then this example makes a lot of sense and it's not a $C^1$-function. Good. — Mathematician 42, Jul 28 '21 at 04:48
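The point made in these comments can be illustrated numerically (a sketch, with names of my own choosing). For $f(x)=x^2\sin(1/x)$, $f(0)=0$, the difference quotient at $0$ is $h\sin(1/h)$, which is bounded by $|h|$, so $f'(0)=0$. Away from $0$, however, $f'(x)=2x\sin(1/x)-\cos(1/x)$ keeps oscillating between values near $-1$ and $+1$, so $\lim_{x\to 0}f'(x)$ does not exist and the theorem from the question does not apply.

```python
import math


def f(x: float) -> float:
    """f(x) = x^2 sin(1/x) for x != 0, and f(0) = 0."""
    return x**2 * math.sin(1.0 / x) if x != 0.0 else 0.0


def f_prime(x: float) -> float:
    """Derivative for x != 0, by the product and chain rules."""
    return 2 * x * math.sin(1.0 / x) - math.cos(1.0 / x)


def difference_quotient_at_zero(h: float) -> float:
    """(f(h) - f(0)) / h, which equals h sin(1/h) for h != 0."""
    return (f(h) - f(0.0)) / h


if __name__ == "__main__":
    for k in range(1, 7):
        h = 10.0 ** (-k)
        print(f"h={h:.0e}  quotient={difference_quotient_at_zero(h):+.3e}")
    for n in (1, 10, 100):
        x_even = 1.0 / (2 * n * math.pi)        # cos(1/x) is close to +1 here
        x_odd = 1.0 / ((2 * n + 1) * math.pi)   # cos(1/x) is close to -1 here
        print(f"n={n:3d}  f'(x_even)={f_prime(x_even):+.6f}  f'(x_odd)={f_prime(x_odd):+.6f}")
```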

score 4 · Accepted Answer · answered Jul 27 '21 at 14:31

Actually, the "wrong proof" is not so bad, as the problem can only happen when $f'(a)=0$. To fix the problem, it suffices to handle two cases: the case $f'(a)\neq 0$ (for which the "wrong proof" goes through) and the exceptional case $f'(a)=0.$ For completeness, one states the Chain Rule again.

Proposition. Let $f,g:{\mathbb R}\rightarrow {\mathbb R}$ be functions. If $a\in {\mathbb R}$ is such that $f'(a)$ and $g'(f(a))$ both exist, then $$(g\circ f)'(a)=g'(f(a))f'(a).$$

Case 1. $f'(a)\neq 0.$

In this case, there exists $\delta>0$ such that $$f(x)-f(a)\neq 0$$ if $0<|x-a|<\delta.$ To see this, fix $\epsilon$ with $0<\epsilon<\frac{|f'(a)|}2$. Then there exists $\delta>0$ such that $$0<|x-a|<\delta\Rightarrow \left|\frac{f(x)-f(a)}{x-a}-f'(a)\right|<\epsilon$$ $$\Rightarrow \left|\frac{f(x)-f(a)}{x-a}\right|>|f'(a)|-\epsilon>\frac{|f'(a)|}2>0$$ $$\Rightarrow |f(x)-f(a)|\neq 0,$$ as required. This means that as $x\rightarrow a$, the "wrong proof" works.

Case 2. $f'(a)=0.$

In this case, one needs to prove that $(g\circ f)'(a)=0.$ One considers two possibilities:

$$\frac{(g\circ f)(x)-(g\circ f)(a)}{x-a}=\left\{\begin{array}{cc}0&{\rm if~}f(x)=f(a)\\ \frac{g(f(x))-g(f(a))}{f(x)-f(a)}\cdot \frac{f(x)-f(a)}{x-a}&{\rm if~}f(x)\neq f(a).\end{array} \right.$$ It suffices to show that for every $\epsilon>0$, there exists $\delta>0$ such that $$0<|x-a|<\delta\Rightarrow \left|\frac{(g\circ f)(x)-(g\circ f)(a)}{x-a}\right|<\epsilon.\qquad (1)$$ If $x\neq a$ and $f(x)=f(a)$, then the left-hand side of the inequality in (1) equals $0$, so the inequality reads $0<\epsilon$, which trivially holds. So for the given $\epsilon>0$, one just needs to find $\delta>0$ to address the second possibility: $f(x)\neq f(a)$. Namely, one needs to show in this case that $$\left|\frac{(g\circ f)(x)-(g\circ f)(a)}{x-a}\right|=\left|\frac{g(f(x))-g(f(a))}{f(x)-f(a)}\right|\cdot \left|\frac{f(x)-f(a)}{x-a}\right|<\epsilon.\qquad (2)$$ This is more or less straightforward, but one spells out the details below.

Since $g'(f(a))$ exists, the difference quotient of $g$ at $f(a)$ is bounded near $f(a)$: there exist $\epsilon_1>0$ and $M>0$ such that $$0<|f(x)-f(a)|<\epsilon_1\Rightarrow \left|\frac{g(f(x))-g(f(a))}{f(x)-f(a)}\right|\leq M.$$

Since $f'(a)=0,$ there exists $\delta_1>0$ such that $$0<|x-a|<\delta_1\Rightarrow \left|\frac{f(x)-f(a)}{x-a}\right|<\frac{\epsilon}M.$$

Since $f$ is continuous at $a,$ there exists $\delta_2>0$ such that $$|x-a|<\delta_2\Rightarrow |f(x)-f(a)|<\epsilon_1.$$

Now let $\delta:=\min(\delta_1,\delta_2).$ Then one sees that $$0<|x-a|<\delta\Rightarrow \left|\frac{g(f(x))-g(f(a))}{f(x)-f(a)}\right|\cdot \left|\frac{f(x)-f(a)}{x-a}\right|<M\cdot \frac{\epsilon}M=\epsilon,$$ provided that $f(x)-f(a)\neq 0,$ as required by (2). QED
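Case 2 can be sanity-checked numerically on one concrete pair of functions. The choices below are mine and purely illustrative: $f(x)=x^2\sin(1/x)$ with $f(0)=0$ (so $f'(0)=0$ and $f(x)=f(0)$ infinitely often near $0$) and $g=\sin$ (so $g'(f(0))=1$). Since $|\sin u|\leq |u|$ and $|f(h)|\leq h^2$, the quotient of the composite at $0$ is bounded by $|h|$, in line with $(g\circ f)'(0)=g'(f(0))f'(0)=0$.

```python
import math


def f(x: float) -> float:
    """Illustrative inner function: x^2 sin(1/x) for x != 0, f(0) = 0."""
    return x**2 * math.sin(1.0 / x) if x != 0.0 else 0.0


def g(y: float) -> float:
    """Illustrative outer function; differentiable at f(0) = 0 with g'(0) = 1."""
    return math.sin(y)


def composite_quotient(h: float) -> float:
    """((g o f)(h) - (g o f)(0)) / h for h != 0."""
    return (g(f(h)) - g(f(0.0))) / h


if __name__ == "__main__":
    for k in range(1, 7):
        h = 10.0 ** (-k)
        print(f"h={h:.0e}  quotient={composite_quotient(h):+.3e}  bound |h|={h:.0e}")
```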

Perfect, thanks for taking the time to respond. This certainly works and is probably about as short as one can make this argument. I agree that the "wrong proof" isn't that bad! — Mathematician 42, Jul 28 '21 at 04:54
