I am looking through an old analysis course that I took, and I was pondering the proof of the chain rule (especially the notorious wrong proof that one can give). I'd be happy if someone was willing to verify my reasoning below. I end with an actual question.
Let's start with the following nice result.
Let $f\colon \mathbb{R}\to \mathbb{R}$ be a continuous function which is differentiable on $\mathbb{R}_0$. Assume that $\lim_{x\to 0}f'(x)=L\in \mathbb{R}$. Then $f$ is differentiable in $0$ and $f'(0)=L$.
Proof: For each $h\neq 0$, the mean value theorem yields a $c_h\in \mathbb{R}$ strictly between $0$ and $h$ such that $f'(c_h)=\frac{f(h)-f(0)}{h}$. Since $0<|c_h|<|h|$, it is clear that $c_h\to 0$ as $h\to 0$. Hence $$\lim_{h\to 0}\frac{f(h)-f(0)}{h}=\lim_{h\to 0}f'(c_h)=L.$$ $\square$
Great, let's apply this to the following function: $$\phi\colon \mathbb{R}\to \mathbb{R}:x\mapsto \begin{cases}x^3\sin(\frac{1}{x}) & \mbox{ if }x\neq 0,\\0 & \mbox{ if } x=0.\end{cases}$$ Clearly $\phi$ is differentiable on $\mathbb{R}_0$ and $$\phi'(x)=3x^2\sin(\frac{1}{x})-x^3\cos(\frac{1}{x})\frac{1}{x^2}=3x^2\sin(\frac{1}{x})-x\cos(\frac{1}{x})$$ for all $x\neq 0$. It is straightforward to see that $\lim_{x\to 0}\phi'(x)=0$ and thus the above result yields that $\phi'(0)=0$ (in particular $\phi$ is differentiable on the whole of $\mathbb{R}$).
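For completeness, here is the estimate I have in mind behind $\lim_{x\to 0}\phi'(x)=0$. It is only a sketch, using nothing more than $|\sin|\leq 1$ and $|\cos|\leq 1$: for all $x\neq 0$, $$|\phi'(x)|\leq 3x^2\left|\sin(\tfrac{1}{x})\right|+|x|\left|\cos(\tfrac{1}{x})\right|\leq 3x^2+|x|,$$ and the right-hand side tends to $0$ as $x\to 0$.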
Now at this point, recall the chain rule.
Let $f,g\colon \mathbb{R}\to \mathbb{R}$ be functions. If $a\in\mathbb{R}$ is such that $f'(a)$ and $g'(f(a))$ both exist, then $(g\circ f)'(a)=g'(f(a))f'(a)$.
The obvious argument to try is the following 'wrong proof':
\begin{eqnarray} \lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{x-a} &=& \lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{f(x)-f(a)}\cdot \frac{f(x)-f(a)}{x-a}\\ &=& \lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{f(x)-f(a)}\cdot \lim_{x\to a}\frac{f(x)-f(a)}{x-a}\\ &=& g'(f(a))f'(a). \end{eqnarray} Here we used that $f$ is continuous in $a$ to see that $f(x)\to f(a)$ as $x\to a$. $\triangle$
However, there is an obvious error in the above reasoning. If for example $f$ is a constant function $f(x)=f(a)$ for all $x\in \mathbb{R}$, then $\lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{f(x)-f(a)}=\lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{0}$ is nonsensical!
Having said that, it is also clear that the above proof does work for functions such that $\exists \delta>0:\forall x\in (a-\delta,a+\delta)\setminus \{a\}:f(x)\neq f(a)$. In that case, $f(x)$ does not equal $f(a)$ for $x$ near $a$ (and $x\neq a$), so the first quotient is well defined there. So the above proof only fails for a particular type of function, the easiest of which are constant functions. However, for a constant function $f$, one can calculate $(g\circ f)'(a)$ directly and show that it's $0$.
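To spell out the constant case (a short sketch, assuming only that $g$ is defined at $f(a)$): if $f(x)=f(a)$ for all $x\in\mathbb{R}$, then for every $x\neq a$, $$\frac{g\circ f(x)-g\circ f(a)}{x-a}=\frac{g(f(a))-g(f(a))}{x-a}=0,$$ so the limit as $x\to a$ exists and equals $0$. This agrees with the formula $g'(f(a))f'(a)$, since $f'(a)=0$.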
A natural question at this point is whether there exists a nonconstant function $f$ such that $f$ is differentiable in $a$ and $f(x)=f(a)$ infinitely often for $x$ near $a$. The answer is yes, and the function $\phi$ given in the example above (with $a=0$) satisfies these properties. (Also, the Wikipedia page on the chain rule gives the function $f(x)=x^2\sin(\frac{1}{x})$ for $x\neq 0$ and $f(0)=0$ as an example. The result above does not apply to this function, since $f'(x)=2x\sin(\frac{1}{x})-\cos(\frac{1}{x})$ has no limit as $x\to 0$. It is nevertheless differentiable in $0$ with $f'(0)=0$, because $\frac{f(h)-f(0)}{h}=h\sin(\frac{1}{h})\to 0$ as $h\to 0$, so it pinpoints the failure of the 'wrong proof' in the same way as $\phi$.)
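To make the claim about $\phi$ concrete, one possible choice of points (my own choice, any sequence of zeros of $\sin(\frac{1}{x})$ tending to $0$ would do) is $x_n=\frac{1}{n\pi}$ for $n\geq 1$. Then $x_n\neq 0$, $x_n\to 0$ and $$\phi(x_n)=\frac{1}{n^3\pi^3}\sin(n\pi)=0=\phi(0),$$ while $\phi$ is not constant, since for instance $$\phi\left(\tfrac{2}{\pi}\right)=\frac{8}{\pi^3}\sin\left(\tfrac{\pi}{2}\right)=\frac{8}{\pi^3}\neq 0.$$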
In general, let $f$ be such a function (thus $\forall \delta>0:\exists x\neq a: |x-a|<\delta$ and $f(x)=f(a)$). If $\lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{x-a}$ exists, then we can compute this limit along any sequence $x_n\to a$ with $x_n\neq a$. For each $n\geq 1$, there exists an $x_n\neq a$ such that $|x_n-a|<\frac{1}{n}$ and $f(x_n)=f(a)$. It follows that \begin{eqnarray} \lim_{x\to a}\frac{g\circ f(x)-g\circ f(a)}{x-a}&=&\lim_{n\to \infty}\frac{g\circ f(x_n)-g\circ f(a)}{x_n-a}\\ &=& \lim_{n\to \infty}\frac{g\circ f(a)-g\circ f(a)}{x_n-a}\\ &=& 0. \end{eqnarray}
This shows that if $f$ is a function for which the 'wrong proof' of the chain rule fails, then $(g\circ f)'(a)=0$. Of course, I was only able to show this under the assumption that $(g\circ f)'(a)$ actually exists (which of course is true, as one can actually prove the chain rule). Nonetheless, this raises the question whether there is a more direct way of showing that $(g\circ f)'(a)$ exists (and equals zero) if $f$ is a function for which the 'wrong proof' fails. If so, one can fix this 'wrong proof' by considering two cases.