In Theorem 5.5, Rudin proves the chain rule, but in a somewhat different way than I expected. It seems the chain rule can be proved more directly.
Theorem: Suppose $f:[a,b]\to\mathbb{R}$ is continuous on $[a,b]$, $f'(x)$ exists at some point $x\in[a,b]$, $g:I\to\mathbb{R}$ is defined on an interval $I$ that contains the range of $f$, and $g$ is differentiable at $f(x)$. If we define $h(t)=g(f(t))$ for $t\in[a,b]$, then $h$ is differentiable at $x$ and $h'(x)=g'(f(x))f'(x)$.
Proof: If we write down $g'(f(x))f'(x)$, it looks like we're done if we can just express $g'$ in terms of the same limits as $f'$, i.e., $\lim_{t\to x} g(f(t))=\lim_{y\to f(x)} g(y)$. Showing this seems equivalent to Theorem 4.7, that $h$ is continuous at $x$, but I'll write it for the sake of completeness:
$\lim_{y\to f(x)}g(y)=g(f(x))$, so $\forall \epsilon>0\ \exists\delta>0$ s.t. $\forall y\in I$ with $d(y,f(x))<\delta$, $d(g(y),g(f(x)))<\epsilon$.
Since $f$ is continuous at $x$, $\exists \eta>0$ s.t. $\forall t\in[a,b]$ with $d(t,x)<\eta$, $d(f(t),f(x))<\delta$, which means $d(g(f(t)),g(f(x)))<\epsilon$.
So then $\lim_{t\to x} g(f(t)) = \lim_{y\to f(x)}g(y)$.
Then we have $g'(f(x))\cdot f'(x) = \lim_{y\to f(x)}\frac{g(y)-g(f(x))}{y-f(x)}\cdot\lim_{t\to x}\frac{f(t)-f(x)}{t-x}$ $=\lim_{t\to x}\frac{g(f(t))-g(f(x))}{f(t)-f(x)}\cdot\lim_{t\to x}\frac{f(t)-f(x)}{t-x}$ $=\lim_{t\to x}\frac{g(f(t))-g(f(x))}{t-x} = h'(x)$.
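For readability, here is the same computation laid out as a display. Nothing new is claimed; the one thing made explicit is an assumption the second line appears to rely on: substituting $y=f(t)$ presumes $f(t)\neq f(x)$ for all $t\neq x$ sufficiently close to $x$, which is not among the hypotheses of the theorem.

$$
\begin{aligned}
g'(f(x))\cdot f'(x)
&= \lim_{y\to f(x)}\frac{g(y)-g(f(x))}{y-f(x)}\cdot\lim_{t\to x}\frac{f(t)-f(x)}{t-x} \\
&= \lim_{t\to x}\frac{g(f(t))-g(f(x))}{f(t)-f(x)}\cdot\lim_{t\to x}\frac{f(t)-f(x)}{t-x}
\qquad\text{(assumes } f(t)\neq f(x)\text{ near } x\text{)} \\
&= \lim_{t\to x}\frac{g(f(t))-g(f(x))}{t-x} \\
&= h'(x).
\end{aligned}
$$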
Is there an issue with this approach that I'm not seeing?