Suppose I have a scalar function $f(\mathbf{x}(t))$, where $\mathbf{x}(t)$ is a vector-valued function of the scalar $t$. Assuming the derivative of $f$ with respect to $t$ exists, I think it can be written, using the chain rule, as
$$\frac{df}{dt} = \frac{df}{d\mathbf{x}^{\mathrm{T}}} \frac{d\mathbf{x}}{dt} = \nabla_{\mathbf{x}} f \frac{d\mathbf{x}}{dt}$$
where $\nabla_{\mathbf{x}} f = \frac{df}{d\mathbf{x}^{\mathrm{T}}}$ is the gradient of $f$ (I've abused the derivative notation a little here but I think this is commonly used in vector calculus so should be ok).
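As a quick sanity check of the formula (the example is chosen purely for illustration and is not part of the problem I'm actually working on), take $f(\mathbf{x}) = x_1^2 + x_2^2$ and $\mathbf{x}(t) = (\cos t, \sin t)^{\mathrm{T}}$. Then
$$\frac{df}{d\mathbf{x}^{\mathrm{T}}} \frac{d\mathbf{x}}{dt} = \begin{pmatrix} 2 x_1 & 2 x_2 \end{pmatrix} \begin{pmatrix} -\sin t \\ \cos t \end{pmatrix} = -2 \cos t \sin t + 2 \sin t \cos t = 0,$$
which agrees with differentiating $f(\mathbf{x}(t)) = \cos^2 t + \sin^2 t = 1$ directly.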
I want to derive this result "rigorously" using limits, but I'm getting stuck. Here's my attempt. Starting with the outer function, $f$ is differentiable at $\mathbf{x}(t)$ if
$$f(\mathbf{x}(t + \Delta t)) = f(\mathbf{x}(t)) + \frac{df}{d\mathbf{x}^{\mathrm{T}}} \Delta \mathbf{x}(t) + o(||\Delta \mathbf{x}(t)||), \tag{1}\label{eq:1}$$
where $o(||\Delta \mathbf{x}(t)||)$ is a function such that $\lim_{\Delta \mathbf{x} \rightarrow \mathbf{0}} o(||\Delta \mathbf{x}(t)||) / ||\Delta \mathbf{x}(t)|| = 0$ and $\Delta \mathbf{x}(t) = \mathbf{x}(t + \Delta t) - \mathbf{x}(t)$. Now moving on to the inner function, $\mathbf{x}$ is differentiable at $t$ if
$$\mathbf{x}(t + \Delta t) = \mathbf{x}(t) + \frac{d\mathbf{x}}{dt} \Delta t + o(\Delta t),\tag{2}\label{eq:2}$$
where $o(\Delta t)$ is a function such that $\lim_{\Delta t \rightarrow 0} o(\Delta t) / \Delta t = 0$. This definition means I can also write
$$\Delta \mathbf{x}(t) = \frac{d\mathbf{x}}{dt} \Delta t + o(\Delta t). \tag{3}\label{eq:3}$$
Substituting this into the second term of \eqref{eq:1} I obtain
$$f(\mathbf{x}(t + \Delta t)) = f(\mathbf{x}(t)) + \frac{df}{d\mathbf{x}^{\mathrm{T}}} \frac{d\mathbf{x}}{dt} \Delta t + \frac{df}{d\mathbf{x}^{\mathrm{T}}} o(\Delta t) + o(||\Delta \mathbf{x}(t)||).\tag{4}\label{eq:4}$$
If the last two terms together are $o(\Delta t)$ (which I'm under the impression they should be, e.g. from this post), then I obtain a limit definition for the derivative of $f(\mathbf{x}(t))$ with respect to $t$. I can't see how they simplify, though. Is it just that they disappear in the limit as $\Delta t \rightarrow 0$ after dividing by $\Delta t$? It's easy to see that $\frac{df}{d\mathbf{x}^{\mathrm{T}}} o(\Delta t) / \Delta t$ goes to zero, because $\frac{df}{d\mathbf{x}^{\mathrm{T}}}$ does not depend on $\Delta t$, but it's not obvious to me that $o(||\Delta \mathbf{x}(t)||) / \Delta t$ also goes to zero.
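For what it's worth, here is a sketch of one way the last term might be handled; I'm not sure it is airtight. Assuming $\Delta \mathbf{x}(t) \neq \mathbf{0}$ for the $\Delta t$ in question, I can factor
$$\frac{o(||\Delta \mathbf{x}(t)||)}{\Delta t} = \frac{o(||\Delta \mathbf{x}(t)||)}{||\Delta \mathbf{x}(t)||} \cdot \frac{||\Delta \mathbf{x}(t)||}{\Delta t}.$$
By \eqref{eq:3}, $\Delta \mathbf{x}(t) \rightarrow \mathbf{0}$ as $\Delta t \rightarrow 0$, so the first factor tends to zero, while the absolute value of the second factor tends to $||d\mathbf{x}/dt||$ and is therefore bounded near $\Delta t = 0$. The product would then tend to zero. If instead $\Delta \mathbf{x}(t) = \mathbf{0}$ for some $\Delta t$, then $f(\mathbf{x}(t + \Delta t)) = f(\mathbf{x}(t))$ and the linear term in \eqref{eq:1} vanishes, so the remainder term in \eqref{eq:1} is exactly zero there and that case seems harmless.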
I'd love some help, and I'd also appreciate feedback on whether my derivation otherwise makes sense and is appropriately written. I'm not a mathematician, so this kind of formal treatment is a bit alien to me. Any tips are much appreciated. Thanks.