If $V,W$ are Banach spaces (for example $\Bbb{R}^n,\Bbb{R}^m$), $f:V\to W$ is differentiable, and $a\in V$, I shall use $Df_a$ to mean the element of $L(V,W)$ such that $\lVert f(a+h)-f(a)-Df_a(h)\rVert/\lVert h\rVert \to 0$ as $h\to 0$. If the domain $V$ is $\Bbb{R}$, then I'll use $f'(a)$ to mean the limit $\lim\limits_{h\to 0}\dfrac{f(a+h)-f(a)}{h}$. It is easy to verify directly that when the domain is $\Bbb{R}$, $f'(a)$ exists if and only if $Df_a\in L(\Bbb{R},W)$ exists, and the relationship between the two is $f'(a)=Df_a(1)$, i.e. the linear map $Df_a:\Bbb{R}\to W$ evaluated on the basis vector $1\in\Bbb{R}$. (What we're doing here is using the fact that $W$ and $L(\Bbb{R},W)$ are isomorphic vector spaces; moreover, if you use the operator norm, they are isometrically isomorphic Banach spaces.)
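To make the identification concrete, here is a small illustration (the example curve is my own choice, with $W=\Bbb{R}^2$). When the domain is $\Bbb{R}$, linearity gives $Df_a(h)=Df_a(h\cdot 1)=h\,Df_a(1)=h\,f'(a)$ for every $h\in\Bbb{R}$. So for $f(t)=(\cos t,\sin t)$ we have
\begin{align}
f'(a)&=(-\sin a,\cos a)\in \Bbb{R}^2, & Df_a(h)&=h\,(-\sin a,\cos a),
\end{align}
i.e. $f'(a)$ is a vector in $W$, while $Df_a$ is the linear map $\Bbb{R}\to W$ which scales that vector.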
Now, let $y:\Bbb{R}\to V$ and $f:V\to W$ be infinitely differentiable functions, and consider the composite $f\circ y:\Bbb{R}\to W$. We now use the chain rule to calculate its derivatives:
\begin{align}
(f\circ y)'(t)&=D(f\circ y)_t[1]\\
&= (Df_{y(t)}\circ Dy_t)[1]\\
&= Df_{y(t)}[Dy_t[1]]\\
&= Df_{y(t)}[y'(t)]\tag{$*$}
\end{align}
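For a concrete instance of $(*)$, suppose (purely for illustration) that $V=\Bbb{R}^n$ and $W=\Bbb{R}$ with the standard coordinates, and write $y=(y_1,\dots,y_n)$. Then $Df_{y(t)}$ acts on a vector by pairing it with the partial derivatives of $f$, so $(*)$ reads
\begin{align}
(f\circ y)'(t)&=\sum_{i=1}^n \frac{\partial f}{\partial x_i}(y(t))\,y_i'(t)=\nabla f(y(t))\cdot y'(t),
\end{align}
which is the familiar multivariable chain rule.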
Now, how do we continue calculating the derivatives? Well, this looks very much like a product rule type of situation. But the product rule is really just a special case of the chain rule (see the link); let's consider the "evaluation map" $\omega: L(V,W)\times V\to W$, defined by $(T,v)\mapsto T(v)$, i.e. you take a linear map and a vector and just evaluate. This is easily verified to be a bilinear mapping, and it is continuous because $\lVert T(v)\rVert\le \lVert T\rVert\,\lVert v\rVert$. Using this, we can write
\begin{align}
(f\circ y)'(t)&= \omega\bigg(Df_{y(t)}, y'(t)\bigg) = \omega\bigg((Df\circ y)(t), y'(t)\bigg)
\end{align}
i.e. on the RHS we have an expression of the form $\omega(u(t), v(t))$; the derivative of such an expression at $t$ is (see the link for the precise statement) given by "differentiate the first entry and keep the second, plus keep the first and differentiate the second". So, if we do this, then we get
\begin{align}
(f\circ y)''(t)&= \omega\bigg((Df\circ y)'(t), y'(t)\bigg) + \omega\bigg((Df\circ y)(t), y''(t)\bigg)\\
&=\omega\bigg(D(Df\circ y)_t[1], y'(t)\bigg) + \omega\bigg(Df_{y(t)}, y''(t)\bigg)\\
&=\omega\bigg(D^2f_{y(t)}[Dy_t[1]], y'(t)\bigg)+\omega\bigg(Df_{y(t)}, y''(t)\bigg)\\
&=\omega\bigg(D^2f_{y(t)}[y'(t)], y'(t)\bigg)+ \omega\bigg(Df_{y(t)}, y''(t)\bigg)
\end{align}
So, all I did was use the chain rule and the relation between $Dy$ and $y'$ explained in my first paragraph. Now, let us recall that $\omega$ is nothing but the evaluation map, so
\begin{align}
(f\circ y)''(t)&= \bigg(D^2f_{y(t)}[y'(t)]\bigg)[y'(t)] + Df_{y(t)}[y''(t)]
\end{align}
Here, $D^2f_{y(t)}$ is an element of $L(V,L(V,W))$, which, if you recall your linear algebra, is isomorphic to the space of bilinear maps $L^2(V,V;W)$. Now, if we abuse notation by writing $D^2f_{y(t)}$ for both the element of $L(V,L(V,W))$ and its image in $L^2(V,V;W)$ (in fact $D^2f_{y(t)}$ is symmetric in its two arguments, so it actually lies in $L^2_{\text{sym}}(V,V;W)$), then the above equation says
\begin{align}
(f\circ y)''(t)&= D^2f_{y(t)}[y'(t), y'(t)] + Df_{y(t)}[y''(t)]\tag{$**$}
\end{align}
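If it helps to see $(**)$ in coordinates, assume again (only as an illustration) that $V=\Bbb{R}^n$, $W=\Bbb{R}$ and $y=(y_1,\dots,y_n)$. Then $D^2f_{y(t)}$ is the bilinear form given by the Hessian matrix of $f$, and
\begin{align}
(f\circ y)''(t)&=\sum_{i,j=1}^n \frac{\partial^2 f}{\partial x_i\,\partial x_j}(y(t))\,y_i'(t)\,y_j'(t)+\sum_{i=1}^n \frac{\partial f}{\partial x_i}(y(t))\,y_i''(t).
\end{align}
In particular, for $n=1$ this collapses to $(f\circ y)''(t)=f''(y(t))\,(y'(t))^2+f'(y(t))\,y''(t)$.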
With some practice, going from $(*)$ to $(**)$ becomes very easy, and you won't even have to explicitly write down the evaluation maps and the various compositions. Similarly, we can continue differentiating by introducing more evaluation maps; for example, the first term of $(**)$ can be viewed through the evaluation map $L^2_{\text{sym}}(V,V;W) \times V\times V\to W$ defined by $(T,v_1,v_2)\mapsto T(v_1,v_2)$. This is now a trilinear map, so differentiating it is again a type of product rule (differentiate the first and keep the other two; keep the first, differentiate the second, and keep the third; keep the first and second and differentiate the third).
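Written out as a formula, this trilinear product rule looks as follows (a sketch; the names $\mu$, $T$, $u$, $v$ are introduced just for this illustration). If $\mu$ is a continuous trilinear map and $T,u,v$ are differentiable curves into its three factors, then
\begin{align}
\frac{d}{dt}\,\mu\big(T(t),u(t),v(t)\big)&=\mu\big(T'(t),u(t),v(t)\big)+\mu\big(T(t),u'(t),v(t)\big)+\mu\big(T(t),u(t),v'(t)\big).
\end{align}
Taking $\mu$ to be the evaluation map, $T=D^2f\circ y$ and $u=v=y'$, and using $T'(t)=D^3f_{y(t)}[y'(t)]$ from the chain rule, this gives
\begin{align}
\frac{d}{dt}\,D^2f_{y(t)}[y'(t),y'(t)]&=D^3f_{y(t)}[y'(t),y'(t),y'(t)]+D^2f_{y(t)}[y''(t),y'(t)]+D^2f_{y(t)}[y'(t),y''(t)].
\end{align}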
But actually, in this specific case, because of the occurrence of the double evaluation $[y'(t), y'(t)]$, and by symmetry of $D^2f_{y(t)}$, you can immediately pull out a factor of $2$ (like how $\frac{d}{dx}x^2 = 2x$). In any case, I leave the details to you; you should find (after identifying third derivatives with the relevant symmetric trilinear map etc)
\begin{align}
(f\circ y)'''(t)&=D^3f_{y(t)}[y'(t), y'(t), y'(t)] + 2D^2f_{y(t)}[y'(t), y''(t)]\\
&+ D^2f_{y(t)}[y'(t),y''(t)] + Df_{y(t)}[y'''(t)]\\
&= D^3f_{y(t)}[y'(t),y'(t), y'(t)] + 3D^2f_{y(t)}[y'(t), y''(t)] + Df_{y(t)}[y'''(t)]
\end{align}
Or if we shorten notation slightly, and write $(y'(t))^3$ to mean the element $(y'(t),y'(t),y'(t))\in V^3$, then the equation reads
\begin{align}
(f\circ y)'''(t)&= D^3f_{y(t)}[(y'(t))^3] + 3D^2f_{y(t)}[y'(t), y''(t)] + Df_{y(t)}[y'''(t)]\tag{$***$}
\end{align}
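As a quick check of $(***)$, take the scalar case $V=W=\Bbb{R}$ (an assumption made only for this check), where $D^kf_x[h_1,\dots,h_k]=f^{(k)}(x)\,h_1\cdots h_k$, so that $(***)$ becomes $f'''(y)\,(y')^3+3f''(y)\,y'\,y''+f'(y)\,y'''$. With the example choice $f(x)=e^x$, $y(t)=t^2$ (so $y'=2t$, $y''=2$, $y'''=0$), the formula gives
\begin{align}
(f\circ y)'''(t)&=e^{t^2}(2t)^3+3e^{t^2}(2t)(2)+0=(8t^3+12t)\,e^{t^2},
\end{align}
which agrees with differentiating $e^{t^2}$ three times directly: $2t\,e^{t^2}$, then $(2+4t^2)\,e^{t^2}$, then $(12t+8t^3)\,e^{t^2}$.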
We can continue once again (define relevant evaluation maps, use product rule, chain rule, use symmetry of higher derivatives, identify higher derivatives with relevant multilinear maps etc):
\begin{align}
(f\circ y)^{(4)}(t)&= D^4f_{y(t)}[(y'(t))^4] + 3D^3f_{y(t)}[(y'(t))^2, y''(t)]\\
&+ 3\bigg(D^3f_{y(t)}[y'(t), y'(t), y''(t)] + D^2f_{y(t)}[y''(t), y''(t)] +
D^2f_{y(t)}[y'(t), y'''(t)]\bigg)\\
&+ D^2f_{y(t)}[y'(t), y'''(t)] + Df_{y(t)}[y^{(4)}(t)]
\end{align}
Hopefully it's clear how I got each term from the previous differentiation, and the notation is clear; for example, $y^{(4)}(t)$ means $y''''(t)$, while in the first line $[(y'(t))^2, y''(t)]$ means the element $[y'(t), y'(t), y''(t)]\in V^3$. If we now combine all the like terms, we get
\begin{align}
(f\circ y)^{(4)}(t)&= D^4f_{y(t)}[(y'(t))^4] + 6 D^3f_{y(t)}[(y'(t))^2, y''(t)] + 3D^2f_{y(t)}[(y''(t))^2]\\
& + 4 D^2f_{y(t)}[y'(t), y'''(t)] + Df_{y(t)}[y^{(4)}(t)].\tag{$****$}
\end{align}
This is a rather complicated mess, so let's do some sanity checks. The first term involves $D^4f_{y(t)}$; this should be a $4^{th}$-order multilinear map, so the fact that it is evaluated on $[(y'(t))^4]=[y'(t),y'(t),y'(t),y'(t)]\in V^4$ makes sense. The next term is $6D^3f_{y(t)}[(y'(t))^2,y''(t)]$; this involves $D^3f$, so the fact that it is evaluated on the $3$-tuple of elements $[y'(t),y'(t),y''(t)]\in V^3$ makes sense, and so on for the other terms: each of the $D^2f$ terms is evaluated on a pair of vectors, and the $Df$ term is evaluated on a single vector.
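Here is one more check you can run by hand, this time on the coefficients (the example is my own choice: $V=W=\Bbb{R}$, $f(x)=e^x$, $y(t)=e^t$). In the scalar case $D^kf_x[h_1,\dots,h_k]=f^{(k)}(x)\,h_1\cdots h_k$, every derivative of $y$ is $e^t$, and every derivative of $f$ at $y(t)$ is $g(t):=e^{e^t}$. So $(****)$ predicts
\begin{align}
(f\circ y)^{(4)}(t)&=g(t)\Big(e^{4t}+6e^{3t}+3e^{2t}+4e^{2t}+e^{t}\Big)=g(t)\Big(e^{4t}+6e^{3t}+7e^{2t}+e^{t}\Big).
\end{align}
Differentiating $g$ directly gives
\begin{align}
g'&=e^{t}g, & g''&=(e^{t}+e^{2t})\,g,\\
g'''&=(e^{t}+3e^{2t}+e^{3t})\,g, & g''''&=(e^{t}+7e^{2t}+6e^{3t}+e^{4t})\,g,
\end{align}
which matches. At $t=0$ both sides equal $15e$, reflecting the fact that the coefficients in $(****)$ add up to $1+6+3+4+1=15$.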