So in the book Analysis on Manifolds by Munkres it's said that the directional derivative is not a good candidate for the generalization of the notion of the derivative of a function defined on $\mathbb{R}^m$.
Definition (1): Let $A \subseteq \mathbb{R}^m$ and let $f : A \to \mathbb{R}^n$. Suppose $A$ contains a neighborhood of $a$. Given $u \in \mathbb{R}^m$ with $u \neq 0$, define $${f'}_a(u) = \lim_{t \to 0}\frac{f(a+tu)-f(a)}{t}$$ provided the limit exists. This limit is called the directional derivative of $f$ at $a$ with respect to $u$.
Munkres states that, with this definition, composites of differentiable functions need not be differentiable. He goes on to say that the right generalization is given by the following definition:
Definition (2): Let $A \subseteq \mathbb{R}^m$ and let $f : A \to \mathbb{R}^n$. Suppose $A$ contains a neighborhood of $a$. We say that $f$ is differentiable at $a$ if there exists an $n \times m$ matrix $B$ such that $$\lim_{h \to 0} \frac{f(a+h)-f(a) - B\cdot h}{|h|} = 0.$$ The matrix $B$ is called the derivative of $f$ at $a$.
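To make the remark about composites concrete, here is a sketch of a standard example of this kind. The particular functions $f$ and $g$ are chosen by me for illustration and are not necessarily the ones used in the book. Take $f : \mathbb{R}^2 \to \mathbb{R}$ with $f(0,0) = 0$ and $$f(x,y) = \frac{x^2 y}{x^4 + y^2} \quad \text{for } (x,y) \neq (0,0).$$ For $u = (h,k) \neq 0$ we get $$\frac{f(0+tu)-f(0)}{t} = \frac{h^2 k}{t^2 h^4 + k^2} \;\longrightarrow\; \begin{cases} h^2/k & \text{if } k \neq 0, \\ 0 & \text{if } k = 0, \end{cases} \quad \text{as } t \to 0,$$ so every directional derivative ${f'}_0(u)$ exists in the sense of Definition (1). Now let $g : \mathbb{R} \to \mathbb{R}^2$, $g(t) = (t, t^2)$, which also has all directional derivatives at $0$. Then $$(f \circ g)(t) = \frac{t^4}{t^4 + t^4} = \frac{1}{2} \quad \text{for } t \neq 0, \qquad (f \circ g)(0) = 0,$$ so $f \circ g$ is not even continuous at $0$ and the limit in Definition (1) does not exist for it. The same computation shows that $f$ is not continuous at the origin, so it cannot be differentiable there in the sense of Definition (2).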
Now in the book Differential Topology by Guillemin and Pollack (and even Topology from the Differentiable Viewpoint by Milnor), the definition of the "derivative" that's used is Definition $(1)$ above. They basically go on to formulate Differential Topology using this definition of the derivative.
In particular, they claim that the chain rule holds for Definition $(1)$ of the derivative. I don't see how this can be possible, though, given what Munkres says in his book.
Furthermore, if Munkres is right that Definition $(1)$ is not the correct generalization of the derivative, why have Milnor and Guillemin and Pollack taken the incorrect generalization and (as it seems to me) built Differential Topology on it? (I hope that is not too strong a thing to say, but they do define tangent spaces using this seemingly incorrect generalization of the derivative.)
Who is correct then, Munkres, or G&P and Milnor?