Say $\mathbf{v}$ is a unit vector and $f:\mathbb{R}^n\to\mathbb{R}$ a scalar function.
The directional derivative of $f$ at $\mathbf{x}$ in the direction of $\mathbf{v}$ is
$$ D_{\mathbf{v}}f(\mathbf{x}) = \lim_{h\to0} \frac{f(\mathbf{x}+h\mathbf{v})-f(\mathbf{x})}{h}. $$
If we interpret $f(\mathbf{x}+h\mathbf{v})$ as a function of $h$ with $\mathbf{x},\mathbf{v}$ fixed, this is
$$\begin{array}{rl} \displaystyle \frac{\mathrm{d}}{\mathrm{d}h}f(\mathbf{x}+h\mathbf{v}) &= \displaystyle\frac{\partial f}{\partial x_1}\frac{\partial (x_1+hv_1)}{\partial h}+\cdots+\frac{\partial f}{\partial x_n}\frac{\partial(x_n+hv_n)}{\partial h} \\[5pt] & \displaystyle = \frac{\partial f}{\partial x_1}v_1+\cdots+\frac{\partial f}{\partial x_n}v_n \end{array} $$
at $h=0$ (so all the partials $\partial f/\partial x_i$ are evaluated at $\mathbf{x}$) by the multivariable chain rule, which applies whenever $f$ is differentiable at $\mathbf{x}$.
This is just the dot product $D_{\mathbf{v}}f(\mathbf{x})=\nabla f(\mathbf{x})\cdot \mathbf{v}$ where $\nabla f$ is the gradient.
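
As a quick sanity check of $D_{\mathbf{v}}f(\mathbf{x})=\nabla f(\mathbf{x})\cdot\mathbf{v}$, here is a minimal numerical sketch. The function `f`, the point `x` and the direction `v` are arbitrary choices made for illustration; they do not come from the discussion above. The difference quotient from the definition is evaluated at a small but nonzero $h$, so it only approximates the limit.

```python
import numpy as np

def f(x):
    # Hypothetical example: f(x, y) = x^2 * y + sin(y)
    return x[0] ** 2 * x[1] + np.sin(x[1])

def grad_f(x):
    # Gradient of the example function, computed by hand
    return np.array([2 * x[0] * x[1], x[0] ** 2 + np.cos(x[1])])

def difference_quotient(f, x, v, h=1e-6):
    # The quotient (f(x + h v) - f(x)) / h from the definition, at a small fixed h
    return (f(x + h * v) - f(x)) / h

x = np.array([1.0, 2.0])
v = np.array([3.0, 4.0]) / 5.0   # unit vector

fd = difference_quotient(f, x, v)
exact = grad_f(x) @ v            # gradient dotted with the direction

print(fd, exact)                 # the two values should agree to several digits
```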
Rearranging, this may be written as
$$ \frac{f(\mathbf{x}+h\mathbf{v})-f(\mathbf{x})-\nabla f(\mathbf{x})\cdot(h\mathbf{v})}{h}\to0 \quad \textrm{as }h\to0. $$
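With the substitution $\mathbf{h}=h\mathbf{v}$ (so that $\|\mathbf{h}\|=|h|$), and provided $f$ is differentiable at $\mathbf{x}$ so that the limit holds uniformly over all directions $\mathbf{v}$, this becomes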
$$\frac{f(\mathbf{x}+\mathbf{h})-f(\mathbf{x})-\nabla f(\mathbf{x})\cdot\mathbf{h}}{\|\mathbf{h}\|}\to0 \quad \textrm{as }\|\mathbf{h}\|\to0. $$
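
The limit above can be probed numerically. The sketch below reuses the same hypothetical example function $f(x,y)=x^2y+\sin y$ (an assumption chosen for illustration) and, for several shrinking values of $\|\mathbf{h}\|$, records the largest remainder ratio found over a sample of random directions. If the limit holds, those values should shrink along with $\|\mathbf{h}\|$.

```python
import numpy as np

def f(x):
    # Hypothetical example: f(x, y) = x^2 * y + sin(y)
    return x[0] ** 2 * x[1] + np.sin(x[1])

def grad_f(x):
    # Gradient of the example function, computed by hand
    return np.array([2 * x[0] * x[1], x[0] ** 2 + np.cos(x[1])])

def remainder_ratio(x, h_vec):
    # |f(x + h) - f(x) - grad f(x) . h| / ||h||
    return abs(f(x + h_vec) - f(x) - grad_f(x) @ h_vec) / np.linalg.norm(h_vec)

x = np.array([1.0, 2.0])
rng = np.random.default_rng(0)
scales = [1e-1, 1e-2, 1e-3, 1e-4]   # values of ||h||

ratios = []
for scale in scales:
    worst = 0.0
    for _ in range(100):
        d = rng.normal(size=2)
        d /= np.linalg.norm(d)      # random unit direction
        worst = max(worst, remainder_ratio(x, scale * d))
    ratios.append(worst)

for scale, ratio in zip(scales, ratios):
    print(f"||h|| = {scale:.0e}   worst ratio = {ratio:.3e}")
```

Sampling many directions at each scale, rather than a single fixed $\mathbf{v}$, is what distinguishes this check from the one-direction limit.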
The derivative of $f$ at $\mathbf{x}$ in this case is a vector $\nabla f(\mathbf{x})\in\mathbb{R}^n$ depending on $\mathbf{x}$.
More generally, one can do the same thing for vector-valued functions $f:\mathbb{R}^n\to\mathbb{R}^m$, in which case a linear map $Df(\mathbf{x}):\mathbb{R}^n\to\mathbb{R}^m$, depending on $\mathbf{x}$, is applied to $\mathbf{h}$ in place of the dot product with a vector, and the absolute value in the numerator is replaced by the norm in $\mathbb{R}^m$. (This is a generalization since any linear function $\mathbb{R}^n\to\mathbb{R}$ is just a dot product with some vector.)
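
For a concrete picture of the vector-valued case, here is a small sketch with a hypothetical map $F:\mathbb{R}^2\to\mathbb{R}^3$ (chosen for illustration, not taken from the text above). In coordinates, the linear map $DF(\mathbf{x})$ is the $3\times 2$ matrix of partial derivatives, and applying it to $\mathbf{h}$ is a matrix-vector product.

```python
import numpy as np

def F(x):
    # Hypothetical example map R^2 -> R^3
    return np.array([x[0] * x[1], np.sin(x[0]), x[1] ** 2])

def jacobian_F(x):
    # Matrix of partial derivatives of F, computed by hand (rows = outputs)
    return np.array([
        [x[1],         x[0]],
        [np.cos(x[0]), 0.0],
        [0.0,          2 * x[1]],
    ])

x = np.array([1.0, 2.0])
h = 1e-5 * np.array([0.6, -0.8])   # small displacement with ||h|| = 1e-5

# ||F(x + h) - F(x) - DF(x) h|| / ||h||, which should be small for small h
linear_part = jacobian_F(x) @ h
residual = np.linalg.norm(F(x + h) - F(x) - linear_part) / np.linalg.norm(h)

print(jacobian_F(x).shape, residual)
```

When $m=1$ the matrix has a single row, and multiplying it by $\mathbf{h}$ is exactly the dot product with the gradient.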