I believe the following works. It's a more direct proof inspired by this paper.
Let $\vec{u}=(u_1,u_2)$ be a unit vector and $f:\mathbb{R}^2\rightarrow \mathbb{R}$ a function differentiable at $(a,b)$. The result is easy to prove when either $u_1$ or $u_2$ is zero (see the paper for more info), so I'll assume $u_1,u_2\neq 0$.
$$\frac{\partial f}{\partial \vec{u}}:=\lim_{h\rightarrow 0}\frac{f(a+hu_1,b+hu_2)-f(a,b)}{h}$$
$$=\lim_{h\rightarrow 0}\frac{f(a+hu_1,b+hu_2)-f(a,b+hu_2)}{hu_1}u_1+\lim_{h\rightarrow 0}\frac{f(a,b+hu_2)-f(a,b)}{hu_2}u_2$$
The right-most term is clearly equal to $\partial _yf(a,b)u_2$. Applying the Mean Value Theorem on the left-most term we have
$$=\lim_{h\rightarrow 0}\partial_xf(\alpha(h),b+hu_1)u_1+\partial _yf(a,b)u_2$$
where $\alpha(h)$ is between $a$ and $a+hu_1$, meaning $\alpha(h)\rightarrow a$ as $h\rightarrow 0$. Thus the above is equal to
$$\partial_xf(a,b)u_1+\partial _yf(a,b)u_2$$
as wanted.
The proof can be generalized to other dimensions by expanding $\partial f/\partial \vec{u}$ so that each term varies only in one of its dimensions. For example, for a function $f:\mathbb{R}^3\rightarrow \mathbb{R}$ we would begin the proof by expanding $\partial f/\partial \vec{u}$ as
$$\frac{f(a+hu_1,b+hu_2,c+hu_3)-f(a,b+hu_2,c+hu_3)}{hu_1}u_1
+\frac{f(a,b+hu_2,c+hu_3)-f(a,b,c+hu_3)}{hu_2}u_2
+\frac{f(a,b,c+hu_3)-f(a,b,c)}{hu_3}u_3$$
and we would need to use the Mean Value Theorem in all terms except the last.