I've been working on this problem for a while. Even though this question is quite old, I haven't seen an answer similar to this one, so I'm posting it, mainly because the question comes up rather often (I first saw it as an exercise in Warner's book).
I'm not going to prove that $\alpha'(0)$ exists, but rather that $\alpha$ defines a derivation at $t=0$ that is equal to the Lie bracket. More precisely, for any $f\in \mathfrak{F}(M)$:
$$
\lim_{t\to 0^{+}}\frac{f(\alpha(t))-f(p)}{t}=[X,Y]_{p}(f).
$$
If the derivative did exist, this equation would imply that $\alpha'(0)=[X,Y]_{p}$ but, paraphrasing Warner's book, the $\sqrt{t}$ means that the derivative may fail to exist.
First, define $\beta(t)=\Phi_{-t}^{Y}\circ \Phi_{-t}^{X} \circ \Phi_{t}^{Y} \circ \Phi_{t}^{X}(p)$, so that $\alpha(t)=\beta(\sqrt{t})$ for every sufficiently small $t>0$. Since the square root is a homeomorphism from $[0,\infty)$ to itself, we can substitute $t=s^{2}$, so it suffices to prove that
$$
\lim_{s\to 0^{+}}\frac{f(\alpha(s^{2}))-f(p)}{s^{2}}=\lim_{s\to 0^{+}}\frac{f(\beta(s))-f(p)}{s^{2}}=[X,Y]_{p}(f).
$$
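Both $f(\beta(s))-f(p)$ and $(f\circ\beta)'(s)$ vanish at $s=0$ (the latter because, to first order, the forward flows along $X$ and $Y$ cancel the backward ones: $(f\circ\beta)'(0)=Xf(p)+Yf(p)-Xf(p)-Yf(p)=0$), so L'Hôpital's rule can be applied twice, and the limit can be rewritten as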
$$
\lim_{s\to 0^{+}}\frac{(f\circ \beta)''(s)}{2}=\frac{1}{2}(f\circ \beta)''(0).
$$
So, in the end we have reduced the problem to computing the following second derivative:
$$
\frac{d^{2}}{dt^{2}}\bigg|_{t=0}f(\beta(t))=\frac{d^{2}}{dt^{2}}\bigg|_{t=0}f(\Phi_{-t}^{Y}\circ \Phi_{-t}^{X} \circ \Phi_{t}^{Y} \circ \Phi_{t}^{X}(p)).
$$
A common trick for evaluating this kind of derivative is to "separate the $t$'s". If we define
$$
H(a,b,c,d)=f(\Phi_{a}^{Y}\circ \Phi_{b}^{X} \circ \Phi_{c}^{Y} \circ \Phi_{d}^{X}(p)),\qquad g(t)=H(-t,-t,t,t),
$$
then $(f\circ \beta)''(0)=g''(0)$. By using the chain rule twice, this equals (with every partial derivative of $H$ evaluated at the origin of $\mathbb{R}^{4}$)
$$
g''(0)=H_{aa}(0)+H_{bb}(0)+H_{cc}(0)+H_{dd}(0)+2H_{ab}(0)-2H_{ac}(0)-2H_{ad}(0)-2H_{bc}(0)-2H_{bd}(0)+2H_{cd}(0).
$$
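In case the signs look mysterious, here is a short sketch of where they come from. Write $\varepsilon=(-1,-1,1,1)$ for the derivative of $t\mapsto(-t,-t,t,t)$, and $H_{1},\dots,H_{4}$ for $H_{a},\dots,H_{d}$ (this indexing is only a shorthand I'm introducing here). Then the chain rule gives
$$
g'(t)=\sum_{i=1}^{4}\varepsilon_{i}H_{i}(-t,-t,t,t),\qquad g''(t)=\sum_{i,j=1}^{4}\varepsilon_{i}\varepsilon_{j}H_{ij}(-t,-t,t,t).
$$
Each mixed partial $H_{ij}$ with $i\neq j$ therefore appears twice, with sign $+$ when both variables come from $\{a,b\}$ or both from $\{c,d\}$, and with sign $-$ otherwise.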
There are a lot of second derivatives to compute, but they are relatively easy (I can add an example if needed), and by the end you'll get
$$
\begin{aligned}
g''(0)&=YYf(p)+XXf(p)+YYf(p)+XXf(p)+2XYf(p)-2YYf(p)-2XYf(p)-2YXf(p)-2XXf(p)+2XYf(p) \\
&=2XYf(p)-2YXf(p)=2[X,Y]_{p}(f).
\end{aligned}
$$
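To illustrate how the individual terms are obtained, here is a sketch of one mixed partial. The only fact used is $\frac{d}{ds}h(\Phi_{s}^{V}(q))=(Vh)(\Phi_{s}^{V}(q))$ for a vector field $V$ with flow $\Phi^{V}$ and a smooth function $h$ (the names $V$, $h$, $q$ are just placeholders):
$$
\begin{aligned}
H(0,b,c,0)&=f(\Phi_{b}^{X}(\Phi_{c}^{Y}(p))),\\
\frac{\partial H}{\partial b}(0,0,c,0)&=(Xf)(\Phi_{c}^{Y}(p)),\\
H_{bc}(0)=\frac{\partial}{\partial c}\bigg|_{c=0}(Xf)(\Phi_{c}^{Y}(p))&=Y(Xf)(p)=YXf(p).
\end{aligned}
$$
Note that the field of the inner flow ends up on the left. The pure second derivatives work the same way: for instance $\frac{\partial H}{\partial d}(0,0,0,d)=(Xf)(\Phi_{d}^{X}(p))$, so $H_{dd}(0)=XXf(p)$.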
Now plug this result into the original limit: since that limit equals $\frac{1}{2}g''(0)$, we get
$$
\lim_{t\to 0^{+}}\frac{f(\alpha(t))-f(p)}{t}=[X,Y]_{p}(f).
$$
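As a sanity check (not part of the proof), the limit can be tested numerically on a concrete example. The sketch below assumes $M=\mathbb{R}^{2}$, $X=\partial/\partial x$ and $Y=x\,\partial/\partial y$, whose flows are explicit and whose bracket is $[X,Y]=\partial/\partial y$. The fields, the test function `f` and the point `p` are my own choices for illustration.

```python
import math

# Hypothetical example fields on R^2 (chosen for illustration only):
#   X = d/dx      with flow (x, y) -> (x + t, y)
#   Y = x d/dy    with flow (x, y) -> (x, y + t*x)
# For these fields [X, Y] = d/dy.

def flow_X(t, p):
    x, y = p
    return (x + t, y)

def flow_Y(t, p):
    x, y = p
    return (x, y + t * x)

def beta(t, p):
    # beta(t) = Phi^Y_{-t} o Phi^X_{-t} o Phi^Y_t o Phi^X_t (p)
    return flow_Y(-t, flow_X(-t, flow_Y(t, flow_X(t, p))))

def alpha(t, p):
    # alpha(t) = beta(sqrt(t)), defined for t >= 0
    return beta(math.sqrt(t), p)

def f(p):
    # An arbitrary smooth test function
    x, y = p
    return math.sin(x) * math.exp(y) + x * y

def bracket_f(p):
    # [X, Y]f = df/dy for the fields above
    x, y = p
    return math.sin(x) * math.exp(y) + x

def difference_quotient(t, p):
    # (f(alpha(t)) - f(p)) / t, whose limit as t -> 0+ is the claim
    return (f(alpha(t, p)) - f(p)) / t

if __name__ == "__main__":
    p = (0.7, -0.3)
    print("[X,Y]f(p) =", bracket_f(p))
    for t in (1e-2, 1e-4, 1e-6):
        print("t =", t, " quotient =", difference_quotient(t, p))
```

For these particular fields $\beta(t)=(x_{0},y_{0}+t^{2})$, so $\alpha(t)=(x_{0},y_{0}+t)$ happens to be smooth at $0$. In general only the limit above is guaranteed.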
Hope this helps! If anyone believes there's some error in the proof, I'd like to know, since I haven't seen this argument anywhere else.
EDIT: I'm aware that this solution explicitly uses the chain rule (which the author wanted to avoid), but since it is used in the simplest way I could think of (that is, for a function from $\mathbb{R}^{4}$ to $\mathbb{R}$), I hope it's forgivable.