This question is similar to this one.
I am unsure about the assumptions made in Theorem 6.20 of Apostol's Mathematical Analysis, first edition:
Let $S \subseteq \mathbb{R}^2$ be open and $f\colon S \to \mathbb{R}$. If $D_1 f$, $D_2 f$ and $D_{2,1} f$ are continuous in a neighbourhood of the point $(x_0, y_0) \in S$, then $D_{1,2} f(x_0, y_0)$ exists and is equal to $D_{2,1} f(x_0, y_0)$.
The theorem and proof is basically the same as the one featured in this blog post, also mentioned in the above question.
By definition of partial derivatives, we need to prove that the limit
$$ \lim_{h \to 0} \frac{ D_2 f(x_0 + h, y_0) - D_2 f(x_0, y_0) }{h}$$
exists and is equal to $D_{2,1} f(x_0, y_0)$. For fixed $k$, introducing the function
$$ g_k(t) = f(x_0 + t, y_0 + k) - f(x_0 + t, y_0) $$
lets us write the numerator above as
$$ D_2 f(x_0 + h, y_0) - D_2 f(x_0, y_0) = \lim_{k \to 0} \frac{g_k(h) - g_k(0)}{k}.$$
Since $D_1 f$ exists (we do not require continuity of $D_1 f$) in a neighbourhood around $x_0, y_0$, $g_k$ is differentiable (with the obvious derivative) in a suitably small open interval around $0$. If we require $h$ to be inside this interval, the mean value theorem yields a $\overline h$ between $0$ and $h$ such that
$$ \frac{g_k(h) - g_k(0)}{k} = h \frac{g_k'(\overline h)}{k} = h \frac{D_1 f(x_0 + \overline h, y_0 + k) - D_1 f(x_0 + \overline h, y_0)}{k}.$$
For convenience, define the function (Apostol skips the details of the following steps)
$$ \phi(s) = D_1 f(x_0 + \overline h, y_0 + s) - D_1 f(x_0 + \overline h, y_0),$$
analogous to the above function $g_k$. By the existence (again, not continuity) of $D_{2,1}$ in a neighbourhood around $(x_0, y_0)$, $\phi$ is differentiable in an open interval around $y_0$. For fixed $k$, the mean value theorem then gives us a $\overline y$ between $y_0$ and $y_0 + k$ such that
$$ \frac{D_1 f(x_0 + \overline h, y_0 + k) - D_1 f(x_0 + \overline h, y_0)}{k} = D_{2,1} f(x_0 + \overline h, \overline y). $$
In total we get
$$ \frac{ D_2 f(x_0 + h, y_0) - D_2 f(x_0, y_0) }{h} = \lim_{k \to 0} D_{2,1}f(x_0 + \overline h, \overline y),$$
so the original limit is equal to
$$\lim_{h \to 0} \lim_{k \to 0} D_{2,1}f(x_0 + \overline h, \overline y).$$
So far we have not, as far as I can see, used continuity at all.
We cannot evaluate the limit directly, so introduce the function
$$ F(h) = \lim_{k \to 0} D_{2,1}f(x_0 + \overline h, \overline y), $$
i.e. just the inner limit as a function of $h$. Now by continuity of $D_{2,1}f$ at $(x_0, y_0)$ (that is, not in a neighbourhood around the point, only at the point itself), given $\epsilon > 0$ there is some $\delta > 0$ such that for $(x,y) \in N((x_0, y_0); \delta)$ we have
$$ \lVert D_{2,1} f(x,y) - D_{2,1}f(x_0, y_0) \rVert < \frac{\epsilon}{2}.$$
Choosing $h$ and $k$ such that $\lvert h \rvert, \lvert k \rvert < \delta/2$, the point $(x_0 + \overline h, \overline y)$ lies in this $\delta$-neighbourhood, so that
$$ 0 \leq \lVert D_{2,1} f(x_0 + \overline h, \overline y) - D_{2,1}f(x_0, y_0) \rVert < \frac{\epsilon}{2}.$$
Now taking the inner limit $k \to 0$ we get, by continuity of the norm,
$$ 0 \leq \lVert F(h) - D_{2,1}f(x_0, y_0) \rVert \leq \frac{\epsilon}{2} < \epsilon.$$
As far as I can tell, this does not require continuity of any of the derivatives, but only relies on the definition of the function $F$. This proves the theorem.
Only I cannot see how we use any type of continuity of $D_1 f$ and $D_2 f$, and it seems like only continuity of $D_{2,1} f$ at the point itself is needed.
Apostol does not include a proof of this theorem in the second edition of Mathematical Analysis, but he does mention it (with the same assumptions) on page 360.