An arbitrary ellipse can be parameterised as $x=h+a\cos t,\, y=k+b\cos t$ $(0\leq t\leq 2\pi)$, which is an ellipse centred at $(h,k)$, with axis lengths $a$ and $b$.
Minimising the square of the distance is an equivalent problem to minimising the distance.
Approach 1:
The point $(x_2,y_2)$ lies on the above parametised ellipse.
Then,
$$f=(x_1-h-a\cos t)^2 +(x_2-k-b\sin t)$$
is the distance to any point on the ellipse.
As $d$ is a simple function of $t$ only, the value of $t$ which is minimised when $\dfrac{d f}{d t}=0$.
Approach 2:
The point $p_1=(x_1,y_1)$ can be written as
$$p_1=p_2+\lambda n$$
where $p_2=(x_2,y_2)$ is the point on the ellipse and $n$ is the vector normal to the ellipse. Using the above parameterisation for the ellipse, you obtain the system of nonlinear equations
$$\begin{bmatrix}x_1\\y_1\end{bmatrix}=\begin{bmatrix}h+a\cos t\\k+b\sin t\end{bmatrix}+\lambda \begin{bmatrix}b\cos t\\a\sin t\end{bmatrix},$$
which has to be solved for $t$ and $\lambda$.
Approach 3:
A third way to formulate this problem is as a constrained optimisation problem, so that this is formulated as
$$\min\limits_{x_1,x_2} f = (x_1-x_2)^2+(y_1-y_2)^2$$ subject to $$\frac{(x_2-h)^2}{a^2}+\frac{(y_2-k)^2}{b^2}-1=0.$$
This can be solved using lagrange multipliers to give
$$L(x_1,x_2,\lambda)=(x_1-x_2)^2+(y_1-y_2)^2+\lambda\left(\frac{(x_2-h)^2}{a^2}+\frac{(y_2-k)^2}{b^2}-1\right).$$
The optimal solution can be found by solving the system of equations $$\frac{d L}{d x_1}=0,\,\frac{d L}{d x_2}=0,\,\frac{d L}{d \lambda}=0.$$