Let's see if this helps. Let $\mathbf{x}_1 = (x_1,y_1,z_1)$ and $\mathbf{x}_2 = (x_2,y_2,z_2)$ be two points lying on the circles 1 and 2, respectively. Then, the parametric equations for these points are given by (see here for a derivation):
\begin{align}
\mathbf{x}_1 & = \mathbf{c}_1 + r_1 (\cos \theta \, \mathbf{a} + \sin \theta \,\mathbf{b}),\\
\mathbf{x}_2 & = \mathbf{c}_2 + r_2 (\cos \lambda \, \mathbf{p} + \sin \lambda \,\mathbf{q}),
\end{align} for $\theta, \lambda \in [0,2\pi)$. The way to compute the unit vectors $\mathbf{a}$ and $\mathbf{b}$ and $\mathbf{p}$ and $\mathbf{q}$ from $\mathbf{n}_1$ and $\mathbf{n}_2$ is explained in the link above. Now, the distance between points pertaining to each circle is given by:
$$ d(\mathbf{x}_1,\mathbf{x}_2) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}, $$ which depends exclusively on $\theta$ and $\lambda$. This will become your function to maximize, or more conveniently $d^2$. The set of critical points is then given by the equation:
$$ \partial_\theta d^2 = 0, \quad \partial_\lambda d^2 = 0,$$ i.e., the gradient of $d^2$, with respect to the independent variables, $\theta$ and $\lambda$, must be zero (necessary condition for optimality).
Edit. This is definitely wrong: I don't know if setting $\lambda = \theta$, in order to make $d$ only dependent on one variable, makes any sense, since there's no restriction for doing that (I strongly think that this is not legit).
Hope this helps.
Cheers.