This problem can be solved by the Polya Enumeration Theorem. Ignore
the no spot cards for the moment as they only contribute trivial
symmetries. The setup here is that we have $8\times 4$ slots into
which we distribute assignments to players $A,B,C$ and $D.$ We have
the symmetric group $S_8$ acting on each block of spot cards from the
same suit. This gives the cycle index $$Z(Q) =Z(S_8)^4.$$
It follows that the number of deals where player $A$ receives $a$ spot
cards, player $B$ receives $b$ spot cards and so on is given by
$$[A^a B^b C^c D^d] Z(S_8)^4(A+B+C+D).$$
If all four degrees are at most thirteen we can combine such an
assignment with an assigment of the no spot cards of the remaining
cards, which is given by a simple multinomial coefficient, for a
contribution of
$${20\choose 13-a,13-b,13-c,13-d}
[A^a B^b C^c D^d] Z(S_8)^4(A+B+C+D).$$
It remains to sum these terms from $Z(Q)$ to get the answer, which is
$$800827437699287808.$$
Observe that all of these terms have $a+b+c+d=32.$
Note also that the multinomial corresponds to multiplying $Z(Q)$ by
$a_1^{20},$ representing twenty fixed points for the no spot cards
which are not being permuted.
Here the computation features the recurrence by Lovasz for the cycle index $Z(S_n)$, which is
$$Z(S_n) = \frac{1}{n} \sum_{l=1}^n a_l Z(S_{n-l})
\quad\text{where}\quad
Z(S_0) = 1.$$
This was the Maple code that I used.
with(combinat);
with(numtheory);
pet_cycleind_symm :=
proc(n)
local p, s;
option remember;
if n=0 then return 1; fi;
expand(1/n*add(a[l]*pet_cycleind_symm(n-l), l=1..n));
end;
pet_cycleind_idspots := pet_cycleind_symm(8)^4;
pet_varinto_cind :=
proc(poly, ind)
local subs1, subs2, polyvars, indvars, v, pot, res;
res := ind;
polyvars := indets(poly);
indvars := indets(ind);
for v in indvars do
pot := op(1, v);
subs1 :=
[seq(polyvars[k]=polyvars[k]^pot,
k=1..nops(polyvars))];
subs2 := [v=subs(subs1, poly)];
res := subs(subs2, res);
od;
res;
end;
count :=
proc()
option remember;
local sind, res, term, Ad, Bd, Cd, Dd;
sind := pet_varinto_cind(A+B+C+D, pet_cycleind_idspots);
res := 0;
for term in expand(sind) do
Ad := degree(term, A);
Bd := degree(term, B);
Cd := degree(term, C);
Dd := degree(term, D);
if Ad<=13 and Bd<=13 and Cd<=13 and Dd<= 13 then
res := res + term/A^Ad/B^Bd/C^Cd/D^Dd*
20!/(13-Ad)!/(13-Bd)!/(13-Cd)!/(13-Dd)!;
fi;
od;
res;
end;
The output of the Maple program is as follows.
The timing here was less than one tenth of a second.
> count();
memory used=37195.2MB, alloc=8.3MB, time=436.90
memory used=37197.7MB, alloc=8.3MB, time=436.94
800827437699287808
This matches the value presented at the linked web site.
A somewhat more advanced computation involving suits being distributed is at this MSE link.