In RSA, exponents are relatively small (at most as long as the modulus), so encryption and decryption can be done quickly. Here, the exponent is of the form $2^t$, where $t$ can be very large. For example, consider RSA-2048: exponents have at most 2048 bits. If you set $t=2048$ (that is, the exponent is $2^{2048}$), then solving the puzzle takes only 2048 squarings, roughly the same work as an RSA decryption. But if you set $t=2^{128}$, then one needs $2^{128}$ squarings, which is far beyond current computational power.
But if you can factor $n$, then you know the order of the group, $\phi(n)$, and can reduce the exponent modulo this order: $a^{2^t} \equiv a^{2^t \bmod \phi(n)} \pmod{n}$ for $a$ coprime to $n$.
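To make the difference concrete, here is a minimal Python sketch of both routes. The function names `solve_slow` and `solve_fast` are made up for illustration, and the two Mersenne primes are toy values chosen only because they are known primes; they are far too small and too special for real use.

```python
def solve_slow(a, t, n):
    """Solver without the factorization: t sequential squarings mod n."""
    x = a % n
    for _ in range(t):
        x = x * x % n
    return x


def solve_fast(a, t, p, q):
    """Puzzle creator's shortcut: reduce the exponent 2^t modulo phi(n) first.

    Assumes gcd(a, p*q) == 1, so that Euler's theorem applies.
    """
    n = p * q
    phi = (p - 1) * (q - 1)
    e = pow(2, t, phi)   # 2^t mod phi(n), cheap even for huge t
    return pow(a, e, n)  # one ordinary modular exponentiation


# Toy parameters (illustrative only, NOT secure).
p, q = 2**31 - 1, 2**61 - 1
n = p * q
a, t = 2, 10_000

assert solve_slow(a, t, n) == solve_fast(a, t, p, q)
```

With $t=2^{128}$ the call to `solve_fast` still returns almost instantly, while the loop in `solve_slow` would never finish in practice.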
The similarity with RSA is that both problems are based on exponentiation in the so-called "RSA group" (multiplication mod $pq$), and both rely on the fact that the group order is unknown to an attacker.
For the choice of prime numbers I recommend reading the wiki article. In brief, the primes should have the same size in bits, and each prime should be of the form $p=2q+1$, where $q$ is another prime, to defend against Pollard's $p-1$ method. Also, $p+1$ should have a large prime factor, to defend against Williams' $p+1$ method. If you choose primes randomly of the form $p=2q+1$, then everything will be fine. Actually, if you simply choose random primes of the specified size (as done in the article), then the chance that the modulus is breakable by these methods is negligible.
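As a rough illustration of picking a random prime of the form $p=2q+1$, here is a sketch in Python. The helper names `is_probable_prime` and `random_safe_prime` are hypothetical, the primality check is a plain Miller-Rabin test with random bases, and the sketch does not check the factors of $p+1$; for anything real, use a vetted cryptographic library instead.

```python
import secrets


def is_probable_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test with random bases."""
    if n < 2:
        return False
    for s in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % s == 0:
            return n == s
    # Write n - 1 = d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def random_safe_prime(bits):
    """Return a random prime p = 2q + 1 with q prime and p of the given bit length."""
    while True:
        # Random odd q with exactly bits - 1 bits, so p = 2q + 1 has `bits` bits.
        q = secrets.randbits(bits - 1) | (1 << (bits - 2)) | 1
        p = 2 * q + 1
        if is_probable_prime(q) and is_probable_prime(p):
            return p


p = random_safe_prime(64)  # small size so the demo finishes quickly
print(p, (p - 1) // 2)
```

Primes of this form are sparse, so the loop gets noticeably slower at realistic sizes such as 1024 bits per prime.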