
The pycrypto library in Python can generate random n-bit prime numbers. The syntax I use is as follows:

from Crypto.Util import number
number.getPrime(2048)

The above function has a very impressive performance and returns primes with a very small delay. What is the process used to generate such large primes in such short time periods in this function?

Tabish Mir

1 Answer


The documentation does not state which algorithm is implemented, but the source code does. getPrime calls isPrime, which in turn runs the Rabin-Miller primality test.

  • getPrime generates a random odd $\texttt{N}$-bit number and calls isPrime on it, stepping to the next odd number until a candidate passes:

    number = getRandomNBitInteger(N, randfunc) | 1
    while not isPrime(number, randfunc=randfunc):
        number = number + 2
    
  • isPrime first handles even numbers, then checks the candidate against a precomputed sieve holding the first 10000 primes: a candidate found in the sieve is prime, and a candidate divisible by a sieve prime is composite. Only if neither case applies is the Rabin-Miller test performed.
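The two steps above can be sketched in plain Python. This is a minimal illustration, not the library's code: the names `SMALL_PRIMES`, `miller_rabin`, `is_probable_prime` and `get_prime_sketch` are hypothetical, the short prime list stands in for the library's 10000-prime sieve, and `secrets` is assumed as the randomness source.

```python
import secrets

# Assumption: a short list of small odd primes stands in for the library's sieve.
SMALL_PRIMES = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]


def miller_rabin(n, rounds=10):
    """Return False if n is certainly composite, True if n is probably prime."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def is_probable_prime(n, rounds=10):
    """Evenness check, then trial division, then Miller-Rabin."""
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    for p in SMALL_PRIMES:
        if n == p:
            return True
        if n % p == 0:
            return False
    return miller_rabin(n, rounds)


def get_prime_sketch(bits, rounds=10):
    """Start at a random odd integer with the top bit set and step by 2."""
    candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
    while not is_probable_prime(candidate, rounds):
        candidate += 2  # incremental search; may step just past 2**bits in rare cases
    return candidate
```

Trial division discards most composite candidates cheaply, so the expensive modular exponentiations run on only a small fraction of them.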

The Probability: A composite candidate passes all $k$ Rabin-Miller iterations with probability at most $4^{-k}$, so a value returned by getPrime is a probable prime with confidence at least $$ 1 - \frac{1}{4^k}$$ where $k$ is the number of iterations.

The Number of iterations: The library defines

false_positive_prob=1e-6

and computes $k$ as

k = int(math.ceil(-math.log(false_positive_prob)/math.log(4)))

and from this, the number of iterations in the library is $k=10$.
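As a quick check of that arithmetic, the snippet below evaluates the same expression and the resulting worst-case bounds $4^{-k}$. The variable names `bound_10` and `bound_20` are introduced here only for illustration.

```python
import math

false_positive_prob = 1e-6

# Same expression as in the library: smallest k with 4**-k <= false_positive_prob.
k = int(math.ceil(-math.log(false_positive_prob) / math.log(4)))

bound_10 = 4.0 ** -k   # worst-case bound for k = 10, roughly 9.5e-7
bound_20 = 4.0 ** -20  # worst-case bound for k = 20, roughly 9.1e-13

print(k, bound_10, bound_20)
```

With nine iterations the bound would be $4^{-9} \approx 3.8 \times 10^{-6}$, which exceeds the target, so ten is the smallest count that satisfies it.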

Note that in my undergraduate course we used $k=20$. That makes the worst-case false-positive probability about 1e-12, whereas the library's is 1e-6.

The Complexity: If modular exponentiation by repeated squaring is used, the complexity is $\mathcal{O}(k \log^3 n)$, where $n$ is the number being tested and $k$ is the number of iterations, which determines the error probability.
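For reference, repeated squaring can be sketched as follows. The function name `power_mod` is hypothetical, and in practice Python's built-in three-argument `pow` does the same job. The loop runs once per bit of the exponent, so about $\log n$ modular multiplications; with schoolbook multiplication each costs $\mathcal{O}(\log^2 n)$, which is where the $\log^3 n$ per iteration comes from.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent % modulus by square-and-multiply (exponent >= 0)."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                  # current low bit set: multiply it in
            result = (result * base) % modulus
        base = (base * base) % modulus    # square for the next bit
        exponent >>= 1
    return result
```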

kelalaka
  • This answer feels slightly incomplete. It doesn't mention what isPrime gets run on in the first place (a randomly sampled $n$ bit number). – puzzlepalace Feb 05 '19 at 18:46
  • This generates a prime with a bias: the probability of a prime being selected grows linearly with how far above the previous prime it is. – fgrieu Feb 05 '19 at 19:55
  • @fgrieu this is due to the library's design or Rabin-Miller? – kelalaka Feb 05 '19 at 20:01
  • @kelalaka: this is due to moving to the next candidate prime with number=number+2, rather than drawing a new random number, or moving to the next one with a more sophisticated procedure. – fgrieu Feb 05 '19 at 20:07
  • @fgrieu assuming that getPrime succeeds after considering a relatively small number of candidates (say, no more than 10^100) it means the process selects a prime from the region between a truly random x and x + 2·10^100. For 2048 bit numbers, that's pretty much a uniform distribution, as the relative bias is 2·10^100 / 2^2048 ≈ 10^-516 ≈ 0. A much more serious issue is the 10^-6 possibility of false positives. – Peteris Feb 05 '19 at 20:56
  • @Peteris: The spacing of primes around $p$ is in the order of $\log(p)$, thus the number of candidates tested for a 2048-bit prime is usually in the hundreds, rarely many thousands. I'm not saying that the bias endangers the safety of the key. My point is that it is detectable with a moderate number of private keys (and I'm wondering for public keys). – fgrieu Feb 05 '19 at 22:54
  • I think it's more important to ask how the witness number is generated in isPrime. If it is generated in a predictable way, then it would be possible to construct composite numbers that fail many Miller-Rabin tests. – forest Feb 06 '19 at 04:14
  • @forest The library lets you choose the random function; if none is supplied, Random.new() is used. See line 175 of the source. – kelalaka Feb 06 '19 at 08:03
  • 1e-6 is a worst-case probability that isPrime yields true for any fixed composite integer, and is a fair estimate for a Carmichael number. However these get vanishingly rare when moving to large ones, and for other composites the probability that isPrime yields true is much lower, and tends to decrease when the number gets larger. Therefore the probability that getPrime returns a composite is much lower than 1e-6 (which would be perfectly detectable, and a serious issue). – fgrieu Apr 25 '20 at 12:41