RSA encryption and decryption with multiple prime modulus using CRT

Question

All the information I found on the internet about RSA-CRT encryption/decryption uses only two primes. For my project, I'm interested in doing this with multiple (up to 8) primes.

The general idea is to calculate $d_p = d\bmod(p-1)$, $d_q = d\bmod(q-1)$, and $q_\text{inv} = q^{-1}\bmod p$, where $p$ and $q$ are primes.

Encryption and decryption are based on "logical relations" between $p$ and $q$, and I'm unable to extend them to more than two primes. Can anybody explain how to use multiple primes?

RFC 3447, the IETF republication of PKCS1v2.1, has been on the internet since 2003 and specifies 'multi-prime' RSA (meaning more than 2 factors). Technically it was superseded by RFC8017 for v2.2 in 2016, but this content did not change. — dave_thompson_085, Sep 21 '21 at 22:49

fgrieu · Accepted Answer · 2021-09-21T20:06:09.537

The RSA private-key operation (used for decryption and signature generation) amounts to solving for $x$ the equation $y\equiv x^e\pmod N$, knowing $y$, the factorization of the public modulus $N$ into $k\ge2$ distinct primes $N=r_1\dots r_k$, public exponent $e$ such that $\gcd(e,r_i-1)=1$, and that $0\le x<N$.

For an efficient implementation, we can solve this equation modulo each of the $r_i$; then use the CRT to combine solutions between products of moduli for which we already have a solution, until reaching a solution modulo $N$. The common way, implicit in PKCS#1v2 since version 2.1, is:

- precompute the following quantities $d_i$ (the CRT exponents) and $t_i$ (the CRT inverses/coefficients), e.g. at key generation time, including the results in the private key:
  - for $i\in\{1,\dots,k\}$
    - $d_i\gets e^{-1}\bmod(r_i-1)$, or equivalently $d_i\gets d\bmod(r_i-1)$
  - $m\gets r_1$
  - for $i$ from $2$ to $k$
    - $t_i\gets m^{-1}\bmod r_i$
    - $m\gets m\cdot r_i$
- when needing to use the private key and solve $y\equiv x^e\pmod N$:
  - for $i\in\{1,\dots,k\}$ [note: should be parallelized if possible]
    - $x_i\gets(y\bmod r_i)^{d_i}\bmod r_i$
  - $x\gets x_1$, $m\gets r_1$
  - for $i$ from $2$ to $k$ [loop invariant: $0\le x<m$, $y\equiv x^e\pmod m$ ]
    - $x\gets x+m\cdot((x_i-x\bmod r_i)\cdot t_i\bmod r_i)$
    - $m\gets m\cdot r_i$
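
The two phases above can be sketched in Python as follows. This is a minimal illustration, not a hardened implementation: the function names `crt_precompute` and `crt_private_op` are made up for this sketch, it assumes Python 3.8+ (for `pow` with a negative exponent), and it has no side-channel or fault protection.

```python
def crt_precompute(e, primes):
    """Return (d, t): CRT exponents d[i] and CRT coefficients t[i].

    primes is the list [r_1, ..., r_k]; t[0] is an unused placeholder so
    that indices line up with d (hypothetical layout, for illustration).
    """
    d = [pow(e, -1, r - 1) for r in primes]  # d_i = e^-1 mod (r_i - 1)
    t = [None]                               # no coefficient for r_1
    m = primes[0]
    for r in primes[1:]:
        t.append(pow(m, -1, r))              # t_i = (r_1 ... r_{i-1})^-1 mod r_i
        m *= r
    return d, t


def crt_private_op(y, primes, d, t):
    """Solve y = x^e mod N for x, with N the product of primes."""
    # Independent small exponentiations, one per prime (parallelizable).
    xs = [pow(y % r, di, r) for r, di in zip(primes, d)]
    # Garner-style recombination; invariant: 0 <= x < m and x^e = y mod m.
    x, m = xs[0], primes[0]
    for r, xi, ti in zip(primes[1:], xs[1:], t[1:]):
        x += m * (((xi - x) * ti) % r)
        m *= r
    return x
```

For instance, `d, t = crt_precompute(3, [11, 17, 23])` followed by `crt_private_op(pow(x, 3, 4301), [11, 17, 23], d, t)` returns `x` for any `0 <= x < 4301`.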

Correctness follows from the loop invariant. See this question for attribution. See this other one for how the bitsize of $N$ relates to a maximum reasonable number of primes. This is known as Garner’s algorithm, see the Handbook of Applied Cryptography, section 14.5.2.

Artificially small example with 3 primes:

e=5
r1=931164518537359 r2=944727352543879 r3=982273258722607
N=864102436520313334659779717201860718296307527
d1=558698711122415 d2=566836411526327 d3=785818606978085
                   t2=360227672914825 t3=882117903741868
y=529481440313141057262802385309623737292746309
x1=436496882968258 x2=903092574358267 x3=806961802724
x=710532117316769399313215266414 (when i=2)
x=111222333444555666777888999000000000000000042

Compared to a standard (non-CRT) implementation, the work is reduced by a factor of at most (and close to) $k^2$, assuming modular multiplication costs $\mathcal O(n^2)$ for arguments of $n$ bits. The time saved can be higher, up to a factor of at most (and close to) $k^3$, if the $k$ exponentiations run in parallel on $k$ independent modexp units.
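
A rough cost count behind these factors, under simplifying assumptions: square-and-multiply exponentiation with an $n$-bit modulus and $n$-bit exponent takes about $n$ modular multiplications of cost $\mathcal O(n^2)$ each, all $k$ primes have $n/k$ bits, and the recombination cost is neglected. The constant $c$ is introduced here only for illustration.

$$\begin{aligned}
T_\text{non-CRT} &\approx c\,n^3\\
T_\text{CRT, sequential} &\approx k\cdot c\left(\frac nk\right)^3=\frac{c\,n^3}{k^2}\\
T_\text{CRT, parallel} &\approx c\left(\frac nk\right)^3=\frac{c\,n^3}{k^3}
\end{aligned}$$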

It is critical to make a final check that $y\equiv x^e\pmod N$, and to not disclose $x$ otherwise. Without this precaution, the implementation would be vulnerable to the cardinal "Bellcore" fault attack: D. Boneh, R. A. DeMillo, R. Lipton; On the Importance of Eliminating Errors in Cryptographic Computations (in Journal of Cryptology 14(2), 2001).
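
A minimal sketch of such a final check, assuming the private-key computation is available as a callable. The names `checked_private_op` and `private_op` are hypothetical, and a real implementation would also need to make the check itself resistant to faults.

```python
def checked_private_op(y, e, N, private_op):
    """Run private_op(y) and release x only if x^e mod N equals y.

    private_op is any function computing the RSA private-key operation,
    e.g. a CRT-based one (assumed interface, for illustration only).
    Assumes 0 <= y < N.
    """
    x = private_op(y)
    # Verify with the cheap public-key operation before disclosing x.
    if not (0 <= x < N) or pow(x, e, N) != y:
        raise ValueError("RSA private-key result failed verification")
    return x
```

The check costs one public-key exponentiation, which is small next to the private-key operation when $e$ is small.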

Implementations should be adequately protected from a variety of other attacks, including timing, power analysis, and other side-channel attacks.

The question also mentions encryption, where only the public key $(N,e)$ is known, not the factorization of $N$. Hence, for that RSA public-key operation (also used for signature verification), there is no similar shortcut applying to the computation $y\gets x^e\bmod N$. However, that usually remains cheap compared to the RSA private-key operation, because $e$ is typically small.

Late addition: $e$ must be coprime with each of the $r_i-1$. Typically an odd $e$ is chosen first, then the $r_i$ are chosen to match this condition. The range for $e$ is a subject of debate, see discussion here and here. My opinion is that, implementation attacks aside, and with padding having a security proof (RSA-KEM, RSAES-OAEP, RSASSA-PSS, ISO/IEC 9796-2 schemes 2 or 3), there's no good reason for a minimum $e$ larger than $3$. Also, one will not get fired for using $e=2^{16}+1$: it matches the prescription $2^{16}<e<2^{256}$ of FIPS 186-4; it is a Fermat number $F_i=2^{\left(2^i\right)}+1$, which allows the best efficiency for a given size of $e$; and it is prime, which makes the choice of the $r_i$ slightly easier and from a wider set.
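
The coprimality condition is easy to test when candidate primes are drawn. A small sketch, with the hypothetical helper name `exponent_is_compatible`:

```python
from math import gcd


def exponent_is_compatible(e, primes):
    """True if e is coprime with r - 1 for every prime r in primes."""
    return all(gcd(e, r - 1) == 1 for r in primes)
```

For example, `exponent_is_compatible(5, [11, 17, 23])` is false because $5$ divides $11-1$, so a key generator using $e=5$ would reject the candidate $11$ and draw another prime.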
