RSA: Construct private / public key for given cipher and plain text message

Question

Let's say we are given a classical RSA encryption scheme, but we would like to "reverse" the task:

Given two messages $c, m$ choose $p, q, e$ such that $p, q$ are prime and $c ^ d \equiv m\pmod N$ with $d \cdot e \equiv 1 \pmod{(p - 1)(q - 1)}$; $N = p \cdot q$.

How could one approach this problem? Is it even possible for some large $N$?

Also, does the conditioning improve if we loosen the constraints of classic RSA, i.e. allow $\gcd(e, \operatorname{lcm}(p-1,q-1)) \neq 1$? In another question I read that it is possible to construct message collisions if the parameters are ill-formed, i.e. if the condition $\gcd(e, \operatorname{lcm}(p-1,q-1)) = 1$ is not satisfied. I thought it might be related, though I could not find out how such a collision could be achieved (or whether it is even feasible).

This question is likely related to an IT security competition, which published this exact problem on April 1st. Please do not cheat! — leoluk, Apr 10 '20 at 20:50

fgrieu · Accepted Answer · 2020-06-01T04:29:19.067

Note: klugreuter's answer outlined a different approach to the problem, which I developed here. This obsoletes the following, except perhaps when we want a small $e$.

We can compute $|m^e-c|$ for various small odd $e>1$, and try to factor them (even partially). As soon as we get two distinct prime factors $p$ and $q$ for some $|m^e-c|$, with $\gcd(p-1,e)=1$, $\gcd(q-1,e)=1$, and $m<p\,q$ (and $c<p\,q$ if that's added to the problem statement), we can compute $N$ and a $d$ matching $e$ for this $N$ by one of the methods used in RSA, and that solves the problem. That's at least sometimes feasible by pulling moderate factors using ECM factoring (e.g. using GMP-ECM).
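As a toy illustration of this search, here is a minimal Python sketch. It substitutes plain trial division for ECM, so it only works for tiny inputs; the names `trial_factor` and `find_key`, the factor bound and the range of $e$ are illustrative assumptions, not part of the method as described. It needs Python 3.8+ for `pow(e, -1, n)`.

```python
from math import gcd

def trial_factor(n, bound=10**5):
    """Distinct prime factors of n found by trial division below `bound`."""
    primes = []
    f = 2
    while f < bound and f * f <= n:
        if n % f == 0:
            primes.append(f)
            while n % f == 0:
                n //= f
        f += 1 if f == 2 else 2
    # A leftover cofactor below bound**2 with no factor below bound is prime.
    if 1 < n < bound ** 2:
        primes.append(n)
    return primes

def find_key(m, c, max_e=50):
    """Search small odd e for two usable primes dividing |m^e - c|."""
    for e in range(3, max_e, 2):
        primes = trial_factor(abs(m ** e - c))
        good = [p for p in primes if p > 2 and gcd(p - 1, e) == 1]
        for i, p in enumerate(good):
            for q in good[i + 1:]:
                if m < p * q and c < p * q:
                    d = pow(e, -1, (p - 1) * (q - 1))
                    return p, q, e, d
    return None

print(find_key(2, 5))
```

With $m=2$ and $c=5$ the search stops at $e=7$, because $2^7-5=123=3\cdot41$, giving $N=123$ and $d=23$.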

If $N$ is required to be large (for instance because $m$ is large; $c$ has less influence), it is hard to find $p$ and $q$ with a large enough product. But we can sometimes find more than two distinct large primes dividing $m^e-c$, and compute $N$ and $d$ as in multi-prime RSA; that improves our chances of finding an $(N,e)$ passing scrutiny (but does not answer the problem as worded, since it requires $N$ to be a bi-prime).

Addition: in the question as asked, there's no requirement that $c<N$ (that's normally part of RSA). When $m\ll c$, that allows trying the above with $e=1$, that is, factoring $c-m$, which is much smaller than $|m^e-c|$ for $e>1$. If we get distinct prime factors $p$ and $q$ with $m<p\,q$, that gives a solution. We can hide that we started from $e=1$ by adding a multiple of $\operatorname{lcm}(p-1,q-1)$ to $e$ and/or $d$.
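A minimal sketch of this $e=1$ shortcut, assuming $c>m$ and toy sizes where trial division is enough. The function names and the multiplier `k` are hypothetical; note that the resulting exponents satisfy $e\,d\equiv1$ modulo $\operatorname{lcm}(p-1,q-1)$, and that $m^e\bmod N$ equals $c\bmod N$ rather than $c$ itself, since $c>N$ here.

```python
from math import gcd
from itertools import combinations

def distinct_primes(n):
    """Distinct prime factors of n by trial division (toy sizes only)."""
    out, f = [], 2
    while f * f <= n:
        if n % f == 0:
            out.append(f)
            while n % f == 0:
                n //= f
        f += 1
    if n > 1:
        out.append(n)
    return out

def key_from_difference(m, c, k=1):
    """Pick primes p, q dividing c - m with m < p*q, then disguise e = 1."""
    for p, q in combinations(distinct_primes(c - m), 2):
        if m < p * q:
            lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)  # lcm(p-1, q-1)
            e = d = 1 + k * lam                           # both are 1 mod lam
            return p, q, e, d
    return None

print(key_from_difference(42, 757))
```

Here $757-42=715=5\cdot11\cdot13$, and the first pair with $m<p\,q$ is $(5,11)$, so $N=55$ and $e=d=21$.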

Late addition/hint: if we are allowed to use $e=\operatorname{lcm}(p-1,q-1)+1$ as hypothesized above, then at least one choice of $e$ not discussed above is worth consideration.

klugreuter · Answer 2 · 2020-04-28T14:51:39.750


Deliberately choosing $p$ and $q$ to be non-strong primes allows efficient computation of the discrete logarithm, therefore finding $e$.

The caveat here is that primitive roots only exist if $n=1,2,4,p^k, 2p^k$ - which requires one of $p$ or $q$ to be 2.

See below for a more sophisticated solution that doesn't have this restriction.

edited Apr 28 '20 at 14:51

answered Apr 26 '20 at 18:24


Pohlig–Hellman works better the weaker the primes are, which allows us to e.g. choose a prime succeeding a Hamming number. Feel free to challenge me by providing me with $m$ and $c$. – klugreuter Apr 26 '20 at 21:13
I don't get it. For a challenge, take $m$ as a bytestring of $k$ bytes each with value 0x33 and make $c=\left\lfloor\pi\,2^{8k-2}\right\rfloor$, which in hex is the first $2k$ characters there. Choose the largest $k$ that you are comfortable with ($k=40$ makes a lower $N$ than was reasonable 4 decades ago for an RSA modulus, but would already be a nice demo). – fgrieu Apr 26 '20 at 21:29

Okay, the best I can do is $m$: 80 bytes and $c$: 150 bytes, see here – klugreuter Apr 26 '20 at 22:38
I don't know what the asymptotic complexity is here, but for larger $c$ it takes much longer to find weak primes (probably because of my naive implementation), and a larger $m$ reduces the chance of the log existing. Runtime was ~1 min with unoptimised Python on my laptop. – klugreuter Apr 26 '20 at 22:43
I'm impressed: the values verify $d\,e\equiv1\pmod{(p-1)(q-1)}$ as asked, $p$ prime, $q$ prime, and $m^e\equiv c\pmod{p\,q}$. There is the problem that $c^d\not\equiv m\pmod{p\,q}$, thus this is not per the question, but I believe this might be because $q=2$, and fixable. Now I understand your approach: indeed you choose $p$ and solve for $e$ so that the equations hold modulo $p$ (using that $p-1$ is hyper-smooth), same for $q$, then find the final $e$. While $m$ is not quite as I suggested, it's close enough. – fgrieu Apr 26 '20 at 23:31

Here is another example, using $k=100$ for both $m$ and $c$ as requested. – klugreuter Apr 27 '20 at 13:03
That's perfect, congratulations! Sorry for initially messing up my check of your second submission. If that could be changed to make $p$ and $q$ distinct odd primes, and $p\,q$ exactly $8k$ bits, that method likely would also solve my related question/challenge. – fgrieu Apr 27 '20 at 14:24
I don't understand your "primitive roots only exist if $n=1,2,4,p^k, 2p^k$ - which requires one of $p$ or $q$ to be $2$". Is there anything that I do not get in my description of your idea? – fgrieu Apr 28 '20 at 10:14
It's just my understanding of primitive roots: If neither one of $p$ and $q$ is 2 then $m$ can't be a primitive root, decreasing the probability that $e$ exists to near zero. On the other hand if there exist primitive roots, then there exist many and $c$ has a high probability of being one. This is also what happens with my script if $q$ is unequal to 2: Pohlig-Hellman is still fast but can't find any $e$. – klugreuter Apr 28 '20 at 14:05

I hope that I address these issues by the checks in the second bullet of steps 2/3 in my answer, which are fast and (hopefully) enough to ensure that there is precisely one solution to the DLPs, always odd, and always combining into an odd $e$. Preliminary results seem to show that the first two tests are enough to eliminate most failures in the DLP, and with 40-bit numbers everything seems to work. – fgrieu Apr 28 '20 at 14:29
I now have working Python 3 code. Just follow the Try it online in Python 3 link in my revised answer. – fgrieu Apr 29 '20 at 17:44

fgrieu · Answer 3 · 2020-06-01T04:31:00.357

This answer develops the idea in klugreuter's answer.

Problem statement: Given $m>1$ and $c>1$ with $m\ne c$, generate an RSA key $(N,e,d)$, valid per PKCS#1 and most implementations of RSA, such that $c=m^e\bmod N$ and $m=c^d\bmod N$.

In a nutshell: we'll choose primes $p$ and $q$ so that $p-1$ and $q-1$ are suitably smooth, find $u=e\bmod(p-1)$ and $v=e\bmod(q-1)$ by solving Discrete Logarithm Problems thus made relatively easy, then combine $u$ and $v$ into $e$ using the Chinese Remainder Theorem with moduli $p-1$ and $q-1$.
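To make the "relatively easy" DLP concrete, here is a minimal Pohlig-Hellman sketch in Python. It assumes $g$ generates $\Bbb Z_p^*$ and that the factorization of $p-1$ is supplied by the caller; the brute-force search inside each subgroup of prime order stands in for baby-step/giant-step or Pollard's rho, so it is only suitable for small primes. The function name and the dictionary format for the factorization are illustrative choices.

```python
def pohlig_hellman(g, h, p, factors):
    """Solve g^x = h (mod p) for x in [0, p-1).

    Assumes g generates Z_p^* and `factors` maps each prime l dividing
    p-1 to its exponent k, i.e. p-1 = product of l**k.
    """
    n = p - 1
    x, mod = 0, 1
    for l, k in factors.items():
        pe = l ** k
        gi = pow(g, n // pe, p)           # element of order l^k
        hi = pow(h, n // pe, p)
        gamma = pow(gi, l ** (k - 1), p)  # element of order l
        xi = 0
        for j in range(k):                # recover x mod l^k one digit at a time
            hj = pow(hi * pow(gi, -xi, p), l ** (k - 1 - j), p)
            dj = next(t for t in range(l) if pow(gamma, t, p) == hj)
            xi += dj * l ** j
        # CRT step: merge x = xi (mod l^k) into the running solution
        x += mod * ((xi - x) * pow(mod, -1, pe) % pe)
        mod *= pe
    return x

# Toy check with p = 19, p - 1 = 2 * 3^2, generator 2:
print(pohlig_hellman(2, 13, 19, {2: 1, 3: 2}))  # 5, since 2^5 = 32 = 13 (mod 19)
```

The cost is dominated by the largest prime factor of $p-1$, which is why the construction wants every such factor below a moderate bound.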

We'll navigate around a number of possible pitfalls:

- The DLP finding $u$ such that $m^u\equiv c\pmod p$ must have a solution. We'll ensure this by keeping $p$ only when $m$ is a generator of $\Bbb Z_p^*$; same for finding $v$ such that $m^v\equiv c\pmod q$.
- $e$ must be odd, that is $u$ must be odd, so we'll generate $p$ so that $c$ is not a quadratic residue modulo $p$; same for $q$.
- $\gcd(e,p-1)=1$ must hold. We'll ensure this by rejecting $p$ when $\gcd(u,p-1)\ne1$; same for $q$.
- The system of equations $e\equiv u\pmod{p-1}$ and $e\equiv v\pmod{q-1}$ has solutions (to be found by the CRT) subject to the condition $u\equiv v\pmod{\gcd(p-1,q-1)}$. We'll ensure this by constructively generating $q$ so that $\gcd(p-1,q-1)=2$.
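The compatibility condition in the last point can be checked with a small general CRT helper for moduli that are not coprime. This is a sketch under the assumption of Python 3.8+; the name `crt_pair` and its argument order are hypothetical.

```python
from math import gcd

def crt_pair(u, a, v, b):
    """Smallest e >= 0 with e = u (mod a) and e = v (mod b), or None.

    A solution exists exactly when u = v (mod gcd(a, b)).
    """
    g = gcd(a, b)
    if (u - v) % g:
        return None
    # Write e = v + b*t and solve (b/g) * t = (u - v)/g  (mod a/g).
    t = ((u - v) // g * pow(b // g, -1, a // g)) % (a // g)
    return v + b * t

# Toy moduli p - 1 = 18 and q - 1 = 10, whose gcd is 2:
print(crt_pair(5, 18, 1, 10))  # 41: both residues are odd, so they agree mod 2
print(crt_pair(4, 18, 1, 10))  # None: 4 and 1 differ mod 2
```

With $\gcd(p-1,q-1)=2$ the condition reduces to $u$ and $v$ having the same parity, which the quadratic non-residue requirement already forces (both odd).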

The algorithm goes:

1. Decide appropriate intervals for $p$ and $q$. We want $p\,q>\max(m,c)$, $p>3$, $q>3$.
2. Construct a prime $p$ in the desired interval
   - as $p=2\,r+1$ with $r=\prod r_i$ where the $r_i$ are odd primes below bound $b$ (say $b=2^{20}$).
   - with $c^r\bmod p=p-1$, also $m^r\bmod p=p-1$ and $m^{(p-1)/r_i}\bmod p\ne1$ for each distinct $r_i$ (change one prime $r_i$ or its multiplicity to make another prime $p$ if that does not hold). Notice that the first two conditions imply that $p$ passes the strong pseudoprime test for bases $c$ and $m$, so they can serve as the first line of primality testing for $p$.
3. Find $u\in\big[1,p\big)$ with $c\equiv m^u\pmod p$, using the Pohlig-Hellman algorithm. If $\gcd(u,r)\ne1$, retry at 2. Notes:
   - Pohlig-Hellman will be acceptably fast thanks to the moderate $b$, even with Pollard's rho or baby-step/giant-step to solve the DLP for each $r_i$.
   - The way we selected $p$ ensures that there is precisely one solution $u$ (since $m\bmod p$ is a generator of the multiplicative group $\Bbb Z_p^*$), and that $u$ is odd (since $c\bmod p$ is a quadratic non-residue modulo $p$).
   - Rejecting $p$ when $\gcd(u,r)\ne1$ ensures that $\gcd(e,p-1)=1$ ultimately holds. Rejection is rare, and setting a minimum for the $r_i$ of step 2 lowers its probability further.
4. Construct a prime $q$ in the desired interval (possibly adjusted per $p$)
   - as $q=2\,s+1$ with $s=\prod s_i$ where the $s_i$ are odd primes below $b$ and $\gcd(r,s_i)=1$ (ensuring that $\gcd(p-1,q-1)=2$).
   - with $c^s\bmod q=q-1$, also $m^s\bmod q=q-1$ and $m^{(q-1)/s_i}\bmod q\ne1$ for each $s_i$ (change one prime $s_i$ or its multiplicity to make another prime $q$ if that does not hold).
5. Find $v\in\big[1,q\big)$ with $c\equiv m^v\pmod q$, as in 3. If $\gcd(v,s)\ne1$, retry at 4.
6. Compute public exponent $e\in\big[0,(p-1)(q-1)\big)$ with $u=e\bmod(p-1)$ and $v=e\bmod(q-1)$ per the CRT, e.g. as $e=(((q-1)^{-1}\bmod r)(u-v)\bmod r)\,(q-1)+v$. Notes:
   - By construction of $e$, it holds that $c\equiv m^e\pmod p$ and $c\equiv m^e\pmod q$.
   - $p$ and $q$ are coprime, thus $c\equiv m^e\pmod{p\,q}$.
   - $c<p\,q$, thus $c=m^e\bmod(p\,q)$, much to our satisfaction!
   - $\gcd(e,p-1)=1$ and $\gcd(e,q-1)=1$ always hold, thanks to the tests at 3 and 5. In particular, $e$ is odd.
   - $m<p\,q$ and $m\ne c$, thus $e\ne1$, thus $e\ge3$ since $e$ is odd.
   - $e<(p-1)\,(q-1)$, thus $e<p\,q$, as required.
7. Compute public modulus $N=p\,q$, a private exponent $d$ (the smallest possible one is $d=e^{-1}\bmod((p-1)(q-1)/2)$), and if needed other private key parameters $d_p=d\bmod(p-1)$, $d_q=d\bmod(q-1)$, $q_\text{inv}=q^{-1}\bmod p$ as usual.
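The acceptance checks of step 2 and the arithmetic of the last two steps can be sketched in Python on a toy instance. The primes $p=19$ ($r=3^2$) and $q=11$ ($s=5$), the messages $m=2$, $c=13$ and the logarithms $u=5$, $v=1$ are hypothetical values chosen by hand for illustration; the function names are assumptions, and the primality of $p$ and $q$ is taken for granted rather than tested.

```python
from math import gcd

def passes_step2_checks(p, odd_primes, m, c):
    """Step 2 conditions for p = 2r+1 (p assumed already known to be prime)."""
    r = (p - 1) // 2
    return (pow(c, r, p) == p - 1          # c is a quadratic non-residue
            and pow(m, r, p) == p - 1      # so is m
            and all(pow(m, (p - 1) // ri, p) != 1 for ri in odd_primes))

def finish_key(p, q, u, v):
    """Combine u, v into e, then derive the usual key parameters."""
    r = (p - 1) // 2
    e = ((pow(q - 1, -1, r) * (u - v)) % r) * (q - 1) + v
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    d = pow(e, -1, lam)                            # smallest private exponent
    return dict(N=p * q, e=e, d=d,
                dp=d % (p - 1), dq=d % (q - 1), qinv=pow(q, -1, p))

def crt_decrypt(c, p, q, key):
    """Standard CRT decryption using dp, dq and qinv."""
    mp = pow(c, key["dp"], p)
    mq = pow(c, key["dq"], q)
    h = (key["qinv"] * (mp - mq)) % p
    return mq + h * q

key = finish_key(19, 11, 5, 1)
print(key)                          # N=209, e=41, d=11, dp=11, dq=1, qinv=7
print(pow(2, key["e"], key["N"]))   # 13
print(crt_decrypt(13, 19, 11, key)) # 2
```

On this instance $2^{41}\bmod 209=13$ and $13^{11}\bmod 209=2$, so the toy key maps $m$ to $c$ and back.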

This is feasible for all common modulus sizes. The outcome should be accepted by most RSA implementations that do not enforce an upper limit on $e$.

Try it online in Python 3, solving this challenge for $k=64$ (512-bit modulus) in a few seconds.

If we additionally wanted the modulus to resist factorization, I think we would only need to randomize the choice of the $r_i$ and $s_i$, use a larger $b$, and make the two largest prime factors of each of $r$ and $s$ close enough to $b$, say $b>r_0>r_1>b/2$ and $b>s_0>s_1>b/2$. The latter is in order to resist some amount of Pollard's p-1. $b=2^{48}$ should prevent casual attacks. For larger/safer $b$, using a faster algorithm such as index calculus would be useful to solve the DLP within Pohlig-Hellman.
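To see why smooth $p-1$ is a concern, here is a minimal sketch of Pollard's p-1 in Python, run against a hypothetical toy modulus $209=19\cdot11$ (so $p-1=2\cdot3^2$ and $q-1=2\cdot5$). The base 2 and the small bound are illustrative assumptions; real attacks use much larger bounds and a second stage.

```python
from math import gcd

def pollard_p_minus_1(N, bound):
    """Return a nontrivial factor of N if some prime p | N has smooth p-1."""
    a = 2
    for k in range(2, bound + 1):
        a = pow(a, k, N)            # a = 2^(k!) mod N
        g = gcd(a - 1, N)
        if 1 < g < N:
            return g
    return None

print(pollard_p_minus_1(209, 10))   # 11, found once k! is a multiple of 10
```

The factor 11 appears at $k=5$ because $5!=120$ is a multiple of $q-1=10$ but not of $p-1=18$. Keeping two prime factors of $r$ and of $s$ near $b$ forces the attacker's bound up to about $b$.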

I don't see how the idea could be adapted for implementations that enforce an upper limit on $e$ (e.g. $e<2^{32}$, which used to be the case in a Windows API and sometimes remains enforced by some software, or $e<2^{256}$ as in FIPS 186-4).
