
Let's say we are given a classical RSA encryption scheme, but we would like to "reverse" the task:

Given two messages $c, m$ choose $p, q, e$ such that $p, q$ are prime and $c ^ d \equiv m\pmod N$ with $d \cdot e \equiv 1 \pmod{(p - 1)(q - 1)}$; $N = p \cdot q$.

How could one approach this problem? Is it even possible for a large $N$?

Also, does the problem become easier if we loosen the constraints of classic RSA, i.e. allow $\gcd(e, \operatorname{lcm}(p-1,q-1)) \neq 1$? In another question I read that it is possible to construct message collisions if the parameters are ill-formed, i.e. if the condition $\gcd(e, \operatorname{lcm}(p-1,q-1)) = 1$ is not satisfied. I think this is related, but I could not find out how such a collision could be achieved (or whether it is even feasible).

  • This question is likely related to an IT security competition, which published this exact problem on April 1st. Please do not cheat! – leoluk Apr 10 '20 at 20:50

3 Answers


Note: klugreuter's answer outlined a different approach to the problem, which I developed here. That obsoletes the following, except perhaps when we want a small $e$.


We can compute $|m^e-c|$ for various small odd $e>1$, and try to factor them (even partially). As soon as we get two distinct prime factors $p$ and $q$ of some $|m^e-c|$, with $\gcd(p-1,e)=1$, $\gcd(q-1,e)=1$ and $m<p\,q$ (and $c<p\,q$ if that's added to the problem statement), we can compute $N$ and a $d$ matching $e$ for this $N$ by one of the methods used in RSA, and that solves the problem. That's at least sometimes feasible by pulling out moderate factors using ECM factoring (e.g. using GMP-ECM).
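To make the search above concrete, here is a minimal sketch for toy sizes. It assumes plain trial division in place of ECM, restricts itself to odd primes, and requires Python 3.8+ for `pow(e, -1, n)`. The function names `small_odd_prime_factors` and `find_key_small_e` are made up for this illustration.

```python
from math import gcd

def small_odd_prime_factors(n, bound=10_000):
    """Distinct odd primes below `bound` dividing n (trial division, toy sizes only)."""
    if n == 0:
        return []
    while n % 2 == 0:          # discard the factor 2, we only keep odd primes
        n //= 2
    found, f = [], 3
    while f < bound and n > 1:
        if n % f == 0:         # f is prime here: smaller factors were divided out
            found.append(f)
            while n % f == 0:
                n //= f
        f += 2
    return found

def find_key_small_e(m, c, e_max=50, bound=10_000):
    """Try small odd e > 1; return (p, q, e, d) or None if nothing suitable is found."""
    for e in range(3, e_max, 2):
        primes = [p for p in small_odd_prime_factors(abs(m**e - c), bound)
                  if gcd(p - 1, e) == 1]
        for i, p in enumerate(primes):
            for q in primes[i + 1:]:
                if p * q > max(m, c):
                    d = pow(e, -1, (p - 1) * (q - 1))
                    return p, q, e, d
    return None
```

For example, `find_key_small_e(10, 65)` works with $e=3$ because $10^3-65=935=5\cdot11\cdot17$. For realistic sizes the trial division would have to be replaced by a real factoring tool, and success is not guaranteed.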

If $N$ is required to be large (including because $m$ is; $c$ has less influence), it is hard to find $p$ and $q$ with a large enough product. But we can sometimes find more than two distinct large primes dividing $m^e-c$, and compute $N$ and $d$ as in multi-prime RSA; that improves our chances of finding $(N,e)$ passing scrutiny (but does not answer the problem as worded, since it requires $N$ to be a bi-prime).

Addition: in the question as asked, there's no requirement that $c<N$ (that's normally part of RSA). When $m\ll c$, that allows trying the above with $e=1$, that is, factoring $c-m$, which is much smaller than the $|m^e-c|$ above. If we get distinct prime factors $p$ and $q$ with $m<p\,q$, that gives a solution. We can hide that we started from $e=1$ by adding a multiple of $\operatorname{lcm}(p-1,q-1)$ to $e$ and/or $d$.
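A toy illustration of the $e=1$ variant, under the assumption that the primes $p=101$ and $q=103$ were found as factors of $c-m$. All values are made up for the example, and `pow(e, -1, n)` needs Python 3.8+.

```python
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

# Assumed toy values: c - m = 7 * 101 * 103, so p and q both divide c - m.
p, q = 101, 103
m = 42
c = m + 7 * p * q              # c >= N, which the question as asked allows
N = p * q
phi = (p - 1) * (q - 1)
lam = lcm(p - 1, q - 1)

e = 1 + lam                    # disguises e = 1, since m^lam = 1 (mod N) for m coprime to N
d = pow(e, -1, phi)            # satisfies d*e = 1 (mod (p-1)(q-1)) as the question asks

print(pow(m, e, N) == c % N, pow(c, d, N) == m)
```

Here $d$ is also congruent to 1 modulo $\operatorname{lcm}(p-1,q-1)$, so it acts as the identity on messages, but it does not look like it at first sight.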

Late addition/hint: if we are allowed to use $e=\operatorname{lcm}(p-1,q-1)+1$ as hypothesized above, then at least one choice of $e$ not discussed above is worth consideration.

fgrieu

Deliberately choosing $p$ and $q$ to be non-strong primes allows efficient computation of the discrete logarithm, therefore finding $e$.

The caveat here is that primitive roots exist only for moduli $n=1,2,4,p^k,2p^k$ (with $p$ an odd prime), which requires one of $p$ or $q$ to be 2.

See below for a more sophisticated solution that doesn't have this restriction.

klugreuter
  • Pohlig–Hellman works better the weaker the primes are, which allows us to e.g. choose a prime succeeding a Hamming number. Feel free to challenge me by providing me with $m$ and $c$. – klugreuter Apr 26 '20 at 21:13
  • I don't get it. For a challenge, take $m$ as a bytestring of $k$ bytes each with value 0x33 and make $c=\left\lfloor\pi\,2^{8k-2}\right\rfloor$, which in hex is the first $2k$ characters there. Choose the largest $k$ that you are comfortable with ($k=40$ makes a lower $N$ than was reasonable 4 decades ago for an RSA modulus, but would already be a nice demo). – fgrieu Apr 26 '20 at 21:29
  • Okay best I can do is m: 80bytes and c: 150bytes, see here – klugreuter Apr 26 '20 at 22:38
  • I don't know what the asymptotic complexity here is, but larger c take much longer to find weak primes (probably because of my naive implementation) and larger m reduce the chance of the log existing. Runtime was ~1 min with unoptimised python on my laptop. – klugreuter Apr 26 '20 at 22:43
  • I'm impressed: the values verify $d\,e\equiv1\pmod{(p-1)(q-1)}$ as asked, $p$ prime, $q$ prime, and $m^e\equiv c\pmod{p\,q}$. There is the problem that $c^d\not\equiv m\pmod{p\,q}$ thus this is not per the question, but I believe this might be because $q=2$, and fixable. Now I understand your approach: indeed you choose $p$ and solve for $e$ so that the equations hold modulo $p$ (using that $p-1$ is hyper-smooth), same for $q$, then find the final $e$. While $m$ is not quite as I suggested, it's close enough. – fgrieu Apr 26 '20 at 23:31
  • Here is another example, using $k=100$ for both $m$ and $c$ as requested. – klugreuter Apr 27 '20 at 13:03
  • That's perfect, congratulations! Sorry for initially messing up my check of your second submission. If that could be changed to make $p$ and $q$ distinct odd primes, and $p\,q$ exactly $8k$ bits, that method likely would also solve my related question/challenge. – fgrieu Apr 27 '20 at 14:24
  • I don't understand your "primitive roots only exist if $n=1,2,4,p^k, 2p^k$ - which requires one of $p$ or $q$ to be $2$". Is there anything that I do not get in my description of your idea? – fgrieu Apr 28 '20 at 10:14
  • It's just my understanding of primitive roots: If neither one of $p$ and $q$ is 2 then $m$ can't be a primitive root, decreasing the probability that $e$ exists to near zero. On the other hand if there exist primitive roots, then there exist many and $c$ has a high probability of being one. This is also what happens with my script if $q$ is unequal to 2: Pohlig-Hellman is still fast but can't find any $e$. – klugreuter Apr 28 '20 at 14:05
  • I hope that I address these issues by the checks in the second bullet of steps 2/3 in my answer, which are fast and (hopefully) enough to ensure that there is precisely one solution to the DLPs, always odd, and always combining into an odd $e$. Preliminary results seem to show that the first two tests are enough to eliminate most failures in the DLP, and with 40-bit numbers everything seems to work. – fgrieu Apr 28 '20 at 14:29
  • I now have working Python 3 code. Just follow the Try it online in Python 3 link in my revised answer. – fgrieu Apr 29 '20 at 17:44

This answer develops the idea in klugreuter's answer.

Problem statement: Given $m>1$ and $c>1$ with $m\ne c$, generate an RSA key $(N,e,d)$, valid per PKCS#1 and most implementations of RSA, such that $c=m^e\bmod N$ and $m=c^d\bmod N$.

In a nutshell: we'll choose primes $p$ and $q$ so that $p-1$ and $q-1$ are suitably smooth, find $u=e\bmod(p-1)$ and $v=e\bmod(q-1)$ by solving Discrete Logarithm Problems thus made relatively easy, then combine $u$ and $v$ into $e$ using the Chinese Remainder Theorem with moduli $p-1$ and $q-1$.

We'll navigate around a number of possible pitfalls:

  • The DLP finding $u$ such that $m^u\equiv c\pmod p$ must have a solution. We'll ensure this by keeping $p$ only when $m$ is a generator of $\Bbb Z_p^*$; same for finding $v$ such that $m^v\equiv c\pmod q$.
  • $e$ must be odd, that is $u$ must be odd, so we'll generate $p$ so that $c$ is not a quadratic residue modulo $p$; same for $q$.
  • $\gcd(e,p-1)=1$ must hold. We'll ensure this by rejecting $p$ when $\gcd(u,p-1)\ne1$; same for $q$.
  • The system of equations $e\equiv u\pmod{p-1}$ and $e\equiv v\pmod{q-1}$ has solutions (to be found by the CRT) subject to the condition $u\equiv v\pmod{\gcd(p-1,q-1)}$. We'll ensure this by constructively generating $q$ so that $\gcd(p-1,q-1)=2$.
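The first two checks can be sketched as follows, assuming $p$ is an odd prime whose $p-1$ has known distinct odd prime factors. The helper names `is_generator` and `is_nonresidue` are hypothetical, not taken from the linked code.

```python
def is_nonresidue(c, p):
    """Euler's criterion: True if c is a quadratic non-residue modulo the odd prime p."""
    return pow(c, (p - 1) // 2, p) == p - 1

def is_generator(m, p, odd_prime_factors):
    """True if m generates Z_p^*, given the distinct odd prime factors of p - 1."""
    if not is_nonresidue(m, p):        # covers the prime factor 2 of p - 1
        return False
    # m must not fall into any proper subgroup of index r
    return all(pow(m, (p - 1) // r, p) != 1 for r in odd_prime_factors)
```

For instance, with $p=23$ (so $p-1=2\cdot11$), $m=5$ is a generator and $c=7$ is a non-residue, and the unique solution of $5^u\equiv7\pmod{23}$ is $u=19$, which is odd and coprime to $22$ as required.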

The algorithm goes:

  1. Decide appropriate intervals for $p$ and $q$. We want $p\,q>\max(m,c)$, $p>3$, $q>3$.
  2. Construct a prime $p$ in the desired interval
    • as $p=2\,r+1$ with $r=\prod r_i$ where $r_i<b$ are odd primes below bound $b$ (say $b=2^{20}$).
    • with $c^r\bmod p=p-1$, also $m^r\bmod p=p-1$ and $m^{(p-1)/r_i}\bmod p\ne1$ for each distinct $r_i$ (change one prime $r_i$ or its multiplicity to make another prime $p$ if that does not hold). Notice that the first two conditions imply that $p$ passes the strong pseudoprime test to bases $c$ and $m$; they can thus serve as the first line of primality testing for $p$.
  3. Find $u\in\big[1,p\big)$ with $c\equiv m^u\pmod p$, using the Pohlig-Hellman algorithm. If $\gcd(u,r)\ne1$, retry at 2. Notes:
    • Pohlig-Hellman will be acceptably fast thanks to the moderate $b$, even with Pollard's rho or baby-step/giant-step to solve the DLP for each $r_i$.
    • The way we selected $p$ ensures that there is precisely one solution $u$ (since $m\bmod p$ is a generator of the multiplicative group $\Bbb Z_p^*$), and that $u$ is odd (since $c\bmod p$ is a quadratic non-residue modulo $p$).
    • Rejecting $p$ when $\gcd(u,r)\ne1$ ensures that $\gcd(e,p-1)=1$ ultimately holds. Rejection is rare, and setting a minimum for the $r_i$ of step 2 lowers its probability further.
  4. Construct a prime $q$ in the desired interval (possibly adjusted per $p$)
    • as $q=2\,s+1$ with $s=\prod s_i$ where $s_i<b$ are odd primes below $b$ and $\gcd(r,s_i)=1$ (ensuring that $\gcd(p-1,q-1)=2$).
    • with $c^s\bmod q=q-1$, also $m^s\bmod q=q-1$ and $m^{(q-1)/s_i}\bmod q\ne1$ for each $s_i$ (change one prime $s_i$ or its multiplicity to make another prime $q$ if that does not hold).
  5. Find $v\in\big[1,q\big)$ with $c\equiv m^v\pmod q$, as in 3. If $\gcd(v,s)\ne1$, retry at 4.
  6. Compute public exponent $e\in\big[0,(p-1)(q-1)\big)$ with $u=e\bmod(p-1)$ and $v=e\bmod(q-1)$ per the CRT, e.g. as $e=(((q-1)^{-1}\bmod r)(u-v)\bmod r)\,(q-1)+v$. Notes:
    • By construction of $e$, it holds $c\equiv m^e\pmod p$ and $c\equiv m^e\pmod q$.
    • $p$ and $q$ are coprime, thus $c\equiv m^e\pmod{p\,q}$
    • $c<p\,q$, thus $c=m^e\bmod(p\,q)$, much to our satisfaction!
    • $\gcd(e,p-1)=1$ and $\gcd(e,q-1)=1$ always hold, thanks to tests at 3 and 5. In particular, $e$ is odd.
    • $m<p\,q$ and $m\ne c$, thus $e\ne1$, thus $e\ge3$ since $e$ is odd.
    • $e<(p-1)\,(q-1)$, thus $e<p\,q$, as required.
  7. Compute public modulus $N=p\,q$, a private exponent $d$ (the smallest possible one is $d=e^{-1}\bmod((p-1)(q-1)/2)$ ), and if needed other private key parameters $d_p=d\bmod(p-1)$, $d_q=d\bmod(q-1)$, $q_\text{inv}=q^{-1}\bmod p$ as usual.
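The steps above can be sketched end to end for toy sizes. This is not the linked program but a simplified illustration under several assumptions: $r$ and $s$ are squarefree products drawn from a small fixed pool (no multiplicities), primality is checked by trial division, and each subgroup DLP is solved by brute force rather than Pollard's rho or baby-step/giant-step. All function names are hypothetical, and `pow(x, -1, n)` needs Python 3.8+.

```python
from itertools import combinations
from math import gcd, isqrt, prod

SMALL_PRIMES = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

def is_prime(n):
    """Trial division; adequate only for the toy sizes used here."""
    if n < 2:
        return False
    return all(n % f for f in range(2, isqrt(n) + 1))

def crt(residues, moduli):
    """Combine x = residues[i] (mod moduli[i]) for pairwise coprime moduli."""
    x, mod = 0, 1
    for a, n in zip(residues, moduli):
        t = (a - x) * pow(mod, -1, n) % n
        x, mod = x + mod * t, mod * n
    return x

def dlog_smooth(m, c, p, factors):
    """Pohlig-Hellman for squarefree p - 1 = prod(factors); brute force per factor."""
    residues = []
    for l in factors:
        g, h = pow(m, (p - 1) // l, p), pow(c, (p - 1) // l, p)
        residues.append(next(x for x in range(l) if pow(g, x, p) == h))
    return crt(residues, factors)

def smooth_prime(m, c, lo, avoid=()):
    """Steps 2-3 (or 4-5): find prime p = 2r + 1 > lo and u with m^u = c (mod p)."""
    pool = [l for l in SMALL_PRIMES if l not in avoid]
    for size in range(1, len(pool) + 1):
        for rs in combinations(pool, size):
            r = prod(rs)
            p = 2 * r + 1
            if p <= lo or not is_prime(p):
                continue
            if pow(c, r, p) != p - 1 or pow(m, r, p) != p - 1:
                continue                      # c and m must be non-residues
            if any(pow(m, (p - 1) // l, p) == 1 for l in rs):
                continue                      # m must generate Z_p^*
            u = dlog_smooth(m, c, p, [2, *rs])
            if gcd(u, r) == 1:
                return p, rs, u
    raise ValueError("no suitable prime in the toy pool")

def forge_key(m, c):
    """Return (N, e, d) with c = m^e mod N and m = c^d mod N, for small m != c."""
    lo = max(3, isqrt(max(m, c)) + 1)         # guarantees p*q > max(m, c)
    p, rs, u = smooth_prime(m, c, lo)
    q, _, v = smooth_prime(m, c, lo, avoid=rs)  # disjoint pools: gcd(p-1, q-1) = 2
    r = (p - 1) // 2
    e = ((pow(q - 1, -1, r) * (u - v)) % r) * (q - 1) + v   # CRT formula of step 6
    d = pow(e, -1, (p - 1) * (q - 1) // 2)
    return p * q, e, d
```

Calling `forge_key(5, 7)` starts by selecting $p=23$ with $u=19$. Inputs that are perfect squares cannot work with this construction, since a square is never a generator nor a non-residue.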

This is feasible for all common modulus sizes. The outcome should be accepted by most RSA implementations that do not enforce an upper limit on $e$.

Try it online in Python 3, solving this challenge for $k=64$ (512-bit modulus) in a few seconds.

If we additionally wanted the modulus to resist factorization, I only see that we need to randomize the choice of $r_i$ and $s_i$, use a larger $b$, and make the two largest prime factors of each of $r$ and $s$ close enough to $b$, say $b>r_0>r_1>b/2$ and $b>s_0>s_1>b/2$. The latter is in order to resist some amount of Pollard's p-1. $b=2^{48}$ should prevent casual attacks. For larger/safer $b$, using a faster algorithm such as index calculus would be useful to solve the DLP within Pohlig-Hellman.
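To illustrate why smooth $p-1$ is a concern, here is a minimal sketch of the first stage of Pollard's p-1 in a simple textbook form, with a hypothetical function name and toy modulus. It finds a prime factor $p$ of $N$ as soon as the order of the base modulo $p$ divides $k!$ for some small $k$.

```python
from math import gcd

def pollard_p_minus_1(N, bound, base=2):
    """Return a nontrivial factor of N, or None if none is found up to `bound`."""
    a = base
    for k in range(2, bound + 1):
        a = pow(a, k, N)              # a = base^(k!) mod N
        g = gcd(a - 1, N)
        if 1 < g < N:
            return g
    return None

# Toy modulus built from p = 2*11 + 1 and q = 2*23 + 1
print(pollard_p_minus_1(23 * 47, 50))   # prints 23
```

With the largest prime factors of $r$ and $s$ near $b$, such a search needs a bound on the order of $b$, which is what the choice of a large $b$ is meant to make expensive.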

I don't see how the idea could be adapted for implementations that enforce an upper limit on $e$ (e.g. $e<2^{32}$, which used to be the case in a Windows API and sometimes remains enforced by some software, or $e<2^{256}$ as in FIPS 186-4).

fgrieu