Why when I perform RSA 5 times do I get my original input back out?

Question

If I perform RSA on my input 5 times, I get the original input back out.
Why does iterating RSA encryption just a few times so often yield the original value again when the public modulus $N$ is small?

Welcome to Cryptography. Instead of a mystical question, could you provide your public key $(n,e)$ and the message as an example? — kelalaka, May 19 '20 at 22:42
Almost certainly, you are using toy parameters (tiny $n$ value); that doesn't happen with realistic key sizes — poncho, May 20 '20 at 02:58

fgrieu · Answer 1 · 2020-05-21T22:44:35.833

I read the question as:

Why is it that often, iterating RSA encryption just a few times cycles back to the original value, when the public modulus $N$ is small?

With $N$ square-free and $\gcd(e,\varphi(N))=1$, textbook RSA encryption $$\begin{align}E: [0,N)&\to[0,N)\\ x&\mapsto x^e\bmod N\end{align}$$ is a bijection, equivalently a permutation of the set $[0,N)$.

By the pigeonhole principle, cycling any permutation of a finite set will loop back to the starting point after a number of iterations bounded by the set size $N$. It is easy to show that for a random permutation and a random starting point, the probability of cycling on or before $i$ steps is exactly $i/N$.
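For completeness, here is one way to see the $i/N$ claim; it is a sketch, with $\pi$ standing for the uniformly random permutation and $k$ for the exact cycle length (both names introduced here for illustration). Following $x,\pi(x),\pi^2(x),\dots$, each new image is uniform among the values not yet used, so the cycle closes at exactly step $k$ with probability $$\begin{align}\Pr[\text{length}=k]&=\frac{N-1}{N}\cdot\frac{N-2}{N-1}\cdots\frac{N-k+1}{N-k+2}\cdot\frac{1}{N-k+1}=\frac1N,\\ \Pr[\text{length}\le i]&=\sum_{k=1}^{i}\frac1N=\frac iN.\end{align}$$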

But if for example we take $e=37$ and $N=13333=67\cdot199$ then indeed we observe that starting from most $x$ we cycle after just $5$ iterations (and for some $x$, e.g. $x=937$, that's down to $1$). This calls for explanation.
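The observation can be reproduced with a few lines of Python. This is a minimal sketch using the toy parameters above; the function names `rsa_encrypt` and `cycle_length` are just illustrative choices.

```python
def rsa_encrypt(x, e, n):
    """Textbook RSA encryption: x^e mod n."""
    return pow(x, e, n)

def cycle_length(x, e, n):
    """Number of encryptions needed to come back to x for the first time.

    Terminates because textbook RSA is a permutation of [0, n)
    when n is square-free and gcd(e, phi(n)) == 1.
    """
    y = rsa_encrypt(x, e, n)
    steps = 1
    while y != x:
        y = rsa_encrypt(y, e, n)
        steps += 1
    return steps

E, N = 37, 13333  # toy parameters from the example, N = 67 * 199

# Tally how many starting points x have each cycle length.
counts = {}
for x in range(N):
    k = cycle_length(x, E, N)
    counts[k] = counts.get(k, 0) + 1

print(counts)
print(cycle_length(937, E, N))
```

Only cycle lengths $1$ and $5$ should show up in the tally, with length $5$ by far the most common.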

Restricting to $N=p\,q$, define $E_p(x)=E(x)\bmod p=x^e\bmod p$, and $E_q(x)=E(x)\bmod q=x^e\bmod q$. By the Chinese Remainder Theorem, the cycle length of $E$ starting from $x$ is the Least Common Multiple of the cycle length of $E_p$ starting from $x\bmod p$, and the cycle length of $E_q$ starting from $x\bmod q$.

When cycling $E_p$ for $k$ iterations starting from $x$, we reach $\displaystyle x^{\left(e^k\bmod\left(p-1\right)\right)}\bmod p$ (by Fermat's little theorem when $p$ does not divide $x$; when it does, we stay at $0$). It follows that whatever the starting $x$, we always are back (though not necessarily first back) to the starting point after making a number of iterations equal to the order of $e$ in the multiplicative group $\Bbb Z_{p-1}^*$ (notice that $\gcd(e,\varphi(N))=1$ ensures $\gcd(e,p-1)=1$ and thus that $e$ belongs to that group).

In our example $e=37$, $p=67$, $q=199$, and it happens that $e^5\bmod(p-1)=1$ and $e^5\bmod(q-1)=1$, hence both $E_p$ and $E_q$ cycle after $5$ steps (or just $1$, depending on $x$), and $E$ inherits that property.
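To check these orders, a brute-force computation is enough at this size. The sketch below assumes nothing beyond the example's numbers; `multiplicative_order` is a hypothetical helper name, and the loop is only sensible for toy moduli.

```python
from math import gcd

def multiplicative_order(a, m):
    """Smallest k >= 1 with a^k = 1 (mod m). Requires gcd(a, m) == 1."""
    if gcd(a, m) != 1:
        raise ValueError("a must be invertible modulo m")
    if m == 1:
        return 1
    k, t = 1, a % m
    while t != 1:
        t = (t * a) % m
        k += 1
    return k

e, p, q = 37, 67, 199

ord_p = multiplicative_order(e, p - 1)  # bound on the cycle length of E_p
ord_q = multiplicative_order(e, q - 1)  # bound on the cycle length of E_q

# By the CRT argument, every cycle length of E divides this LCM.
bound = ord_p * ord_q // gcd(ord_p, ord_q)

print(ord_p, ord_q, bound)
```

Since the actual cycle length from a given $x$ must divide `bound`, and $5$ is prime, each $x$ cycles after either $1$ or $5$ steps.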

But why are short cycles relatively common for moderate $N$?

Define the decomposition of $p-1$ (resp. $q-1$) into prime factors to be $p-1=\prod p_i^{\alpha_i}$ (resp. $q-1=\prod q_i^{\beta_i}$). The order of any element of $\Bbb Z_{p-1}^*$ divides the Least Common Multiple of the $(p_i-1)\,p_i^{\alpha_i-1}$. Thus when iterating $E$, we always are back (though not necessarily first back) to the starting point after making a number of steps equal to the Least Common Multiple $\ell(N)$ of the $(p_i-1)\,p_i^{\alpha_i-1}$ and $(q_i-1)\,q_i^{\beta_i-1}$. I'm still trying to track down the name of this $\ell(N)=\operatorname{lcm}(\lambda(p-1),\lambda(q-1))$ (where $\lambda$ is the Carmichael function used for computing the lowest possible private exponent $d=e^{-1}\bmod\lambda(N)$ in RSA, but here applied to $p-1$ and $q-1$).

Even without a name, it can be computed and graphed. The $p_i-1$ and $q_i-1$ are even and composite as soon as $p_i$ or $q_i$ is at least $5$ (otherwise they are $1$ or $2$), which increases the potential for common factors. Due to this effect, $\ell(N)$ often ends up with sizably fewer bits than $N$. For a particular $e$, some further factors of $\ell(N)$ may disappear. This explains the phenomenon.

In our example $p-1=66=2\cdot3\cdot11$, $q-1=198=2\cdot3^2\cdot11$, thus $\lambda(p-1)=2\cdot5$, $\lambda(q-1)=2\cdot3\cdot5$, hence $\ell(N)=2\cdot3\cdot5=30$, and for any $e$ the maximum cycle length must be a divisor of that. It happens that the choice of $e$ further reduces that maximum cycle length to $5$.
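The numbers in this example can be recomputed with a small script. This is a sketch under stated assumptions: `factorize`, `carmichael` and `ell` are illustrative names, factoring is done by trial division (fine only for toy sizes), and `carmichael` uses the standard definition of $\lambda$, including its special case for powers of $2$.

```python
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def factorize(n):
    """Trial-division factorization, returned as {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def carmichael(n):
    """Carmichael function lambda(n): exponent of the group Z_n^*."""
    result = 1
    for prime, exp in factorize(n).items():
        if prime == 2 and exp >= 3:
            term = 2 ** (exp - 2)          # lambda(2^k) = 2^(k-2) for k >= 3
        else:
            term = (prime - 1) * prime ** (exp - 1)
        result = lcm(result, term)
    return result

def ell(p, q):
    """lcm(lambda(p-1), lambda(q-1)): a multiple of every cycle length of E."""
    return lcm(carmichael(p - 1), carmichael(q - 1))

p, q = 67, 199
print(carmichael(p - 1), carmichael(q - 1), ell(p, q))
```

For any valid $e$, `pow(e, ell(p, q), p - 1)` and `pow(e, ell(p, q), q - 1)` should both be $1$, which is what bounds the cycle length.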

Being more rigorous is possible. But a convincing meta-argument that we need not worry about cycles in RSA encryption from a security perspective when $N$ is thousands of bits long is this: if short cycles could be found, that would be a great way to factor $N$, and experience shows that it is not.

More precisely: if we could find one $x$ and compute enough $x_{i+1}=E(x_i)$ starting from $x_0=x$ that we reach¹ $x_k=x$ for $k>1$, then computing $\gcd(N,x-x_i)$ would² have factored $N$ well before: as soon as $i$ reached the point where $E_p$ or $E_q$ first cycled. That's a passable factoring algorithm, but³ it's less efficient than GNFS or ECM, and even Pollard's rho.
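The factoring idea can be sketched on the toy modulus. Note the assumptions: `cycling_factor` is an illustrative name, and the demonstration uses $e=5$, which is my choice and not from the example above, because with $e=37$ both $E_p$ and $E_q$ cycle at the same step for most $x$, so the gcd jumps straight to $N$.

```python
from math import gcd

def cycling_factor(n, e, x, max_steps=10_000):
    """Iterate x_{i+1} = x_i^e mod n and test gcd(n, x - x_i) at each step.

    Returns (factor, step) when a nontrivial factor appears, or None if
    the full cycle closes first (gcd == n) or max_steps is exhausted.
    """
    x_i = x
    for i in range(1, max_steps + 1):
        x_i = pow(x_i, e, n)
        g = gcd(n, x - x_i)
        if 1 < g < n:
            return g, i       # cycled modulo one prime factor only
        if g == n:
            return None       # x_i == x: cycled modulo both at once
    return None

N = 13333  # 67 * 199, toy modulus from the example

print(cycling_factor(N, 5, 2))    # e = 5 is an assumed exponent for the demo
print(cycling_factor(N, 37, 2))   # the example's exponent
```

With $e=5$ and $x=2$, the cycle modulo $67$ closes before the one modulo $199$, so the gcd exposes a factor. With $e=37$ the two cycles close together and this starting point reveals nothing.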

Reference: Ronald L. Rivest, Robert D. Silverman, Are 'Strong' Primes Needed for RSA?

¹ Thus deciphering $x$ by taking $x_{k-1}$. That would be a threat to RSA encryption: the so-called cycling attack.

² With overwhelming likelihood, since there's no reason³ for $E_p$ and $E_q$ to first cycle simultaneously.

³ On heuristic grounds well verified in experimental factorization.
