
I am new to cryptography and encountered the discrete log problem. Given a generator $g$, a prime $p$, and an integer $b$, calculate $x$ such that:

$g^x \equiv b\mod{p}$

I have read that such a problem has been solved for a $240$-digit prime. So I am wondering how much time would be required to solve the following problem:

$p = 2^{256} - 2^{32} - 2^9 - 2^8 - 2^7 - 2^6 - 2^4 - 1$

$g = 5$

$b = 106295707471159281529219833497106423491956844928470854873943432222392415336112$

We need to find $x$ such that:

$5^x \equiv b \mod{p}$

The answer for $x$ is:

$x = 6262672636363683838373783537373356788$

Additionally, the factorization of $p - 1$ is:

$2×3×7×13441×205115282021455665897114700593932402728804164701536103180137503955397371$
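
For reference, these values can be sanity-checked with Python's built-in modular exponentiation (a quick sketch, using the numbers exactly as given above):

```python
p = 2**256 - 2**32 - 2**9 - 2**8 - 2**7 - 2**6 - 2**4 - 1
g = 5
b = 106295707471159281529219833497106423491956844928470854873943432222392415336112
x = 6262672636363683838373783537373356788
print(pow(g, x, p) == b)   # should print True if x above is indeed the discrete log
```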

esteregg
  • https://en.wikipedia.org/wiki/Discrete_logarithm_records – kelalaka Dec 20 '23 at 12:24
  • @kelalaka I know about the record. But my question is: how long, and with which algorithm? – esteregg Dec 20 '23 at 12:25
  • 1
    @esteregg "how long and which algorithm", apart from these two, you also need to specify the hardware for an actual estimate. And to make it useful, you better ask it in terms of a single-core CPU, so that we can extrapolate the answer for other hardwares that others may be interested in. – DannyNiu Dec 20 '23 at 12:35
  • The effort and best method depend on characteristics of $p$ beyond being a prime of a given size. E.g. the problem is easy if $p-1$ happens to have no large prime factor, which is not totally unlikely. It's also possible that a poor choice of $g$ makes the problem easier. And as pointed out in the comment above, the time depends on what computing means are used. – fgrieu Dec 20 '23 at 12:36
  • @DannyNiu i5-1235U is the CPU. – esteregg Dec 20 '23 at 12:38
  • @fgrieu Sorry, I am not asking about any specific $p$; I am asking generally. But let us assume it is a safe prime. – esteregg Dec 20 '23 at 13:07
  • Why don't you just start from a small example to estimate timing? Well, I did that for RSA a long time ago. – kelalaka Dec 20 '23 at 13:27
  • @kelalaka Good point. I have added an example. – esteregg Dec 20 '23 at 14:06
  • 1
    Takes around 10 minutes with CADO-NFS on my laptop. – Samuel Neves Dec 21 '23 at 06:06

1 Answer


The effort and best method to solve for $x$ the Discrete Logarithm Problem $g^x \equiv b\pmod p$ for prime $p$ depend on characteristics of $p$ beyond being a prime of a given size, and on what we know or can guess about $x$, possibly by way of tests on $b$.

In the first part of this answer we consider only prime $p$ of 256-bit order of magnitude, as in revision 4 of the question, and assume $x$, $g$ and $b$ are unremarkable.

In particular, the size of the largest prime factor $q$ of $p-1$ is critical to the applicability of the Pohlig–Hellman algorithm, which generally has cost dominated by roughly $2\sqrt q$ modular multiplications (with Baby-Step/Giant-Step to find $x\bmod q$; a little more with Pollard's rho). The probability that the largest prime factor of $p-1$ is less than $q$, for a random prime $p$, is roughly $\rho(\ln(p-1)/\ln(q))$ where $\rho$ is the Dickman function. E.g. $\rho(4)=0.0049\ldots$ tells us there is roughly one chance in 200 that, for a random 256-bit prime $p$, the largest prime factor of $p-1$ is less than 64 bits, which would make Pohlig–Hellman worth consideration. And because it's relatively easy to screen $p$ for that condition, if we wanted to break one of thousands of DLPs $g^x\equiv b\pmod p$ with random $p$, we could pick the one on which to concentrate our efforts.
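
For a concrete feel of that probability, here is a minimal numerical sketch (my own helper, with an untuned step count) that approximates Dickman's $\rho$ from its delay differential equation $u\,\rho'(u)=-\rho(u-1)$ with $\rho=1$ on $[0,1]$; it should reproduce the $\rho(4)\approx0.0049$ figure above:

```python
# Numerical approximation of Dickman's rho via u*rho'(u) = -rho(u-1),
# with rho(u) = 1 on [0, 1]. Step count is a rough choice; this is a sketch.
def dickman_rho(u, steps_per_unit=10_000):
    if u <= 1:
        return 1.0
    h = 1.0 / steps_per_unit
    n = round(u * steps_per_unit)
    grid = [1.0] * (n + 1)                  # grid[k] approximates rho(k*h)
    for k in range(steps_per_unit + 1, n + 1):
        t = k * h
        # trapezoidal step for rho'(s) = -rho(s-1)/s over [t-h, t]
        f_prev = grid[k - 1 - steps_per_unit] / (t - h)
        f_curr = grid[k - steps_per_unit] / t
        grid[k] = grid[k - 1] - h * (f_prev + f_curr) / 2
    return grid[n]

print(dickman_rho(2))   # ~0.3069 = 1 - ln 2
print(dickman_rho(4))   # ~0.0049, the "one chance in 200" figure above
```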

If $p$ is a safe prime, or if otherwise $p-1$ does not have a particularly small largest prime factor, and the $x$ to be found has no special characteristic (e.g. being small), Pohlig–Hellman is not of much help, and above some size limit (most probably for $p$ larger than 128 bits) the algorithms of choice become Index Calculus, then the DLP variant of the Number Field Sieve.

For general $p$, the cost of GNFS (the algorithm used in the 795-bit record) is (see L-notation) $$\exp\Biggl(\left(\sqrt[3]{\frac{64}9}+o(1)\right)(\ln p)^{\frac13}(\ln\ln p)^{\frac23}\Biggr)=L_p\left[\frac13,\sqrt[3]{\frac{64}9}\,\right]$$ For some rare $p$ (including, I think, the 78-digit safe primes $2^{256}-36113$ and $2^{256}+230191$), SNFS is applicable, with cost $L_p\left[\frac13,\sqrt[3]{\frac{32}9}\,\right]$. Index Calculus has cost $L_p\left[\frac12,\sqrt2\,\right]$, and is much easier to code.
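
For a rough feel of these formulas (still ignoring the $o(1)$), here is a short Python sketch evaluating the base-2 logarithm of each cost; the 256-bit and 795-bit values should match the 61.8 / 46.7 / 37.0 and 77.7 figures quoted in the comments below:

```python
from math import log

def log2_L(bits, a, c):
    """Base-2 log of L_p[a, c] = exp(c * (ln p)^a * (ln ln p)^(1-a)), for p ~ 2^bits."""
    ln_p = bits * log(2)
    return c * ln_p**a * log(ln_p)**(1 - a) / log(2)

for bits in (256, 795):
    ic   = log2_L(bits, 1/2, 2 ** 0.5)            # Index Calculus
    gnfs = log2_L(bits, 1/3, (64 / 9) ** (1/3))   # GNFS
    snfs = log2_L(bits, 1/3, (32 / 9) ** (1/3))   # SNFS
    print(bits, round(ic, 1), round(gnfs, 1), round(snfs, 1))
# e.g. 256 -> 61.8, 46.7, 37.0 and 795 -> 77.7 for GNFS, as in the comments
```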

Ignoring the $o(1)$ because we lack data about it, here is a plot of the base-2 logarithm of these quantities as a function of the bit size of $p$. Be aware that, at the very least, the curves are offset vertically by a considerable amount, and the data on the left is especially unreliable, with Index Calculus in a better position than depicted.

[Figure: difficulty (base-2 log of cost) of Index Calculus, GNFS and SNFS versus bit size of $p$]

For GNFS, we get that a 256-bit $p$ is approximately $2^{24.6}$ (25 million) times easier than the record 795-bit one. This is to be taken with a ton of salt: it could just as well be 10 million or 1 million. Still, MUCH easier. So instead of 3000 core⋅years, we are talking a few core⋅hours. The effort will be dominated by getting the code running. And it's possible that Index Calculus is better from this standpoint.


The question is for $p$ the largest prime less than $2^{256}-2^{32}$, used as the field order in secp256k1 for reasons discussed here. $p-1$ has a large prime factor, thus Pohlig–Hellman would not be useful for a random $x$. However, our $x$ is enormously smaller than would be expected for a random $x\in[1,p)$: we have a 123-bit $x$, which has probability $<2^{-133}$, so this can't be accidental.
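
A quick check of that bit count, using the $x$ from the question:

```python
x = 6262672636363683838373783537373356788
print(x.bit_length())   # 123; a random x in [1, p) is this small with
                        # probability about 2**(123 - 256) = 2**-133
```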

So we tackle the problem: knowing the 256-bit $p$, the rather typical factorization of $p-1$, $g=5$, $b=g^x\bmod p$, and that $0<x<2^{123}$ (or some similar upper bound that can't occur by chance), how do we find $x$?

Here, the problem is such that $g$ is a generator; that is, for each prime $r_i$ dividing $p-1$, it holds that $g^{(p-1)/r_i}\bmod p\ne1$. This means we can find $x$ modulo each $r_i$.

For each $i$, we can find $x_i=x\bmod r_i$ by computing $g_i=g^{(p-1)/r_i}\bmod p$ and $b_i=b^{(p-1)/r_i}\bmod p$, which are such that ${g_i}^x\bmod p=b_i$. By Fermat's Little Theorem, $g^{p-1}\bmod p=1$, thus ${g_i}^{x_i}\bmod p=b_i$.

For small $r_i$ (here $r_0=2$, $r_1=3$, $r_2=7$), we can find $x_i$ by trying at most $r_i$ values, with one modular multiplication each. For a medium $r_i$ (here $r_3=13441$), we have a choice between enumeration, which is workable, and Baby-Step/Giant-Step (see the sketch below), which reduces the core of the search to about $2\lceil\sqrt{r_i}\rceil=232$ modular multiplications.
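
For illustration, here is a generic Baby-Step/Giant-Step sketch for that subgroup setting (the names $g_i$, $b_i$, $r$ are as in the text; it needs Python 3.8+ for the modular inverse via `pow`), using about $2\lceil\sqrt r\rceil$ modular multiplications plus a dictionary lookup:

```python
from math import isqrt

def bsgs(gi, bi, r, p):
    """Find e in [0, r) with gi**e == bi (mod p), where gi has order dividing r."""
    m = isqrt(r - 1) + 1                 # ceil(sqrt(r))
    baby = {}                            # gi^j mod p  ->  j, for j = 0 .. m-1
    t = 1
    for j in range(m):
        baby.setdefault(t, j)
        t = t * gi % p
    giant = pow(gi, -m, p)               # gi^(-m) mod p (Python 3.8+)
    y = bi
    for i in range(m):                   # y = bi * gi^(-i*m); a hit means e = i*m + j
        if y in baby:
            return (i * m + baby[y]) % r
        y = y * giant % p
    return None                          # bi is not in the subgroup generated by gi
```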

So we find $x_i=x\bmod r_i$ (here $x_0=0$, $x_1=1$, $x_2=3$, $x_3=2861$) for the small $r_i$ dividing $p-1$ (here $r_0=2$, $r_1=3$, $r_2=7$, $r_3=13441$). By the Chinese Remainder Theorem, that gives us the value of $x\bmod(r_0\,r_1\,r_2\,r_3)$, that is $x\bmod564522=70066$. We can now define the (unknown) $x'$ such that $x=564522\,x'+70066$, compute $g'=g^{564522}\bmod p$ and $b'=g^{-70066}\,b\bmod p$, and we have reduced our problem to finding $x'$ such that ${g'}^{x'}\bmod p=b'$, with a much smaller $x'$ (reduced from 123 to 104 bits). See this for the extension to $p-1$ having small primes with multiplicity in its factorization.
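
A short Python sketch of this small-factor Pohlig–Hellman step and the CRT recombination (values as in the question; brute force is used even for $r_3=13441$, which is cheap). If everything is as stated above, the printed residues should match $x_0,\dots,x_3$ and $x\bmod564522=70066$:

```python
p = 2**256 - 2**32 - 977     # = 2^256 - 2^32 - 2^9 - 2^8 - 2^7 - 2^6 - 2^4 - 1
g = 5
b = 106295707471159281529219833497106423491956844928470854873943432222392415336112
small_r = [2, 3, 7, 13441]   # small prime factors of p - 1

residues = []
for r in small_r:
    gi = pow(g, (p - 1) // r, p)          # maps g into the order-r subgroup
    bi = pow(b, (p - 1) // r, p)
    xi = next(e for e in range(r) if pow(gi, e, p) == bi)   # brute-force DL
    residues.append(xi)

# Chinese Remainder Theorem for pairwise-coprime moduli
x_crt, m = 0, 1
for xi, r in zip(residues, small_r):
    x_crt += m * ((xi - x_crt) * pow(m, -1, r) % r)
    m *= r

print(residues)        # expected: [0, 1, 3, 2861]
print(x_crt, m)        # expected: 70066 564522
```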

Now, if we had enough memory and computing power, we could use Baby-Step/Giant-Step again, which would use on the order of $2^{104/2+1}$ modular multiplications modulo $p$. With a 256-bit argument broken into $k=4$ (64-bit) computer words, and for small $k$, a multiplication costs $k^2=16$ word multiplications and additions with carry, and a modular reduction a little more (here $p$ has a special form that helps; I'll ignore that), so I guesstimate $2^{10\pm4}$ clock cycles per modular multiplication modulo $p$. So we are talking $2^{63\pm4}$ clock cycles, ignoring memory issues, on a CPU with a 64×64→128 multiplier. Pollard's rho solves the memory issues (at the cost of some more work) and eases parallelization. But still, $2^{63\pm4}$ cycles is decades.
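
A back-of-the-envelope check of that estimate (all figures are the rough guesses above, so the output is only an order of magnitude):

```python
mults  = 2 ** (104 // 2 + 1)                 # Baby-Step/Giant-Step group operations
cycles = mults * 2 ** 10                     # central guess of ~2^10 cycles per modmul
years  = cycles / (3e9 * 3600 * 24 * 365)    # one 3 GHz core, ignoring memory stalls
print(f"about {years:.0f} core-years")       # ~100 with the central guess; the +/-4
                                             # in the exponent spans roughly 6 to 1500
```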

So we'll have to use SNFS (rather than GNFS; I think SNFS applies here because $p$ is close to a power of a small prime), or perhaps Index Calculus. I think that SNFS won't be able to take advantage of our smaller $x'$. If Index Calculus can, which I do not rule out, it could be the algorithm of choice. I do not know, and asked.

fgrieu
  • 1
    To be honest, I was expecting some experimental results rather than displaying theoretical results. – kelalaka Dec 20 '23 at 16:39
  • Also, I had thought that, for factorization, the quadratic sieve was actually more efficient for numbers of this size. One thing I don't know: is the QS algorithm applicable to the DLog problem (and if so, does its relative efficiency still hold)? – poncho Dec 20 '23 at 17:06
  • @poncho: I do not know either, and for sure never came across a report of a DLOG problem addressed with QS [update: or does that go under the name Index Calculus?]. While we are at wondering, do you know if Index Calculus would be faster than NFS in this situation, taking into account both $p\approx2^{256}-2^{32}$ and the 123-bit $x$ reduced to 104 bits using Pohlig–Hellman? – fgrieu Dec 20 '23 at 18:00
  • Actually, the $x$ in reality is large (a $77$-digit number). My mentor presented this problem to me to show how hard it is to calculate discrete logs. I just created my own $g$ and $x$. That means that, in reality, when the sizes of $g$ and $x$ are comparable to that of $p$, the algorithm of choice would be SNFS? – esteregg Dec 20 '23 at 18:10
  • @esteregg: yes, for the $p$ at hand (I think, but am not sure, that $p$ is close enough to $2^{256}$ that SNFS can work and we do not need GNFS) and arbitrary $x$. And I'm mildly confident the order of magnitude of the runtime is a few hours at worst with a good implementation. – fgrieu Dec 20 '23 at 18:11
  • 1
    @fgrieu That is what I was thinking. With GNFS or SNFS we can easily solve DLP within few hours to days for prime less than $100$ digits. This implies RSA schemes with less than $100$ digit prime are not safe. Since with elliptic curve schemes there is no GNFS or Index Calculus method they are safe with prime less than $100$ digits. – esteregg Dec 20 '23 at 18:14
  • SNFS for factorization has been used extensively in the Cunningham project, and the expertise accumulated there could help answer your question. Whatever the case, your estimate above is true enough to justify using ECC, or a much larger $p$ for the DLP modulo $p$, and for RSA (which is a different problem). – fgrieu Dec 20 '23 at 18:18
  • Can you give a back of the envelope argument that says $2^{38}$ complexity (my reading of the graph) for 256 bit $p$ using GNFS translates to "core hour"s. Or point me to a source? – kodlu Dec 21 '23 at 17:49
  • 2
    @kodlu: for 256 bits, and baring an error in my transcription, the L-notation formula (and the graph) give for the base-2 log: 61.8 for IC, 46.7 for GNFS, 37.0 for SNFS (which perhaps works for the $p$ at hand). For 795 bits, it's 77.7 for GNFS, which according to this is 152+2400+625 core⋅year. Direct extrapolation yields 46 core⋅second, but is much likely way optimistic, so I say few hours. I wish I had a closer reference point with a modern implementation of NFS for DLP, but I do not. – fgrieu Dec 21 '23 at 18:13
  • thanks, that's exactly the sort of source I was looking for--even if a bit dated. – kodlu Dec 21 '23 at 18:17