1

The GNFS is the most efficient known algorithm for factoring semiprimes whose prime factors have equal bit length.

But its sequential linear-algebra step means (if I'm not wrong) that it requires at least 10 minutes on current hardware to factor such a balanced semiprime (in the case of a 382-bit semiprime).
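
For context, the standard heuristic cost estimates for GNFS and for the elliptic curve method (the usual "more parallelizable" candidate, discussed in the answers below) are

$$\text{GNFS: } \exp\!\left(\left((64/9)^{1/3} + o(1)\right)(\ln n)^{1/3}(\ln\ln n)^{2/3}\right), \qquad \text{ECM: } \exp\!\left(\left(\sqrt{2} + o(1)\right)\sqrt{\ln p\,\ln\ln p}\right),$$

where $n$ is the number to factor and $p$ is its smallest prime factor. GNFS is asymptotically fastest on balanced semiprimes (where $p \approx \sqrt{n}$), while ECM's cost depends only on the size of the smallest factor.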

Are there algorithms that are less efficient overall but more parallelizable, which would allow solving a batch of such semiprimes faster by using more resources? (The batch is sequential: a number must be factored by a participant before the next number to solve is known.)

For more information, here is pseudocode showing how the semiprime must be chosen by the person who submits the semiprime and its factors (the real program contains code ensuring the semiprime is chosen efficiently, so that factoring is the hard part):

uint1024 w = hash(common_challenge_to_participants);

// Check that |wOffset| <= 16 * |n|_2 (the allowed offset window).
uint64_t abs_offset = (common_challenge_to_participants.wOffset > 0)
                          ? common_challenge_to_participants.wOffset
                          : -common_challenge_to_participants.wOffset;
if (abs_offset > 16 * common_challenge_to_participants.common_target_nBits) {
    // common_target_nBits is between 350 and 400.
    LogPrintf("invalid wOffset\n");
    return false;
}

// Derive the starting point from the random seed.
mpz_t n, W;
mpz_init(n);
mpz_init(W);
mpz_import(W, 16, -1, 8, 0, 0, w.u64_begin()); // cast w to W

// Add the offset to w to get the submitted semiprime: n = w + wOffset.
if (common_challenge_to_participants.wOffset >= 0) {
    mpz_add_ui(n, W, abs_offset);
} else {
    mpz_sub_ui(n, W, abs_offset);
}

// Clear memory for W.
mpz_clear(W);

// Check that n has the target bit length.
if (mpz_sizeinbase(n, 2) != common_challenge_to_participants.common_target_nBits) {
    LogPrintf("invalid nBits\n");
    mpz_clear(n);
    return false;
}

// Divide n by the submitted factor.
mpz_t nP1, nP2;
mpz_init(nP1);
mpz_init(nP2);
mpz_import(nP1, 16, -1, 8, 0, 0, common_challenge_to_participants.nP1.u64_begin());
mpz_tdiv_q(nP2, n, nP1);

// Check that nP1 has exactly ceil(common_target_nBits / 2) bits.
const uint16_t nP1_bitsize = mpz_sizeinbase(nP1, 2);
const uint16_t expected_bitsize = (common_challenge_to_participants.common_target_nBits >> 1)
                                + (common_challenge_to_participants.common_target_nBits & 1);
if (nP1_bitsize != expected_bitsize) {
    LogPrintf("nP1 expected bitsize=%s, actual size=%s\n", expected_bitsize, nP1_bitsize);
    mpz_clear(n); mpz_clear(nP1); mpz_clear(nP2);
    return false;
}

// Recompute nP1 * nP2 to check that nP1 is a factor.
mpz_t n_check;
mpz_init(n_check);
mpz_mul(n_check, nP1, nP2);

// Check that nP1 * nP2 == n, i.e. that nP1 divides n exactly.
if (mpz_cmp(n_check, n) != 0) {
    LogPrintf("nP1 does not divide N. N=%s nP1=%s\n",
              mpz_get_str(NULL, 10, n), mpz_get_str(NULL, 10, nP1));
    mpz_clear(n); mpz_clear(nP1); mpz_clear(nP2); mpz_clear(n_check);
    return false;
}

// Check that nP1 <= nP2.
if (mpz_cmp(nP1, nP2) > 0) {
    LogPrintf("error: nP1 must be the smallest factor. N=%s nP1=%s\n",
              mpz_get_str(NULL, 10, n), mpz_get_str(NULL, 10, nP1));
    mpz_clear(n); mpz_clear(nP1); mpz_clear(nP2); mpz_clear(n_check);
    return false;
}

// Clear memory.
mpz_clear(n);
mpz_clear(n_check);

// Test nP1 and nP2 for primality.
int is_nP1_prime = mpz_probab_prime_p(nP1, params.MillerRabinRounds);
int is_nP2_prime = mpz_probab_prime_p(nP2, params.MillerRabinRounds);

// Clear memory.
mpz_clear(nP1);
mpz_clear(nP2);

// Check that both factors are prime.
if (is_nP1_prime == 0 || is_nP2_prime == 0) {
    LogPrintf("At least 1 composite factor found, rejected.\n");
    return false;
}

return true;
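
To make the search concrete: the checks above mean a participant must scan the window $n = w + \text{offset}$, with $|\text{offset}| \le 16 \times \text{nBits}$, for a value that splits into two primes of equal bit length. Below is a hypothetical sketch of that outer loop (not the contest code; try_full_factor is a placeholder for whatever factoring engine is actually used):

// Hypothetical outer loop implied by the verification code above.
#include <gmp.h>
#include <stdbool.h>
#include <stdint.h>

// Placeholder for the real factoring engine (ECM, QS or NFS):
// attempts a full factorisation n = p * q with p <= q.
bool try_full_factor(const mpz_t n, mpz_t p, mpz_t q);

bool find_balanced_semiprime(const mpz_t w, uint64_t nBits, mpz_t p, mpz_t q) {
    mpz_t n;
    mpz_init(n);
    const int64_t window = 16 * (int64_t)nBits;            // |offset| <= 16 * nBits
    const uint64_t half_bits = (nBits >> 1) + (nBits & 1); // ceil(nBits / 2)

    for (int64_t off = -window; off <= window; off++) {
        if (off >= 0) mpz_add_ui(n, w, (unsigned long)off);
        else          mpz_sub_ui(n, w, (unsigned long)(-off));

        // The candidate must have exactly the target bit length.
        if (mpz_sizeinbase(n, 2) != nBits) continue;

        // Cheap filter: a balanced semiprime has no tiny prime factors.
        // Extend with more trial division (and a few ECM curves) before
        // paying for the expensive step.
        if (mpz_even_p(n) || mpz_divisible_ui_p(n, 3) || mpz_divisible_ui_p(n, 5))
            continue;

        // Expensive step: only survivors reach the real factoring engine.
        if (try_full_factor(n, p, q) && mpz_sizeinbase(p, 2) == half_bits) {
            mpz_clear(n);
            return true; // p has ceil(nBits/2) bits and p <= q, as required
        }
    }
    mpz_clear(n);
    return false;
}

Almost every candidate in the window has a small prime factor, so cheap filtering discards it long before the expensive step runs; the handful of survivors are the real work.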

  • Update: I know about the other question, but my problem is doing it within a specific, very short timeframe.
user2284570
  • The aim is part of a small competition where the reward goes to the first responder. The fact that the record is still 9 minutes for a 382-bit semiprime seems to say there's no solution: but there are currently only 3 participants. – user2284570 Mar 08 '24 at 16:27
  • 3
    At least for numbers of 512 bits and larger, GNFS spends most of its time sieving, and that step at least is highly parallelizable and is routinely run on many machines. Thus the "it's sequential" part of the question is dubious. – fgrieu Mar 08 '24 at 17:24
  • 2
    IIRC, the Quadratic Sieve is supposed to be more efficient for numbers of that size; NFS becomes more efficient for larger composites... – poncho Mar 08 '24 at 17:58

2 Answers

2

One thought is the highly parallelisable elliptic curve method (ECM). There is a page listing the largest prime factors found using ECM and the corresponding composites. By both metrics, the problem size seems to fit in range.

ETA 20240313: Given the data provided by Samuel Neves below, it seems technically possible to factor such numbers using ECM in 40,000 4 GHz core-minutes, with a minimum run time of one minute. After that, the question becomes one of economics. You might investigate how much 3,000 or so Raspberry Pi 5 cards ($4\times$ 2.4 GHz) cost when bought in bulk, and amortise that upfront cost over the number of instances that you win, less the running cost of the rig. I'd suggest not considering FPGAs, as the cost of the elliptic curve method is dominated by high-precision multiplies, which benefit from specialised circuitry on CPUs. Your number of wins may be bounded by how long it takes someone to buy a bigger setup.
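
As a rough back-of-the-envelope check of that sizing (assuming, only approximately, that ECM throughput scales linearly with clock frequency):

$$40{,}000 \text{ core-minutes} \times \frac{4\ \text{GHz}}{2.4\ \text{GHz}} \approx 66{,}700 \text{ Pi-core-minutes}, \qquad \frac{66{,}700}{3000 \times 4 \text{ cores}} \approx 5.6 \text{ minutes per expected solution}.$$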

Note that the time between solutions will be exponentially distributed, but this is not affected by the introduction of a new target number if you fail to win a given instance.
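
To make the parallelism concrete, below is a minimal sketch (not a tuned implementation) that spreads independent ECM curves across threads with OpenMP, using the GMP-ECM library's ecm_factor entry point and the $B1$ bound Samuel Neves quotes. It assumes libecm is installed and that concurrent ecm_factor calls with NULL parameters are safe (each call then runs one curve with default settings):

// Minimal parallel-ECM sketch; build with: gcc -fopenmp par_ecm.c -lecm -lgmp
#include <stdio.h>
#include <gmp.h>
#include <ecm.h> // library interface of GMP-ECM

int main(int argc, char **argv) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s <semiprime>\n", argv[0]);
        return 1;
    }
    const double B1 = 110000000.0; // stage-1 bound for ~191-bit factors
    volatile int found = 0;

    #pragma omp parallel shared(found)
    {
        mpz_t n, f;
        mpz_init_set_str(n, argv[1], 10);
        mpz_init(f);

        // Each thread runs curves until any thread finds a factor.
        while (!found) {
            // One ECM curve; NULL selects default parameters and a fresh
            // random curve on every call.
            if (ecm_factor(f, n, B1, NULL) > 0 &&
                mpz_cmp_ui(f, 1) > 0 && mpz_cmp(f, n) < 0) {
                #pragma omp critical
                if (!found) {
                    found = 1;
                    gmp_printf("factor found: %Zd\n", f);
                }
            }
        }
        mpz_clear(n);
        mpz_clear(f);
    }
    return 0;
}

Because every curve is an independent trial, doubling the core count roughly halves the expected wall-clock time, with none of the coordination the NFS matrix step requires.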

Daniel S
  • I don't see that this page mentions the time and the machine they used. – kelalaka Mar 08 '24 at 21:48
  • 4
    ECM is not beating NFS for these parameters; you need an expected ~40000 runs with $B1=110000000$ to find a 191-bit factor, where each run takes ~1 minute at ~4 GHz, which is only faster if you have an incredibly high amount of parallelism. – Samuel Neves Mar 09 '24 at 01:11
  • @SamuelNeves Thanks for the data. Depending on the nature of the competition, I don't feel that this is necessarily a high amount of parallelism. If I get 2000 entrants with 4 CPUs each, I would expect a run to find a winner if they choose their curves independently. For comparison, I've heard estimates of over 3.5M CPUs being used for Bitcoin. – Daniel S Mar 09 '24 at 07:55
  • @DanielS I need to solve each number in the batch for less than the $60-per-semiprime reward. ECM is available on GPU. – user2284570 Mar 09 '24 at 17:16
  • @SamuelNeves does it simplify anything that the smallest factor has a known bit length? – user2284570 Mar 12 '24 at 15:06
  • 2
    The ECM running time is a function of the factor size, not the number to be factored. If you know that the factor is relatively small, ECM may be faster than the QS/NFS. The balanced semi-prime case is the worst case for ECM. – Samuel Neves Mar 12 '24 at 19:51
  • @SamuelNeves and in my case, the smallest factor has to be at least half the size of the number to be factored… – user2284570 Mar 17 '24 at 15:36
-2

I don't fully understand how to do it, but msieve can perform the linear algebra/Krylov part on several GPUs, which solves the sequential problem: the rest, in theory, is just a matter of adding more cores.

It is just a matter of modifying the CADO-NFS default script.

However, it turns out you have to find the correct number as the result of a hash: so the real effort is to factor something like 20,000 such numbers in that period, not just 1. That won't happen for less than $100…

user2284570