I'm trying to implement multiple-precision arithmetic operations modulo P, with P < 2^256.
More specifically, P = 2^256 - 2^32 - 977.
I want to support the following operations: +, -, *, /, pow (each mod P).
As P is close to 2^256, numbers are represented with 8 u32 or 4 u64.
a + b mod P can be done like this (in pseudo code):
n = a + b
if overflow: # i.e. a + b >= 2^256, n holds the low 256 bits
    # add 2^256 - P to come back modulo P
    n += 2**32 + 977
else:
    if n >= P:
        # P <= n < 2^256
        n -= P
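To make the pseudo code above concrete, here is a minimal Python sketch on 4 little-endian 64-bit limbs. The helper names (`to_limbs`, `from_limbs`, `add_limbs`, `sub_limbs`, `add_mod_p`) are made up for illustration, and both inputs are assumed to be already reduced (a, b < P).

```python
P = 2**256 - 2**32 - 977
C = 2**32 + 977            # 2^256 - P
MASK64 = 2**64 - 1

def to_limbs(x):
    """Split an integer < 2^256 into 4 little-endian 64-bit limbs."""
    return [(x >> (64 * i)) & MASK64 for i in range(4)]

def from_limbs(limbs):
    """Inverse of to_limbs."""
    return sum(l << (64 * i) for i, l in enumerate(limbs))

def add_limbs(a, b):
    """Limb-wise add with carry propagation; returns (limbs, carry_out)."""
    out, carry = [], 0
    for x, y in zip(a, b):
        s = x + y + carry
        out.append(s & MASK64)
        carry = s >> 64
    return out, carry

def sub_limbs(a, b):
    """Limb-wise subtract with borrow propagation; returns (limbs, borrow_out)."""
    out, borrow = [], 0
    for x, y in zip(a, b):
        d = x - y - borrow
        out.append(d & MASK64)   # wraps like an unsigned 64-bit subtraction
        borrow = 1 if d < 0 else 0
    return out, borrow

P_LIMBS = to_limbs(P)
C_LIMBS = to_limbs(C)

def add_mod_p(a, b):
    """(a + b) mod P for limb arrays a, b, assuming a < P and b < P."""
    n, carry = add_limbs(a, b)
    if carry:
        # a + b >= 2^256: n holds a + b - 2^256, so adding C gives a + b - P
        n, _ = add_limbs(n, C_LIMBS)
        return n
    # No overflow: subtract P and keep the result only if it did not borrow
    t, borrow = sub_limbs(n, P_LIMBS)
    return n if borrow else t
```

The `n >= P` test is done as a trial subtraction, since comparing and subtracting cost about the same on limb arrays.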
--
For a * b mod P, my first intention was to simply do a long multiplication, but that seems slow as I would need the carry to be 256 bits as well.
Are there any recommended algorithms to calculate a * b modulo P efficiently (using arrays of u32 / u64)?
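For reference, here is one possible approach, sketched in Python: a schoolbook product on 64-bit limbs followed by a reduction that uses the special form of P. Since 2^256 ≡ 2^32 + 977 (mod P), the high 256 bits of the product can be multiplied by 2^32 + 977 and added to the low 256 bits. In the schoolbook loop the carry only ever needs one limb, because limb * limb + limb + limb always fits in 128 bits. The function names are hypothetical, and the folding step is written on Python integers for brevity rather than on limbs.

```python
P = 2**256 - 2**32 - 977
C = 2**32 + 977            # 2^256 mod P
MASK64 = 2**64 - 1
MASK256 = 2**256 - 1

def to_limbs(x, n=4):
    """Split an integer into n little-endian 64-bit limbs."""
    return [(x >> (64 * i)) & MASK64 for i in range(n)]

def from_limbs(limbs):
    """Inverse of to_limbs."""
    return sum(l << (64 * i) for i, l in enumerate(limbs))

def mul_limbs(a, b):
    """Schoolbook product of two limb arrays; returns len(a) + len(b) limbs."""
    out = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        carry = 0
        for j, y in enumerate(b):
            # At most (2^64-1)^2 + 2*(2^64-1) = 2^128 - 1: fits in two limbs
            t = out[i + j] + x * y + carry
            out[i + j] = t & MASK64
            carry = t >> 64          # a single 64-bit limb
        out[i + len(b)] = carry
    return out

def reduce_wide(wide):
    """Reduce an integer < 2^512 modulo P by folding the high part down."""
    # Each pass replaces hi * 2^256 with hi * C; three passes are enough
    while wide >> 256:
        wide = (wide & MASK256) + (wide >> 256) * C
    # Now wide < 2^256 < 2P, so one conditional subtraction finishes the job
    if wide >= P:
        wide -= P
    return wide

def mul_mod_p(a, b):
    """(a * b) mod P for 4-limb arrays a, b, assuming a < P and b < P."""
    wide = from_limbs(mul_limbs(a, b))   # 512-bit product, 8 limbs
    return to_limbs(reduce_wide(wide))
```

In a real u64 implementation the fold would itself be a small limb multiplication (4 limbs times the 33-bit constant), which avoids any general division.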
I'm mostly interested in the multiplication because:
- a^x mod P can be an optimized version of a * a * ... * a mod P
- a / b mod P can be calculated as a * b^{P-2} mod P, using Fermat's little theorem
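Both points can be sketched on top of a single modular multiplication. In the Python below, `mul_mod` is a stand-in built on Python's big integers (a limb-based multiply would take its place), and `pow_mod`, `inv_mod` and `div_mod` are hypothetical names. The inverse relies on P being prime and on b not being 0 mod P.

```python
P = 2**256 - 2**32 - 977

def mul_mod(a, b):
    """Stand-in for the limb-based modular multiplication."""
    return (a * b) % P

def pow_mod(a, x):
    """a^x mod P by square-and-multiply, scanning x from the low bit up."""
    result = 1
    base = a % P
    while x > 0:
        if x & 1:
            result = mul_mod(result, base)
        base = mul_mod(base, base)
        x >>= 1
    return result

def inv_mod(b):
    """b^(-1) mod P via Fermat's little theorem: b^(P-2) mod P."""
    if b % P == 0:
        raise ZeroDivisionError("0 has no inverse modulo P")
    return pow_mod(b, P - 2)

def div_mod(a, b):
    """a / b mod P, computed as a * b^(P-2) mod P."""
    return mul_mod(a, inv_mod(b))
```

Square-and-multiply needs roughly 256 squarings plus one multiplication per set bit of the exponent, so an inversion costs a few hundred modular multiplications.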
Note: Bitcoin implements these operations with numbers represented as 10 x 26-bit limbs (each stored in a uint32) instead of 8 x uint32, so each "digit" keeps 6 spare bits, but I'm not familiar with their methods.
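As a generic illustration of why spare bits per limb can help (this is not taken from Bitcoin's code, and all names here are made up), 26-bit limbs stored in 32-bit words can be added limb by limb without propagating carries right away. The carries are then resolved later in a single normalization pass.

```python
LIMB_BITS = 26
N_LIMBS = 10               # 10 * 26 = 260 bits >= 256
MASK26 = (1 << LIMB_BITS) - 1

def to_limbs26(x):
    """Split an integer < 2^260 into 10 little-endian 26-bit limbs."""
    return [(x >> (LIMB_BITS * i)) & MASK26 for i in range(N_LIMBS)]

def from_limbs26(limbs):
    """Recombine limbs; also works when limbs exceed 26 bits."""
    return sum(l << (LIMB_BITS * i) for i, l in enumerate(limbs))

def add_lazy(a, b):
    """Limb-wise add with no carry propagation.

    With normalized inputs each result limb is below 2^27, so it still
    fits comfortably in a 32-bit word.
    """
    return [x + y for x, y in zip(a, b)]

def normalize(limbs):
    """Propagate the pending carries so every limb but the top is < 2^26."""
    out, carry = [], 0
    for l in limbs[:-1]:
        t = l + carry
        out.append(t & MASK26)
        carry = t >> LIMB_BITS
    out.append(limbs[-1] + carry)   # top limb absorbs the final carry
    return out
```

With 6 spare bits, up to 64 normalized values can be summed this way before any limb can overflow its 32-bit word.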
Kochanski seems like a good fit but there is little detail on the algo to be honest
– Ervadac Jun 25 '21 at 13:24