In brief: If you know $(a,N)$, you can speed the computation up by precomputing some of the powers of $a$.
Let $x=x_n\dots x_1x_0=\sum_{i=0}^n x_i 2^i$ be the binary expansion of $x$, and let $a_j=a^{2^j}\pmod N$.
Very naively:
$$
a^x \pmod N = \overbrace{a*(a*(a*\dots*(a*a)\dots))}^{x\text{ factors}}
$$
This requires $\Theta(x)$ multiplications (exactly $x-1$).
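As a concrete baseline, the naive method can be sketched as follows (a minimal Python sketch; the function name is our own):

```python
def naive_pow_mod(a, x, N):
    # Multiply a into the running product x times: Theta(x) multiplications.
    y = 1
    for _ in range(x):
        y = (y * a) % N
    return y
```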
Traditional Square and Multiply:
$$
a^x \pmod N = a^{x_n \dots x_0} = a^{2^n x_n+ \dots+ x_0}
=(a^{2^n})^{x_n} (a^{2^{n-1}})^{x_{n-1}} \dots (a^{2^0})^{x_0}
=\prod_{i=0}^n a_i^{x_i}
$$
So, to use this for efficient exponentiation, we maintain a product variable $y$ and a squaring variable $e$, initialised as $(y,e)=(1,a)$. We then repeatedly square $e$, so that after $j$ squarings $e=a_j$, and multiply $y$ by $e$ whenever the corresponding bit $x_j=1$. How much work does this require? We need $n=\lfloor\log_2(x)\rfloor$ squarings to run through the $a_j$, then on average $n/2$ further multiplications to assemble $a^x$ (assuming an "average" value of $x$ has half its bits set), and at most $n$. Total: $2n$ multiplications in the worst case, $\frac{3}{2}n$ on average.
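The loop just described might be sketched as below (a Python sketch; scanning from the low bit upwards means we never need to know $n$ in advance):

```python
def square_and_multiply(a, x, N):
    # Maintain (y, e) = (product so far, a^(2^j) mod N), as in the text.
    y, e = 1, a % N
    while x > 0:
        if x & 1:           # bit x_j is set: multiply a_j into the product
            y = (y * e) % N
        e = (e * e) % N     # square: a_j -> a_{j+1}
        x >>= 1
    return y
```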
Precomputational Optimisations of Square and Multiply
By calculating each $a_j=a^{2^j}\pmod N$ in advance, we only need to perform the (at most) $n$ multiplications when we come to calculate $a^x$; that is, the online phase involves no squarings at all. However, if $x$ may be very large, this involves storing a large amount of data; by storing only some subset of the $a_j$, we can trade space against the few online squarings needed to reach the remaining values. Moreover, if one so wished, one could store products such as $b=a_3*a_1$, which would reduce the online cost of calculating $a^{1010_2}$ to the cost of looking up $b$.
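The offline/online split might be sketched as follows (function names are illustrative; the offline table holds $a_j$ for $j=0,\dots,n-1$, and the online phase uses only multiplications):

```python
def precompute_powers(a, N, n):
    # Offline: a_j = a^(2^j) mod N for j = 0, ..., n-1.
    table = [a % N]
    for _ in range(n - 1):
        table.append((table[-1] * table[-1]) % N)
    return table

def pow_with_table(table, x, N):
    # Online: multiply in a_j for each set bit x_j; no squarings needed.
    y, j = 1, 0
    while x > 0:
        if x & 1:
            y = (y * table[j]) % N
        x >>= 1
        j += 1
    return y
```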
Deciding on a balance for this trade-off provides an interesting question, since at some point storing too many powers becomes unreasonable. For example, it would be possible to precompute and store $a^x\pmod N$ for every $x\in\{0,\dots,2^t-1\}$. This would reduce calculating $a^x$ to a single look-up, but such a table has size $\Theta(2^t)$, which may well be impractically large.
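The extreme end of the trade-off, a full table over all $t$-bit exponents, can be sketched as (parameter names assumed; each entry costs one multiplication to build):

```python
def precompute_full_table(a, N, t):
    # Theta(2^t) space: table[x] = a^x mod N for x in {0, ..., 2^t - 1}.
    table = [1]
    for _ in range(2**t - 1):
        table.append((table[-1] * a) % N)
    return table
```

After this offline step, computing $a^x \pmod N$ is simply `table[x]`, at the cost of $2^t$ stored residues.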