3

For elliptic curve cryptography, the procedure to find a base point that generates a subgroup with order $n$ is:

  1. Calculate the order $N$ of the elliptic curve (using Schoof's)
  2. Choose $n$. $n$ must be prime and a divisor of $N$
  3. Compute cofactor $h = \frac{N}{n}$
  4. Choose a random point $P$ on the curve
  5. Compute $G = hP$
  6. If $G$ is 0, then go back to step 4. Otherwise, you've found a generator with order $n$ and cofactor $h$

Source

Is the purpose of the cofactor here solely to increase the efficiency in finding a large sub-group?

I suppose if you didn't use a co-factor and instead tried to brute-force compute whether the random point, $P$, was a generator for a sub-group of size $n$ you would have to do $n$ iterations, which would be impossible on modern computers. But, I want to confirm.

Edit: My last paragraph is wrong b/c we can use repeated squaring to calculate $G = nP$

kelalaka
  • 48,443
  • 11
  • 116
  • 196
Foobar
  • 247
  • 1
  • 7

2 Answers2

3

Is the purpose of the cofactor here solely to increase the efficiency in finding a large sub-group?

Well, this alternate algorithm would work (assuming $n$ is prime; actually, both algorithms assume $n$ is prime):

  1. Choose a random point P on the curve (other than the point at infinity)

  2. Compute $G = nP$

  3. If $G \ne 0$, go back to step 4. Otherwise, you've found a generator with order $n$ and cofactor $h$

That algorithm would work; however it is obviously much less efficient; partly because computing $nP$ will be considerably more expensive than computing $hP$ (as we usually have $n \ggg h$), and also in the modified version, you'll take an expected $h$ iterations before finding a point, while in the original algorithm, it's an expected $1 + 1/n$ iterations (that is, the test at the end almost never fails).

poncho
  • 147,019
  • 11
  • 229
  • 360
  • Thanks for the response. Can you explain your last sentence in more detail? (Is modified referring to the version that uses the cofactor or the slower algorithm)? – Foobar Jan 19 '22 at 23:27
  • Also, with regards to iterations. I'm assuming you mean there is a new iteration each time we guess a new random point, $P$. Given that under both algorithms the point $P$ is chosen randomly, why would there be less guesses in the original algorithm? – Foobar Jan 19 '22 at 23:30
  • @Roymunson After picking a random point that satisfies the curve equation, an iteration of your algorithm will succeed $N-h$ out of $N$ times (i.e. almost certainly), and poncho's alternate algorithm (if you consider the iteration to begin at the time of choosing a random point but before the infinity check) will with each iteration succeed only $n-1$ out of $N$ times (i.e. approx. $1$ out of $h$ times). The advantage of poncho's algorithm is that it won't exclude any valid possibilities – knaccc Jan 20 '22 at 01:51
  • @knaccc: actually, the original algorithm won't exclude any valid possibilities either; in fact, it'll generate all possible generators with equal probability – poncho Jan 20 '22 at 03:37
  • @poncho thanks for pointing that out. Now I think about it, that does make perfect sense. – knaccc Jan 20 '22 at 05:06
1

This is an empirical result to complement Poncho's answer;

Take Curve25519 which has a cofactor $8$ as the order $n$ of the group factors as

$\small{n = 2^3 * 7237005577332262213973186563042994240857116359379907606001950938285454250989}$

  • $h = 8 $
  • $q = n/8$

We use SageMath and SageMath random_element function in which it may return the identity element $\mathcal{O}$ of the curve ( the of chance getting it is negligible) , on Curve25519 $\mathcal{O} = (0:1:0)$ on Weierstrass form.

import time

def randomBasePointByCofactor(E,identity,cofactor):

s = time.time()
ci = 0
n = E.order()
for i in range(1,10000):
    P = E.random_element()
    if cofactor*P != identity:
        ci = ci +1
e = time.time()
print("time elapsed on randomBasePointByCofactor", e-s)
return (ci)

def randomBasePointByOrder(E,identity,cofactor):

s = time.time()
ci = 0
n = Integer(E.order() / cofactor)
for i in range(1,10000):
    P = E.random_element()
    if n*P == identity:
        ci = ci +1

e = time.time()
print("time elapsed on randomBasePointByOrder", e-s)
return (ci)    

p = 0x7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffed K = GF(p) A = K(0x76d06) B = K(0x01) E = EllipticCurve(K, ((3 - A^2)/(3 * B^2), (2 * A^3 - 9 * A)/(27 * B^3)))

IP = E((0,1,0)) Bound = 10000

print(" number of found generators =" , randomBasePointByCofactor(E,IP,8), "/", Bound)

print(" number of found generators =", randomBasePointByOrder(E,IP,8),"/", Bound)

A sample result is

time elapsed on randomBasePointByCofactor 1.9164307117462158
 number of found generators = 9999 / 10000
time elapsed on randomBasePointByOrder 64.77565383911133
 number of found generators = 1267 / 10000

Therefore

  • the cofactor method is faster ~32 times faster in the experiments.

    We can explain this in simple terms as; $8$ requires 4 doublings and 1 addition, whereas $n$ requires 251 doublings and 125 addition with naive double-and-algorithm. This gives ~75 times more calculations if we assume doubling and additions have the same speed which they are not.

  • the cofactor method produces more generators than the order method since the $1/8$ of the random elements from $8\cdot q$ falls into the large prime $q$ of the Curve25519.

Therefore, the cofactor method is preferable.

kelalaka
  • 48,443
  • 11
  • 116
  • 196