Is using a cofactor to find a base point only for performance reasons?

Question

For elliptic curve cryptography, the procedure to find a base point that generates a subgroup with order $n$ is:

1. Calculate the order $N$ of the elliptic curve (using Schoof's algorithm)
2. Choose $n$. $n$ must be prime and a divisor of $N$
3. Compute the cofactor $h = \frac{N}{n}$
4. Choose a random point $P$ on the curve
5. Compute $G = hP$
6. If $G$ is 0 (the point at infinity), then go back to step 4. Otherwise, you've found a generator with order $n$ and cofactor $h$
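The procedure above can be sketched in plain Python on a toy curve. This is only an illustration under stated assumptions: the curve $y^2 = x^3 + x + 1$ over $\mathbb{F}_{23}$ is chosen here for its small size, brute-force point counting stands in for Schoof's algorithm, and $n$ is taken to be the largest prime factor of $N$. The helper names (`add`, `mul`, `find_base_point`) are hypothetical, not from any library.

```python
import random

# Toy parameters (an assumption for illustration): y^2 = x^3 + a*x + b over F_p.
p, a, b = 23, 1, 1
O = None  # the point at infinity (group identity)

def add(P, Q):
    """Add two points in affine coordinates; None represents the identity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) = O; also covers doubling a point with y = 0
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def mul(k, P):
    """Scalar multiplication k*P by double-and-add."""
    R = O
    while k > 0:
        if k & 1:
            R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

def largest_prime_factor(m):
    """Trial division; fine for toy sizes only."""
    f, best = 2, 1
    while f * f <= m:
        while m % f == 0:
            best, m = f, m // f
        f += 1
    return m if m > 1 else best

# Step 1: curve order N (brute force here, standing in for Schoof's algorithm).
points = [(x, y) for x in range(p) for y in range(p)
          if (y * y - (x**3 + a * x + b)) % p == 0]
N = len(points) + 1  # +1 for the point at infinity
n = largest_prime_factor(N)  # step 2: a prime divisor of N
h = N // n                   # step 3: the cofactor

def find_base_point(rng=random):
    while True:
        P = rng.choice(points)  # step 4: random (non-identity) point
        G = mul(h, P)           # step 5
        if G is not O:          # step 6: retry only if hP is the identity
            return G

G = find_base_point()
print("N =", N, "n =", n, "h =", h, "G =", G)
```

Because $nG = NP = 0$ for every point $P$, any nonzero $G = hP$ necessarily has order exactly $n$ when $n$ is prime, which is why step 6 only has to test for the identity.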

Is the purpose of the cofactor here solely to increase the efficiency in finding a large sub-group?

I suppose if you didn't use a co-factor and instead tried to brute-force compute whether the random point, $P$, was a generator for a sub-group of size $n$ you would have to do $n$ iterations, which would be impossible on modern computers. But, I want to confirm.

Edit: My last paragraph is wrong because we can use double-and-add (the additive analogue of repeated squaring) to calculate $G = nP$

Dupe of How to find the order of a generator on an elliptic curve? — kelalaka, Jan 19 '22 at 23:55
@kelalaka, your linked Q&A is not really a duplicate, as it does not discuss the performance reasons at all. The main thrust of this question here is not •how• to do it, but why: performance alone or performance+otherReasons. — Andreas ZUERCHER, Jan 20 '22 at 15:09

score 3 · Accepted Answer · answered Jan 19 '22 at 23:23

Is the purpose of the cofactor here solely to increase the efficiency in finding a large sub-group?

Well, this alternate algorithm would work, with steps 4 to 6 replaced as follows (assuming $n$ is prime; actually, both algorithms assume $n$ is prime):

4. Choose a random point $P$ on the curve (other than the point at infinity)
5. Compute $G = nP$
6. If $G \ne 0$, go back to step 4. Otherwise, $P$ is a generator with order $n$ and cofactor $h$

That algorithm would work; however, it is much less efficient, for two reasons. First, computing $nP$ is considerably more expensive than computing $hP$ (as we usually have $n \ggg h$). Second, the modified version (the one that computes $nP$) takes an expected $h$ iterations, approximately, before finding a point, while the original algorithm takes an expected $1 + 1/n$ iterations, approximately (that is, the test at the end almost never fails).
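The iteration counts can be checked with a short calculation. This is a sketch under two assumptions not stated above: $N = hn$ with $n \nmid h$, so that exactly $h$ points (including the point at infinity) satisfy $hP = 0$ and exactly $n$ points satisfy $nP = 0$; and $P$ is drawn uniformly, from all $N$ points in the original algorithm and from the $N - 1$ points other than the point at infinity in the alternate one. Each iteration is then an independent trial, so the expected number of iterations is the reciprocal of the success probability.

```latex
\begin{align*}
\text{original:}\quad & \Pr[hP \ne 0] = 1 - \frac{h}{N} = 1 - \frac{1}{n}
  &&\Longrightarrow\quad \mathbb{E}[\text{iterations}] = \frac{n}{n-1} = 1 + \frac{1}{n-1} \approx 1 + \frac{1}{n},\\
\text{alternate:}\quad & \Pr[nP = 0] = \frac{n-1}{N-1}
  &&\Longrightarrow\quad \mathbb{E}[\text{iterations}] = \frac{N-1}{n-1} \approx h.
\end{align*}
```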

poncho

Thanks for the response. Can you explain your last sentence in more detail? (Is modified referring to the version that uses the cofactor or the slower algorithm)? – Foobar Jan 19 '22 at 23:27
Also, with regards to iterations. I'm assuming you mean there is a new iteration each time we guess a new random point, $P$. Given that under both algorithms the point $P$ is chosen randomly, why would there be less guesses in the original algorithm? – Foobar Jan 19 '22 at 23:30
@Roymunson After picking a random point that satisfies the curve equation, an iteration of your algorithm will succeed $N-h$ out of $N$ times (i.e. almost certainly), and poncho's alternate algorithm (if you consider the iteration to begin at the time of choosing a random point but before the infinity check) will with each iteration succeed only $n-1$ out of $N$ times (i.e. approx. $1$ out of $h$ times). The advantage of poncho's algorithm is that it won't exclude any valid possibilities – knaccc Jan 20 '22 at 01:51
@knaccc: actually, the original algorithm won't exclude any valid possibilities either; in fact, it'll generate all possible generators with equal probability – poncho Jan 20 '22 at 03:37
@poncho thanks for pointing that out. Now I think about it, that does make perfect sense. – knaccc Jan 20 '22 at 05:06

kelalaka · Answer 2 · 2022-01-21T10:27:27.527

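This is an empirical result to complement poncho's answer.

Take Curve25519, which has cofactor $8$. Here $n$ denotes the order of the whole curve group (called $N$ in the question) and $q$ its large prime factor (called $n$ in the question). The group order factors as

$\small{n = 2^3 \cdot 7237005577332262213973186563042994240857116359379907606001950938285454250989}$

$h = 8$
$q = n/8$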

We use SageMath and its random_element function, which may return the identity element $\mathcal{O}$ of the curve (the chance of getting it is negligible). For Curve25519 in Weierstrass form, $\mathcal{O} = (0:1:0)$.

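# SageMath code (not plain Python): ^ is exponentiation, GF and EllipticCurve are Sage built-ins.
import time

def randomBasePointByCofactor(E, identity, cofactor):
    # Count random points P for which cofactor*P is not the identity,
    # i.e. cofactor*P generates the prime-order subgroup.
    s = time.time()
    ci = 0
    n = E.order()  # not used by this method
    # Note: range(1, 10000) runs 9999 trials, so at most 9999 can be counted.
    for i in range(1, 10000):
        P = E.random_element()
        if cofactor*P != identity:
            ci = ci + 1
    e = time.time()
    print("time elapsed on randomBasePointByCofactor", e-s)
    return ci


def randomBasePointByOrder(E, identity, cofactor):
    # Count random points P that already lie in the prime-order subgroup,
    # i.e. n*P is the identity, where n = #E / cofactor.
    s = time.time()
    ci = 0
    n = E.order() // cofactor
    for i in range(1, 10000):
        P = E.random_element()
        if n*P == identity:
            ci = ci + 1
    e = time.time()
    print("time elapsed on randomBasePointByOrder", e-s)
    return ci


# Curve25519 (Montgomery form B*y^2 = x^3 + A*x^2 + x) converted to short Weierstrass form
p = 0x7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffed
K = GF(p)
A = K(0x76d06)
B = K(0x01)
E = EllipticCurve(K, ((3 - A^2)/(3 * B^2), (2 * A^3 - 9 * A)/(27 * B^3)))
IP = E((0,1,0))  # identity element (point at infinity)
Bound = 10000
print(" number of found generators =", randomBasePointByCofactor(E,IP,8), "/", Bound)
print(" number of found generators =", randomBasePointByOrder(E,IP,8), "/", Bound)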

A sample result is

time elapsed on randomBasePointByCofactor 1.9164307117462158
 number of found generators = 9999 / 10000
time elapsed on randomBasePointByOrder 64.77565383911133
 number of found generators = 1267 / 10000

Therefore

the cofactor method is about 34 times faster in this experiment ($64.78 / 1.92 \approx 33.8$).

We can explain this in simple terms: multiplying by $8$ requires 4 doublings and 1 addition, whereas multiplying by $q = n/8$ requires roughly 251 doublings and 125 additions with the naive double-and-add algorithm. This gives ~75 times more calculations if we assume doublings and additions have the same cost, which they do not.
the cofactor method produces more generators than the order method, since only about $1/8$ of the random points of the full group of order $8\cdot q$ fall into the subgroup of large prime order $q$ of Curve25519.

Therefore, the cofactor method is preferable.
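The operation counts quoted above can be reproduced approximately with a few lines of plain Python. This is a rough sketch under one assumed convention: a left-to-right double-and-add that starts from the identity, so every bit of the scalar costs one doubling and every 1-bit costs one addition. The function name `double_and_add_cost` is hypothetical, and the exact counts for $q$ may differ somewhat from the rounded figures in the text.

```python
def double_and_add_cost(k):
    """Return (doublings, additions) for naive left-to-right double-and-add,
    with the accumulator starting at the identity (assumed convention)."""
    bits = bin(k)[2:]
    return len(bits), bits.count("1")

h = 8
q = 7237005577332262213973186563042994240857116359379907606001950938285454250989

dh, ah = double_and_add_cost(h)
dq, aq = double_and_add_cost(q)

print("cofactor h:", dh, "doublings,", ah, "additions")
print("prime q:   ", dq, "doublings,", aq, "additions")
# Crude cost ratio, treating a doubling and an addition as equally expensive.
print("ratio:", (dq + aq) / (dh + ah))
```

The ratio printed at the end treats the two operations as equally expensive, which is a simplification; the measured speed-up also depends on the per-trial cost of generating a random point, which is the same for both methods.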

Thank you, this script was extremely helpful for seeing the theory in action. — Foobar, Jan 26 '22 at 00:24
That is the reason one needs to implement it to see. Even Paul Erdős had trouble with the Monty Hall problem until he saw a computer simulation. — kelalaka, Jan 26 '22 at 10:22
