I'm trying to understand if using non-injective encodings for Elliptic Curve ElGamal encryption is dangerous.
A standard probabilistic encoding defined by Koblitz for elliptic curves over $\mathbb{F}_p$ works roughly as follows (see this answer for details):
- Fix some value $\ell < \operatorname{bitlength}(p)$, the bit length of the message $m$
- Choose a random integer $r$ of fixed bit length, short enough that $r \| m < p$ (at most $\operatorname{bitlength}(p) - \ell - 1$ bits)
- Set $x = r \| m$, compute the corresponding $y^2$ from the curve equation, and check whether it is a square in $\mathbb{F}_p$. If so, take one of the two possible $y$ values and set $(x, y)$ as the encoding of $m$. Otherwise, sample a new $r$ and start over.
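To make the steps above concrete, here is a minimal sketch of how I understand the encoding. Everything specific in it is my own assumption for illustration: the toy prime `P`, the curve coefficients `A` and `B` (short Weierstrass form), the split into `MSG_BITS` and `PAD_BITS`, and the names `encode` and `decode`. The parameters are far too small for real use.

```python
import secrets

# Toy parameters, chosen only for illustration (NOT secure).
P = 2**31 - 1      # prime with P % 4 == 3, so a square root is a single pow()
A, B = 2, 3        # assumed curve: y^2 = x^3 + A*x + B over F_P
MSG_BITS = 16      # ell: number of bits reserved for the message m
PAD_BITS = 14      # fixed length of r; 14 + 16 = 30 bits, so x = r || m < P


def encode(m, max_tries=128):
    """Koblitz-style probabilistic encoding of m as a curve point (x, y)."""
    if not 0 <= m < (1 << MSG_BITS):
        raise ValueError("message does not fit in MSG_BITS bits")
    for _ in range(max_tries):
        r = secrets.randbits(PAD_BITS)       # fresh random padding
        x = (r << MSG_BITS) | m              # x = r || m
        rhs = (x * x * x + A * x + B) % P    # candidate value of y^2
        y = pow(rhs, (P + 1) // 4, P)        # square root if rhs is a square
        if (y * y) % P == rhs:               # check that rhs really is a square
            return x, y
    raise RuntimeError("no valid padding found")


def decode(point):
    """Recover m by dropping the padding: the low MSG_BITS bits of x."""
    x, _ = point
    return x & ((1 << MSG_BITS) - 1)


if __name__ == "__main__":
    pt = encode(0xBEEF)
    print(pt, hex(decode(pt)))
```

The retry loop is the reason the running time depends on the message and the randomness: each trial succeeds with probability of roughly one half, so the number of iterations varies from call to call.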
It is not constant time, so timing attacks are possible.
It is not a one-to-one correspondence: the same message $m$ can be mapped to multiple points, one for each valid $r$. However, I don't know whether point collisions can be exploited, that is, whether one can encode $r \| m$ to a point $P$ that then decodes to a different message $m^*$.
The Elligator paper motivates injective encodings by the need to evade traffic inspection. That does not apply here, because the encoded point will be encrypted.
So, does message encoding have to be injective? If so, why?
EDIT: I was wrong to assume that mapping a message to multiple points means we must have collisions. The message space is small, and the Koblitz algorithm uses a fixed-length $r$, so there are no collisions. Each message can be mapped to several points, but the sets of points for different messages are disjoint.
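A small experiment illustrates the point. This is only a sketch under my own assumptions: a toy prime `P`, curve coefficients `A` and `B`, a short padding length `PAD_BITS` so that all paddings can be enumerated, and the hypothetical helper names `encodings` and `decode_x`. Because decoding just reads the low `MSG_BITS` bits of $x$, every candidate $x$ belongs to exactly one message.

```python
# Toy parameters, chosen only for illustration (NOT secure).
P = 2**31 - 1      # prime with P % 4 == 3
A, B = 2, 3        # assumed curve: y^2 = x^3 + A*x + B over F_P
MSG_BITS = 16      # bits reserved for the message
PAD_BITS = 8       # kept short so every padding r can be enumerated


def encodings(m):
    """All x-coordinates that the encoding could output for message m."""
    xs = set()
    for r in range(1 << PAD_BITS):
        x = (r << MSG_BITS) | m                  # x = r || m
        rhs = (x * x * x + A * x + B) % P
        y = pow(rhs, (P + 1) // 4, P)
        if (y * y) % P == rhs:                   # keep x only if it lies on the curve
            xs.add(x)
    return xs


def decode_x(x):
    """Decoding ignores the padding and returns the low MSG_BITS bits."""
    return x & ((1 << MSG_BITS) - 1)


e1 = encodings(0x1234)
e2 = encodings(0x1235)
print(len(e1), len(e2), e1.isdisjoint(e2))
```

Each message has many valid encodings, but the two sets share no element, so a point can never decode to a message other than the one it was built from.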
My confusion comes from this paper, which states that:
...one can construct a probabilistic injective encoding with $\ell$ equal to about half of the size of $G$, as we show in §2.4, but we do not know of provable constructions achieving a better $\ell$ in general.
I checked the Koblitz algorithm and didn't see any "half of the size of $G$" restriction.
Now it seems that the limit of half the size of $G$ might come from the error probability, or from the desire to guarantee that every message of up to $\ell$ bits can be encoded.
Anyway, is there any advantage (from a security point of view) in using, say, the Elligator encoding (a bijective mapping) instead of Koblitz's probabilistic one for encoding the plaintext?