Information leak in ElGamal encryption with message in base group

Question

Assume a finite commutative base group $\mathbb B$, some $g$ in $\mathbb B$, and $\mathbb{G} = \langle g\rangle$ the subgroup that $g$ generates, with choice of $\mathbb B$ and $g$ such that ElGamal encryption would be secure for a random message in $\mathbb G$ (semantically/under CPA).

ElGamal encryption still allows decryption if we use a random message in $\mathbb B$ rather than in $\mathbb G$. How much information about the message can leak if we do this? If necessary to get a result, restrict to $\mathbb B=\mathbb Z^*_p$, perhaps with $p$ an odd prime, and/or to $\mathbb G$ of prime order.

Notation: the order of $\mathbb G$ generated by $g\in \mathbb B$ (written multiplicatively) is $q$; private key $x$ is uniformly random with $0<x<q$; public key is $h=g^x$. Encryption of $m$ random in the message space ($\mathbb G$ for the standard definition of ElGamal encryption, $\mathbb B$ in the question) generates one-time $y$ uniformly random with $0<y<q$, and computes $c_1=g^y$, $s=h^y=g^{x\,y}$, $c_2=m\,s$. Ciphertext is $(c_1,c_2)$. Decryption recomputes $m$ as $c_1^{q-x}\,c_2$.
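To make the notation concrete, here is a minimal Python sketch of the scheme as described, with the message taken from the base group $\mathbb B=\mathbb Z^*_p$. The toy parameters ($p=23$, $q=11$, $g=2$) and the function names are assumptions chosen for illustration only; they are far too small to be secure.

```python
import secrets

def keygen(p, g, q):
    """Return (x, h) with 0 < x < q and h = g^x mod p."""
    x = 1 + secrets.randbelow(q - 1)
    return x, pow(g, x, p)

def encrypt(p, g, q, h, m):
    """Encrypt m (any element of Z_p^*) under public key h."""
    y = 1 + secrets.randbelow(q - 1)   # one-time secret, 0 < y < q
    c1 = pow(g, y, p)
    s = pow(h, y, p)                   # shared mask s = g^(x*y)
    return c1, (m * s) % p

def decrypt(p, q, x, c1, c2):
    """Recover m as c1^(q-x) * c2 mod p."""
    return (pow(c1, q - x, p) * c2) % p

# Toy parameters (assumption): p = 23, and g = 2 has order q = 11 modulo 23.
p, q, g = 23, 11, 2
x, h = keygen(p, g, q)
for m in range(1, p):                  # every m in the base group decrypts
    assert decrypt(p, q, x, *encrypt(p, g, q, h, m)) == m
```

Decryption works for every $m$ in $\mathbb B$ because $c_1^{q-x}$ cancels the mask $s$ regardless of where $m$ lives; the question is only about what the ciphertext reveals.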

It is known that if $\mathbb B=\mathbb Z^*_p$ with $p$ a large prime and $\mathbb G$ of prime order $(p-1)/2$, there's an information leak of one bit: we can determine from the ciphertext if the plaintext is a quadratic residue or not; but nothing else as far as we know. I'm unsure about the many other cases ($\mathbb G$ of smaller prime order, or not of prime order; other base groups).
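The one-bit leak can be checked numerically. The sketch below uses an assumed toy safe prime $p=23$ with $q=11$ and $g=2$; `is_qr` is a hypothetical helper applying Euler's criterion. Because the mask $h^y$ is always a quadratic residue, $c_2$ is a residue exactly when $m$ is.

```python
import secrets

p, q, g = 23, 11, 2          # toy safe prime p = 2q + 1; g has order q

def is_qr(a, p):
    """Euler's criterion: a is a quadratic residue mod p iff a^((p-1)/2) = 1."""
    return pow(a, (p - 1) // 2, p) == 1

x = 1 + secrets.randbelow(q - 1)
h = pow(g, x, p)

def encrypt(m):
    y = 1 + secrets.randbelow(q - 1)
    return pow(g, y, p), (m * pow(h, y, p)) % p

# The residuosity of c2 equals that of m for every message and every y tried.
leak_matches = all(
    is_qr(encrypt(m)[1], p) == is_qr(m, p)
    for m in range(1, p)
    for _ in range(20)
)
```

An observer needs only the public value $p$ and the ciphertext component $c_2$ to run this test.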

My motivation for that problem came while answering this question.

What do you mean by "semantically, though not under CPA"? Isn't it the same thing? — Daniel, Jan 23 '18 at 23:42
One could avoid randomness with $c_2^q = m^q$, and exactly this much an adversary would learn. Did you mean something more specific? — Vadym Fedyukovych, Jan 24 '18 at 23:56
@Vadym Fedyukovych: without randomness, encryption would be deterministic, enabling a Chosen Plaintext Attack. I'm considering standard ElGamal, with per-message randomness $y$, with a single relaxation: message $m$ is in the base group $\mathbb B$, rather than the subgroup $\mathbb G$. — fgrieu, Jan 25 '18 at 07:03
@fgrieu For any $y$, $(h^y)^q = 1$, so raising $c_2$ to the power $q$ removes the per-message randomness. By the same idea, for any message in the subgroup $\mathbb G$, $m^q = 1$, so nothing leaks. For messages in the base group, $m^q \ne 1$ is possible, which facilitates a Chosen Plaintext Attack as you said. — Vadym Fedyukovych, Jan 25 '18 at 22:28

user94293 · Answer 1 · 2018-01-25T19:24:34.900

Let $p = 2qs + 1$ be a prime. Consider the subgroup $\mathbb{G} = \langle g \rangle$ of order $q$. The public key is $h = g^x \bmod p$ while the private key is $x \in \mathbb{Z}_q$. The encryption of $m$ is given by $(c_1, c_2)$ with $c_1 = g^r \bmod p$ and $c_2 = m \cdot h^r \bmod p$ for a random integer $r \gets \mathbb{Z}_q$. Semantic security mandates that message $m$ belongs to $\mathbb{G}$.

Let $z$ be a generator of $\mathbb{F}_p^*$. So we can write $m = z^M \bmod p$ for some $M \in \mathbb{Z}_{p-1}$.

Since $h^r \in \mathbb{G}$ has order dividing $q$, raising $c_2$ to any multiple of $q$ removes the mask $h^r$ and leaves the corresponding power of $m$.

Raising $c_2$ to the power of $qs$ yields $${c_2}^{qs} \equiv z^{Mqs \bmod (p-1)} \equiv (z^{qs})^{M\bmod 2} \equiv (-1)^{M\bmod 2} \pmod p$$ The value of $(M \bmod 2)$ indicates whether or not message $m$ is a square in $\mathbb{F}_p^*$.

Raising $c_2$ to the power of $2q$ yields $${c_2}^{2q} \equiv z^{M2q \bmod (p-1)} \equiv (z^{2q})^{M\bmod s} \pmod p$$ If $s$ is small or smooth, an attacker can recover the value of $(M \bmod s)$ as the discrete logarithm in $\mathbb{F}_p^*$ of ${c_2}^{2q}$ with respect to base $z^{2q}$.
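The two exponentiations can be tried on toy numbers. This is a sketch under assumed parameters ($q=11$, $s=3$, hence $p=67$); the helper names are hypothetical, and the brute-force search over `range(s)` stands in for the discrete logarithm in the small subgroup of order $s$.

```python
import secrets

q, s = 11, 3
p = 2 * q * s + 1                      # 67, which is prime

def is_generator(z):
    """z generates F_p^* iff z^((p-1)/r) != 1 for each prime r dividing p-1."""
    return all(pow(z, (p - 1) // r, p) != 1 for r in (2, q, s))

z = next(a for a in range(2, p) if is_generator(a))
g = pow(z, 2 * s, p)                   # element of order q

x = 1 + secrets.randbelow(q - 1)
h = pow(g, x, p)

def encrypt(m):
    r = secrets.randbelow(q)
    return pow(g, r, p), (m * pow(h, r, p)) % p

def leak(c2):
    """Return (M mod 2, M mod s) computed from c2 alone."""
    parity = 0 if pow(c2, q * s, p) == 1 else 1
    target, base = pow(c2, 2 * q, p), pow(z, 2 * q, p)
    residue = next(k for k in range(s) if pow(base, k, p) == target)
    return parity, residue

# Ground truth by brute force: M such that m = z^M mod p.
dlog = {pow(z, M, p): M for M in range(p - 1)}
ok = all(leak(encrypt(m)[1]) == (dlog[m] % 2, dlog[m] % s) for m in range(1, p))
```

Together the two residues determine $M \bmod 2s$ by the Chinese remainder theorem when $s$ is odd, as it is in this toy example.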

Let $H \colon \mathbb{G} \to \mathbb{F}_p^*$ be a cryptographic hash function modeled as a random oracle. Semantic security can be met with message space $\mathbb{F}_{p}^*$ by defining the ciphertext as the pair $(c_1, c_2)$ with $c_1 = g^r \bmod p$ and $c_2 = [m \cdot H(h^r \bmod p)] \bmod p$ for a random integer $r \gets \mathbb{Z}_q$. Decryption of $(c_1, c_2)$ is obtained as $m = [c_2 /H({c_1}^x \bmod p)] \bmod p$.
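A minimal sketch of this hashed variant follows. The instantiation of $H$ from SHA-256 reduced into $\{1,\dotsc,p-1\}$ is an assumption made for illustration, not something the answer specifies, and the toy parameters ($p=67$, $q=11$, $g=64$) are likewise assumed.

```python
import hashlib
import secrets

p, q, g = 67, 11, 64                   # toy values: g = 2^6 has order 11 mod 67

def H(k):
    """Hypothetical H: G -> F_p^*, built from SHA-256 (illustrative only)."""
    data = k.to_bytes((p.bit_length() + 7) // 8, "big")
    digest = hashlib.sha256(data).digest()
    return 1 + int.from_bytes(digest, "big") % (p - 1)   # value in 1..p-1

x = 1 + secrets.randbelow(q - 1)
h = pow(g, x, p)

def encrypt(m):
    r = secrets.randbelow(q)
    return pow(g, r, p), (m * H(pow(h, r, p))) % p

def decrypt(c1, c2):
    mask = H(pow(c1, x, p))
    return (c2 * pow(mask, -1, p)) % p  # modular inverse, Python 3.8+

round_trip = all(decrypt(*encrypt(m)) == m for m in range(1, p))
```

The mask $H(h^r)$ is no longer confined to $\mathbb G$, so raising $c_2$ to a multiple of $q$ does not cancel it the way it does in the plain scheme.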

Another option to get semantic security without random oracles is to take $s = 1$, so that $p = 2q+1$ is a safe prime and $\mathbb{G}$ is the group of quadratic residues modulo $p$. The set of valid messages is restricted to $\mathcal{M} = \{1, \dotsc, (p-1)/2\}$. The encryption of a message $m \in \mathcal{M}$ is given by the pair $(c_1, c_2)$ with $c_1 = g^r \bmod p$ and $c_2 = m^2 \cdot h^r \bmod p$ for a random integer $r \gets \mathbb{Z}_q$; since $m^2$ is a square, it lies in $\mathbb{G}$. Decryption of $(c_1, c_2)$ is obtained in two steps as $m^2 = c_2 /{c_1}^x \bmod p$ and then $m$ as the square root (modulo $p$) of $m^2$ in the set $\mathcal{M}$. Note that if $m \in \mathcal{M}$ then $-m \bmod p = p-m \notin \mathcal{M}$, so this square root is unique.
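The squaring trick can be sketched as follows, with assumed toy parameters $p=23$, $q=11$, $g=2$. The square root uses the exponent $(p+1)/4$, which works here because a safe prime $p=2q+1$ with $q$ odd satisfies $p \equiv 3 \pmod 4$; that shortcut is a choice of this sketch, not part of the answer.

```python
import secrets

p, q, g = 23, 11, 2                    # toy safe prime; g generates the squares
half = (p - 1) // 2                    # valid messages are 1..half

x = 1 + secrets.randbelow(q - 1)
h = pow(g, x, p)

def encrypt(m):
    if not 1 <= m <= half:
        raise ValueError("message must lie in {1, ..., (p-1)/2}")
    r = secrets.randbelow(q)
    return pow(g, r, p), (m * m * pow(h, r, p)) % p

def decrypt(c1, c2):
    m2 = (c2 * pow(c1, q - x, p)) % p  # m^2 = c2 / c1^x
    root = pow(m2, (p + 1) // 4, p)    # a square root, since p = 3 mod 4
    return min(root, p - root)         # the root that lies in 1..half

round_trip = all(decrypt(*encrypt(m)) == m for m in range(1, half + 1))
# c2 is always a quadratic residue, so the residuosity test reveals nothing.
always_square = all(pow(encrypt(m)[1], q, p) == 1 for m in range(1, half + 1))
```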

This covers $\mathbb G=\mathbb{F}_p^*=\mathbb{Z}_p^*$ with $p$ prime. Can we say that when $p = qs+1$ with $p$, $q$ and $s/2$ large randomly seeded primes, a single bit leaks? — fgrieu, Jan 25 '18 at 07:15
Yes, in this case a single bit would leak, assuming that DDH holds in both the subgroup of order $q$ and the subgroup of order $s$. Note that this amounts to using ElGamal over a composite order group, something which is done quite often in the literature as composite-order elliptic curves have several nice properties. — Geoffroy Couteau, Jan 25 '18 at 20:18
Let $p = 2qs + 1$. Do you mean taking $\mathbb{G} = \langle g \rangle$ of order $qs$ for large primes $q$ and $s$? Using the above methodology, you have semantic security under DDH (in a composite-order group). — user94293, Jan 25 '18 at 20:19

Geoffroy Couteau · Answer 2 · 2018-01-26T08:51:56.620

In general, if you use $\mathbb{Z}_p^*$ with $p = q\cdot \prod_{i=1}^t p_i + 1$, where the $p_i$ are distinct small prime numbers, then you will have $O(t)$ bits of leakage. So, intuitively, a lot of information can leak if we do this: up to $O(\log p)$ bits with this approach, hence up to a constant fraction of all the bits of your message. As pointed out in the other answer, that's relatively easy to avoid in general though.

EDIT: as pointed out by Vadym Fedyukovych in the comments, a more precise evaluation of the leakage for $p$ of the form above, including the low order terms, is $O(t\log\log p)$.
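As a rough numeric illustration of how leakage accumulates over the small factors, here is a sketch with assumed toy values $q=11$ and small primes $2, 3, 5$, giving $p=331$. For each small prime $p_i$ it recovers the discrete logarithm of $m$ modulo $p_i$ from $c_2$ alone, by brute force in the subgroup of order $p_i$. All names are hypothetical.

```python
import math
import secrets

q, small_primes = 11, (2, 3, 5)
cofactor = math.prod(small_primes)     # 30
p = q * cofactor + 1                   # 331, which is prime

z = next(a for a in range(2, p)
         if all(pow(a, (p - 1) // r, p) != 1 for r in small_primes + (q,)))
g = pow(z, cofactor, p)                # element of order q

x = 1 + secrets.randbelow(q - 1)
h = pow(g, x, p)

def encrypt(m):
    y = 1 + secrets.randbelow(q - 1)
    return pow(g, y, p), (m * pow(h, y, p)) % p

def leaked_residues(c2):
    """For each small prime r, recover (log_z m) mod r from c2 alone."""
    out = {}
    for r in small_primes:
        e = (p - 1) // r               # multiple of q, so the mask vanishes
        target, base = pow(c2, e, p), pow(z, e, p)
        out[r] = next(k for k in range(r) if pow(base, k, p) == target)
    return out

dlog = {pow(z, M, p): M for M in range(p - 1)}
recovered = all(
    leaked_residues(encrypt(m)[1]) == {r: dlog[m] % r for r in small_primes}
    for m in range(1, p)
)
leaked_bits = sum(math.log2(r) for r in small_primes)   # log2(30), about 4.9
total_bits = math.log2(p - 1)                           # about 8.4
```

In this toy case roughly 4.9 of the 8.4 bits identifying $m$ are exposed, which matches the comment's point that each $p_i$ contributes about $\log p_i$ bits rather than one.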

edited Jan 26 '18 at 08:51

answered Jan 25 '18 at 20:18

Geoffroy Couteau

For a small $p_i$, one would recover $(y \pmod{p_i})$ from $c_1^{\frac{p-1}{q p_i}}$ rather than a single bit, followed by $h^{\frac{p-1}{q p_i}}$ and $m^{\frac{p-1}{q p_i}}$. Looks like $\log(p_i)$ bits of $m$ to me. – Vadym Fedyukovych Jan 25 '18 at 22:47
Indeed, I oversimplified a bit by ignoring low order terms. Each $p_i$ is of size bounded by $\log p$, hence the leakage with $t$ such $p_i$ is more precisely $O(t\log\log p)$. But as $t$ can be at most $\log p / \log\log p$, this still gives a maximal leakage of a constant fraction of the bits, which is the largest possible leakage anyway. – Geoffroy Couteau Jan 26 '18 at 08:50
