
This question comes from an issue raised in another question: Non interactive threshold signature without bilinear pairing (is it possible)?

Is the proposed random oracle model safe when trying to output a distinct and random $m \times G = M$ value?

Doing the interpolation for $t$ compromised shares $m^{'}_i$ results in: $l_0 \times M_0 + \sum^t_{i=1} l_i \cdot m^{'}_i \times G = m \times G$ that reduces to $(m - \sum^{t}_{i=1} l_i \cdot m^{'}_i) \cdot l^{-1}_0 \times G = M_0$, where $M_0$ is always different for each signature. So, I suppose we can't reuse previous values to perform the attack.
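The rearrangement above can be checked numerically. The following is a minimal sketch, not part of the original scheme: it assumes a toy additive group $\mathbb{Z}_q$ standing in for the elliptic-curve group (so $x \times G$ is just `x * G % q`, and the DLP is trivial there), and the names `q`, `G`, `xs` and `lagrange_at_zero` are hypothetical. It only confirms the algebra, not any security property.

```python
import secrets

# ASSUMPTION: toy additive group Z_q replaces the real curve group.
q = 2**61 - 1          # a Mersenne prime, used as the toy group order
G = 5                  # toy "generator"; x × G is modelled as x * G mod q

def lagrange_at_zero(xs, q):
    """Lagrange coefficients l_i for interpolating at x = 0 over Z_q."""
    coeffs = []
    for i, xi in enumerate(xs):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * xj % q
                den = den * (xj - xi) % q
        coeffs.append(num * pow(den, -1, q) % q)
    return coeffs

t = 3
xs = list(range(1, t + 2))                      # index 0 is the honest party
l = lagrange_at_zero(xs, q)
m_shares = [secrets.randbelow(q) for _ in xs]   # m_0 honest, m'_1..m'_t compromised

m = sum(li * mi for li, mi in zip(l, m_shares)) % q
M0 = m_shares[0] * G % q

# (m - sum_i l_i * m'_i) * l_0^{-1} × G should equal M_0
known = sum(l[i] * m_shares[i] for i in range(1, t + 1)) % q
recovered_M0 = (m - known) * pow(l[0], -1, q) % q * G % q
print(recovered_M0 == M0)
```

Note that the check uses $m$ itself, which in the real protocol is exactly the value the attacker does not have.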

How do you solve for a desired $m$ value without solving the DLP? Searching for $m^{'}_i$ and $m$ for some unknown $m_0$ amounts to brute-forcing the DLP, even in the k-sums context!

What I have seen in the k-sums/generalized birthday problem is a way to solve for $x_1 \oplus ... \oplus x_n = 0$. Mapping this approach to our problem, we should try to solve for $x_1 \oplus ... \oplus x_n = m_0$, which is equivalent to $x_1 \oplus ... \oplus x_n \oplus m_0 = 0$. The issue is that $m_0$ has a specific value, but it is unknown to the solver due to the DLP. How can we solve for something we don't know? If such a solution were possible, wouldn't this be solving the DLP?
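To make the k-sums idea concrete, here is a toy sketch of the 4-list XOR case of the generalized birthday approach. It is an illustration under stated assumptions, not the attack from the linked question: the function name `four_list_xor`, the 24-bit values and the list sizes are all hypothetical choices. Note that the solver must be handed `target` explicitly, which is exactly the point raised above about $m_0$ being unknown.

```python
import random
from collections import defaultdict

def four_list_xor(lists, target, low_bits):
    """Find (x1, x2, x3, x4), one per list, with x1^x2^x3^x4 == target."""
    l1, l2, l3, l4 = lists
    mask = (1 << low_bits) - 1
    l4 = [x ^ target for x in l4]   # fold the KNOWN target into the last list

    def join(a, b):
        # keep pairs whose XOR is zero on the low bits
        buckets = defaultdict(list)
        for x in a:
            buckets[x & mask].append(x)
        return [(x, y) for y in b for x in buckets[y & mask]]

    left = defaultdict(list)
    for x1, x2 in join(l1, l2):
        left[x1 ^ x2].append((x1, x2))

    solutions = []
    for x3, y4 in join(l3, l4):
        # a full match means x1 ^ x2 ^ x3 ^ y4 == 0
        for x1, x2 in left.get(x3 ^ y4, []):
            solutions.append((x1, x2, x3, y4 ^ target))
    return solutions

rng = random.Random(1)                      # fixed seed for reproducibility
N_BITS, LOW_BITS, SIZE = 24, 8, 1024        # toy parameters (assumed)
lists = [[rng.getrandbits(N_BITS) for _ in range(SIZE)] for _ in range(4)]
target = rng.getrandbits(N_BITS)

solutions = four_list_xor(lists, target, LOW_BITS)
print(len(solutions) > 0)
```

With these sizes each partial join yields about $2^{12}$ pairs, so many full solutions are expected; real parameters would be far larger.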

I need a mathematical clarification of exactly how this attack is performed.

Edited1: Expanded math proof: Trying to follow @Aman Grewal's logic, let's try to attack in a k-sum scenario.

All variables marked in the form $c^*$ are controlled by the attacker. The attacker's objective is to sign a random $B^*$ for a submitted $B$ such that $B^* \neq B$. The attacker has access to $M_0$ and $c=H(Y||M||B)$ for this or any previous messages. Assume the attacker has knowledge of $t$ shares of $y_i$.

We remove the Lagrange coefficients $l_i$ from the math, since they are public and don't affect the final proof. For a single signature we have:

  1. For a set of randomly selected $m_i^* \times G = M_i^*$ one can derive $\sum_{i=1}^t M_i^* + M_0 = M^*$
  2. Then $c^* = H(Y||M^*||B^*)$ and the output of a single signature is $(m_0 + c \cdot y_0) + \sum_{i=1}^t (m_i^* + c_i^* \cdot y_i) = m^* + c^* \cdot y$, assuming $m_0 + \sum_{i=1}^t m_i^* = m^*$ and $c + \sum_{i=1}^t c_i^* = c^*$ (this last assumption is not totally correct, since we removed the Lagrange coefficients, but the simplified version is even easier to attack)
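For reference, the honest aggregation that the attacker is trying to imitate can be sketched as follows. This is a simplified illustration only: it assumes a toy additive group $\mathbb{Z}_q$ in place of the curve, drops the Lagrange coefficients as in the text, and uses one common challenge $c$ for all parties (unlike the split into $c_i^*$ above). The helper `challenge` and the string encoding of $Y||M||B$ are hypothetical.

```python
import hashlib
import secrets

# ASSUMPTION: toy additive group Z_q; x × G is modelled as x * G mod q.
q = 2**61 - 1
G = 5

def challenge(Y, M, B):
    """c = H(Y || M || B), reduced mod q (the encoding is an assumption)."""
    data = f"{Y}|{M}|{B}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

t = 3
y_shares = [secrets.randbelow(q) for _ in range(t + 1)]   # y_0 .. y_t
m_shares = [secrets.randbelow(q) for _ in range(t + 1)]   # m_0 .. m_t

y = sum(y_shares) % q
m = sum(m_shares) % q
Y = y * G % q
M = m * G % q

c = challenge(Y, M, "message B")

# each party contributes m_i + c * y_i; the sum is m + c * y
partials = [(mi + c * yi) % q for mi, yi in zip(m_shares, y_shares)]
s = sum(partials) % q

# public verification: s × G == M + c × Y
print(s * G % q == (M + c * Y) % q)
```

The forgery question is whether the attacker can produce the same $s$ for a different $B^*$ without ever learning $m_0$ or $y_0$.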

One cannot solve for $c_i^*$ in $\sum_{i=1}^t (m_i^* + c_i^* \cdot y_i) = (m^* + c^* \cdot y) - (m_0 + c \cdot y_0)$, even assuming that $m^*$ is equal to some previous result and that $c^*$ is directly dependent on $c_i^*$. There are $t + 3$ unknowns corresponding to $(c_i^*, y_0, y, m_0)$. So let's expand it to $n$ signatures, indexed by $j$:

The real equation we need to solve is: $\sum_{j=1}^n \sum_{i=1}^t (m_{ij}^* + c_{ij}^* \cdot y_i) = \sum_{j=1}^n [(m_j^* + c_j^* \cdot y) - (m_{0j} + c_j \cdot y_0)]$

Assuming that you can somehow obtain many equalities in this system of equations across signatures $j$, you are still left with $(t + 2) + n$ unknowns for $(c_i^*, y_0, y, m_{0j})$. Every new equation brings a new unknown $m_{0j}$, so the equations never catch up with the unknowns. $m_{0j}$ is distinct for every new signature by the definition of the threat model.

Edited2: Eq public version: The public version of the equation is: $\sum_{j=1}^n \sum_{i=1}^t (M_{ij}^* + c_{ij}^* \cdot Y_i) = \sum_{j=1}^n [(M_j^* + c_j^* \cdot Y) - (M_{0j} + c_j \cdot Y_0)]$

In this case there are only the $c_{ij}^*$ unknowns, but we have the DLP. If there is an efficient way to solve this, are we breaking the DLP?

If anyone can contest this math logic and come up with a successful attack, I will accept your answer.

shumy
  • It seems like two or more attackers can coordinate, wait until all of the parts are received, and then generate 2 (or more) lists of points to select a specific M that satisfies the sums within the group. They still cannot gain access to m0, or solve for m0 without breaking the DLP, but they can select a specific and weak M. – Erik Aronesty Feb 25 '20 at 15:39
  • And, would that really matter? Even if the combined sum is 1, it is as weak as $M_0$. If you strongly produce any $M_n$, the result is still strong. And I believe this applies to other threshold schemes as well. – shumy Feb 26 '20 at 10:03
  • It does matter, because it's an M that they know the solution for. So they still don't know m0 (and cannot derive it), but they know "little m", which is not safe. – Erik Aronesty Mar 09 '20 at 14:33
  • How can they know $m$ without $m_0$ if $l_0 \cdot m_0 + \sum_{i=1}^{t} l_i \cdot m_i^{'} = m$ ? And with $m \times G = M$ you need to break the DLP. – shumy Mar 09 '20 at 14:50
  • Seems to me if that was safe, then you could just roll random numbers for m-i - what is the benefit of using a random oracle? – Erik Aronesty Mar 09 '20 at 16:26
  • @ErikAronesty Yes, and that crossed my mind. As long as you ensure that honest nodes provide random $m_i$ with no repetitions you should be safe. It doesn't have to be the way it is defined right now. – shumy Mar 09 '20 at 16:36
  • So then why is everyone using Pedersen's DKG or precommitments or other complex systems for producing random shared values? Why not just roll random numbers and trust the DLP? OR ... conversely, why not just solve the DLP by rolling random lists of points? Seems like one or the other must be true. Either a simple DKG with a threshold of honest parties is fine.... or the DLP has been broken for some time. – Erik Aronesty Mar 09 '20 at 16:48
  • @ErikAronesty "solve the DLP by rolling random lists of points" - That would be a bummer! I will exclude that one. "a simple DKG with a threshold of honest parties is fine" - You don't know who is honest, and you need a minimum of $t + 1$ to recover the correct values. "why is everyone using ... precommitments" - in general the schemes are different, but... that is what I'm trying to find out. Sometimes the simplest solutions evade us the most. I'm not ruling out that this has a problem, but I need concrete attacks against my proofs. – shumy Mar 09 '20 at 17:08

1 Answer


Attackers can choose their $M_0, m_0$ pair without solving DLP.

In particular, they generate multiple lists of $M, m$ pairs and try to solve for $l_1 \cdot M_1 + l_2 \cdot M_2 + ... + l_n \cdot M_n = M_0$.

In order to solve this, they no longer have $m_i$ for some $i$. The k-sums algorithm is effectively solving $l_1 \cdot M_1 + l_2 \cdot M_2 + ... + l_n \cdot (M_n - M_0) = 0$.

In this way, they can choose the final value ($M_0$), but are unable to recover anyone else's private values (an $m_i$ that the attacker doesn't own).

This shouldn't be an issue for signatures (when computing the nonce) because the signature can never be computed without all the $m_i$. But it won't be secure for other applications.

However, there's another attack, presented in section 4 of https://eprint.iacr.org/2018/417.pdf. This attack relies on multiple parallel signature operations. Note that this attack still works with different messages even though it is only presented with the same message.

Suppose that $i$ indexes the participants and $j$ indexes the messages, so that $M_j$ refers to the nonce of the $j$th message and $M_{ji}$ refers to the $i$th participant's public value used to interpolate for $M_j$. In this attack, the attacker is searching for $M_{ji}$ and $a$ such that $a \cdot \Sigma H(Y||M_j||B) = H(Y||M||B^*)$, where $B^*$ is the message they want to sign.

The generalized birthday attack provides a somewhat efficient way to solve for these $j+1$ unknowns. For example, with a 256-bit hash and 127 parallel signatures, the equation can be solved in $O(2^{47})$, which is significantly less than the complexity of breaking the hash or solving the discrete log.

Aman Grewal
  • What is the point of $H(Y||M_j||B)$ ? The hash is not homomorphic! You may just reduce this to $a \cdot \sum c^{'} = H(Y||M||B^{*})$. But the attack is incomplete; the result should be $l_1 \cdot y_1 \cdot c + \sum l_i \cdot y_i \cdot (a \cdot c^{'}) = c \cdot y$. With 2 unknowns, how can you find $a \cdot c^{'}$ ? – shumy Mar 03 '20 at 10:21
  • So, the correct equation should be $\sum_j [a_j \cdot H(Y||M_j||B)] = H(Y||M||B^{*})$. You can solve it, but how can this be used to forge a signature? Trying to reuse past signatures, I assume the attack is like this $\sum_j a_j \cdot (m_j + c_j \cdot y) = \sum_j (a_j \cdot m_j) + c \cdot y$. It's very unlikely that you can solve $\sum_j a_j \cdot m_j = m$ or $\sum_j a_j \times M_j = M$ ! – shumy Mar 04 '20 at 11:17
  • Disregard my previous comment. The point of including $M_j$ is because you can choose $M_j$, which influences the hash. I updated the post to include the cost of solving the problem. – Aman Grewal Mar 04 '20 at 22:41
  • My last point is still valid. I can't accept the response as it is. – shumy Mar 06 '20 at 10:43
  • If the attacker can choose M0, for which they know the private value, how does that not solve the DLP? – Erik Aronesty Mar 09 '20 at 16:58
  • @ErikAronesty The answer is incorrect (by the definition of my question), attackers cannot choose $M_0$. $m_0$ is selected by the honest party (not accessible by attackers), there's nothing to solve here. The correct k-sum is $l_0 \times M_0 + \sum_{i=1}^{t} l_i \times M_i^{'} = M$ for a known $M$. But you cannot recover the respective $m_i$ or $m$. The attackers need the corresponding $m_i$ to sign for that $M$. – shumy Mar 09 '20 at 17:23
  • @shumy the attacker can choose M0 if they control 2 or more nodes by using K-sums on the points and knowing that the honest party will accept their responses as values in the equation. They can wait for all honest parties to reply (they will know every Mi), then select values to control the final M0. – Erik Aronesty Mar 17 '20 at 18:09