From Wikipedia (https://en.wikipedia.org/wiki/Digital_signature#How_they_work) and other articles I see that "There are several reasons to sign such a hash (or message digest) instead of the whole document." But those reasons relate to RSA.

Does it make sense to follow this rule for the Ed25519 signature scheme?

Squeamish Ossifrage
  • FYI, Ed25519 is not ECDSA. ECDSA is an archaic badly designed signature scheme that involves elliptic curve cryptography; Ed25519 is a modern well-designed signature scheme that also involves elliptic curve cryptography. See https://blog.cr.yp.to/20140323-ecdsa.html for the technical details of how they are related and different. – Squeamish Ossifrage Feb 08 '18 at 17:34
  • @SqueamishOssifrage I thought EdDSA is ECDSA with Curve25519. – Melab Feb 09 '18 at 00:58
  • EdDSA is an entirely different algorithm. It's an elliptic curve digital signature algorithm, but it bears pretty much no relation to ECDSA. ECDSA is just an overly general name for a very specific standardized algorithm. – SAI Peregrinus Feb 09 '18 at 01:39
  • @Melab Follow the link https://blog.cr.yp.to/20140323-ecdsa.html for the details! Ed25519 is EdDSA with (a twisted Edwards curve birationally equivalent to) Curve25519; ECDSA is a completely different family of cryptosystems altogether. – Squeamish Ossifrage Feb 09 '18 at 02:43

1 Answer

Quick answer: You do not have to hash the inputs to Ed25519 because hashing is already part of Ed25519 itself. If you do hash inputs in advance, you become vulnerable to collisions in the hash function you use.

Longer answer:

This rule does not make sense, period, because hashing is an integral part of a secure signature scheme. It is a design principle to use a hash function to compress a message into an element of a mathematical structure that can be combined with a private key to yield a signature verifiable by the related public key. This was first described by Michael Rabin in 1979. Today, every serious signature scheme involves a hash of the message as a first step, with some variations. Any system or textbook that tries to separate the steps of ‘hashing and then signing’ is dangerous.

Here's an example of an RSA-based signature scheme: A signature on a bit string $m$ under a public key $n$, a large integer with large prime factors, is an integer $s$ such that $$s^3 \equiv H(m) \pmod n,$$ where $H$ is a random function from bit strings to positive integers below $n$.

The hash $H$ is an integral part of this signature scheme because if you didn't use it, and instead used the equation $s^3 \equiv m \pmod n$, then I could trivially forge the signature $s = 0$ on the message $m = 0$, or $s = 2$ on the message $m = 8$. The use of $H$ means I can't use the advanced technique of computing integer cubes on my pocket slide rule to forge signatures, because I don't have a hope of finding an $m$ with any prescribed hash value.
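The forgery described above can be sketched in a few lines of Python. This is only a toy illustration: the modulus `N = 101 * 113`, the helper names, and the use of SHA-256 reduced modulo `N` as a stand-in for the random function $H$ are all assumptions made for the example, not part of any real scheme. With a modulus this small, brute force would defeat the hashed variant too; the point is only that the unhashed equation can be satisfied without the private key.

```python
import hashlib

# Toy parameters (assumed for illustration): real moduli are thousands of bits
# and their prime factors are secret.
P, Q = 101, 113
N = P * Q                             # public key n = 11413
D = pow(3, -1, (P - 1) * (Q - 1))     # private exponent: inverse of 3 mod (p-1)(q-1)


def verify_unhashed(m: int, s: int) -> bool:
    """Broken scheme: accept s on m whenever s^3 == m (mod n)."""
    return pow(s, 3, N) == m % N


def forge_unhashed(s: int):
    """Forge without the private key: pick any s, then declare m = s^3 mod n."""
    return pow(s, 3, N), s


def H(message: bytes) -> int:
    """Stand-in for the random function H: SHA-256 reduced mod n (toy only)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign_hashed(message: bytes) -> int:
    """Signing needs the private exponent: s = H(m)^d mod n."""
    return pow(H(message), D, N)


def verify_hashed(message: bytes, s: int) -> bool:
    """Hashed scheme: accept s on m whenever s^3 == H(m) (mod n)."""
    return pow(s, 3, N) == H(message)


if __name__ == "__main__":
    print(verify_unhashed(0, 0))   # the s = 0, m = 0 forgery from the text
    print(verify_unhashed(8, 2))   # the s = 2, m = 8 forgery from the text
    m, s = forge_unhashed(1234)    # any s works; the "message" is just s^3 mod n
    print(m, s, verify_unhashed(m, s))
    sig = sign_hashed(b"pay Alice 5")
    print(verify_hashed(b"pay Alice 5", sig))
```

In the hashed variant, choosing $s$ first still fixes the value $s^3 \bmod n$, but the forger would then need a message whose hash equals that value, which is the step a good $H$ is supposed to make infeasible.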

In the case of Ed25519, a signature on a bit string $m$ under a public key $A$, a point on an elliptic curve, is (roughly, with details of encoding and cofactors elided; see the Wikipedia article for more detail and references) a pair $(R, s)$ of a point $R$ on the curve and an integer $s$ such that $$[s]B = R + [H(R, A, m)]A,$$ where $B$ is a standard base point on the curve, $H$ is a random function from two curve points and a bit string to an integer below the order of $B$, and $[s]B$ etc. denote scalar multiplication on the curve.
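To see why an honestly generated signature satisfies this equation, here is a short derivation. It assumes the usual EdDSA signing procedure, which the answer does not spell out: the signer holds a secret scalar $a$ with $A = [a]B$, derives a per-message secret scalar $r$, publishes $R = [r]B$, and sets $s = r + H(R, A, m)\,a \bmod \ell$, where $\ell$ (a symbol introduced here) is the order of $B$. Encoding and cofactor details are elided, as above.

$$\begin{aligned}
[s]B &= [r + H(R, A, m)\,a]B \\
     &= [r]B + [H(R, A, m)]\bigl([a]B\bigr) \\
     &= R + [H(R, A, m)]A.
\end{aligned}$$

The message enters only through $H(R, A, m)$, so the hash is applied inside the scheme rather than as a separate step by the caller.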

Hashing a per-signature randomization and the public key together adds some security beyond the simple RSA-based scheme above: hashing in per-signature randomization obviates the need for $H$ to be collision-resistant in practical realizations (the lapse of which in practical systems using MD5 has had serious consequences), and hashing in the public key mitigates multi-target attacks and prevents key malleability, that is, finding two distinct keys under which a signature is valid.
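A rough Python sketch of the collision point follows. Everything in it is an assumption made for illustration: `weak_hash` is SHA-256 truncated to 24 bits so that collisions are cheap to find, and `randomizer` uses HMAC-SHA256 as the secret pseudorandom function that produces the per-message value $r$. It is not Ed25519, which hashes the curve point $R$ rather than $r$ itself; it only illustrates why a precomputed collision in $H(m)$ stops being useful once an unpredictable value is hashed in front of the message.

```python
import hashlib
import hmac
from itertools import count


def weak_hash(data: bytes) -> bytes:
    """Deliberately weak hash (assumed): SHA-256 truncated to 24 bits."""
    return hashlib.sha256(data).digest()[:3]


def find_collision():
    """Birthday search for two distinct messages with equal weak_hash."""
    seen = {}
    for i in count():
        m = b"transaction-%d" % i
        d = weak_hash(m)
        if d in seen:
            return seen[d], m
        seen[d] = m


def randomizer(secret: bytes, message: bytes) -> bytes:
    """Per-message r as a secret pseudorandom function of the message."""
    return hmac.new(secret, message, hashlib.sha256).digest()


def randomized_hash(secret: bytes, message: bytes):
    """Return (r, weak_hash(r || m)); r would be published with the signature."""
    r = randomizer(secret, message)
    return r, weak_hash(r + message)


if __name__ == "__main__":
    m1, m2 = find_collision()
    # Prehash style: both messages feed the same value into the signing math,
    # so a signature on one is also a signature on the other.
    print(m1, m2, weak_hash(m1) == weak_hash(m2))

    # Randomized style: the attacker could not have predicted r1 or r2 when
    # searching for the collision, so the precomputed pair is very unlikely
    # to collide again.
    secret = b"signer-only secret (hypothetical)"
    r1, h1 = randomized_hash(secret, m1)
    r2, h2 = randomized_hash(secret, m2)
    print(h1 == h2)
```

Because `randomizer` is deterministic in the message, signing the same message twice gives the same $r$, so the scheme can remain deterministic while still denying the attacker knowledge of $r$ in advance.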

Squeamish Ossifrage
  • How risky is it to use prehashing? I found this in the last section, on page 5, of http://ed25519.cr.yp.to/eddsa-20150704.pdf: "The main motivation for HashEdDSA is the following storage issue". Well, I have the same motivation. I want to use Ed25519 to sign transactions on a hard wallet, which has memory limitations, and theoretically transactions could be very big. – Taras Shchybovyk Feb 09 '18 at 10:11
  • Three risks of prehashing leap to mind. 1. You become vulnerable to any problems in the prehash, e.g. collisions. 2. The hard wallet loses some opportunity to make sense of what it is signing. 3. You subject the verifier to denial of service by allowing large transactions in your protocol. If the memory constraints demand, and you can avoid all three of these risks, you might sensibly choose to prehash. But is your hard wallet so memory-constrained that it can't keep a single copy of a transaction in memory that you would want it to process? – Squeamish Ossifrage Feb 09 '18 at 15:02
  • Many systems separate the hashing step from the signature step because it's useful to separate the step that may involve a large amount of data but doesn't involve long-term secrets, from the step that uses a long-term secret but can work in a small, bounded amount of memory. That's routine when the second step is performed in a physically-protected environment (e.g. in a smartcard). Could you clarify why it's dangerous? Are you thinking of the risk that the caller won't do the hashing step? That's usually mitigated by having fixed-size input for which hashing is the path of least resistance. – Gilles 'SO- stop being evil' Feb 09 '18 at 18:10
  • @Gilles That's not separating the hashing step from the signature step. That's separating the hashing step from the fancy math step in a signature computation. By design, the fancy math step can't tell whether its input was a hash or not, so it must be abundantly clear in the wiring diagram that the hardware device just does the fancy math to finish making a signature. And if the caller uses a fixed hash function without per-message randomization, there's a danger of collisions, like those used in an international incident of industrial sabotage by the US and Israel against Iran. – Squeamish Ossifrage Feb 09 '18 at 19:07
  • @SqueamishOssifrage Uh? What's the danger of collision if the hash function doesn't do per-message randomization? In all the standard schemes, that's the job of the “fancy math step”. – Gilles 'SO- stop being evil' Feb 09 '18 at 19:31
  • @Gilles If the fancy math only incorporates the message via $\operatorname{MD5}(m)$ like RSASSA-PSS does, then you lose. If, instead, it incorporated the message via $\operatorname{MD5}(r \mathbin\Vert m)$, where $r$ is an unpredictable per-message randomizer published, along with the fancy math output, in the signature, then it wouldn't matter that collisions in MD5 are easy to find—the aggressors in the aforementioned international incident would have had to find some other way to carry out their sabotage than forging certificates with MD5 collisions. – Squeamish Ossifrage Feb 09 '18 at 19:59
  • @SqueamishOssifrage Oh, so it's only to provide robustness in case the hash has collisions? Then I'd hardly rate the usual signature schemes (such as RSA-PSS or deterministic ECDSA) as dangerous. Extra resistance is nice, but not really the job of the signature part of the signature scheme. And it carries the burden that you can't make the signature deterministic, which has its own advantages. – Gilles 'SO- stop being evil' Feb 09 '18 at 21:42
  • @Gilles You can still make the signature scheme deterministic: pick $r$ as a pseudorandom function of $m$ under the private key. This is exactly what EdDSA does, and thus it simultaneously (a) defends against bad entropy sources at signature time—unlike DSA and ECDSA—and (b) relaxes the requirements on the hash. Note that the original RSA-PSS used something more like $H(r \mathbin\Vert m)$; when standardizing it, RSA, Inc., tweaked it to use $H(r \mathbin\Vert H(m))$ (coincidentally, around the same time as Dual_EC_DRBG)—which was actually exploited for international industrial sabotage. – Squeamish Ossifrage Feb 09 '18 at 21:53
  • @SqueamishOssifrage I don't get it. If $r$ is obtained deterministically from the message, how does it protect against attacks on the hash? Instead of forging a collision with $m$, the attacker has to forge a collision with $r||m$. What's the gain? – Gilles 'SO- stop being evil' Feb 09 '18 at 22:18
  • @Gilles The adversary cannot predict $r$ in advance because it is a secret function of $m$. So $H(r \mathbin\Vert m)$ need only be [enhanced] target-collision-resistant (which MD5 is still conjectured to be), whereas with $H(m)$ it must be collision-resistant. – Squeamish Ossifrage Feb 09 '18 at 23:41