What is the function of the secret key “r” in Poly1305?

Question

Poly1305-AES uses two per-connection keys $k$ (for AES) and $r$ (for Poly1305's compression function) and a per-message nonce $n$. I've read the original paper on the implementation, and I think I understand what it does, but not how it works. In particular, if the security of the MAC is dominated by the security of $AES_k(n)$, why must $r$ be kept secret?

In other words: what are the security properties of $Poly1305_r(m)$ by itself, before adding $AES_k(n)$?

$r$ being secret is essential. The only point of $s$ is to prevent an attacker from learning $r$. — CodesInChaos, Dec 02 '13 at 08:50

score 1 · Answer 1 · answered Dec 02 '13 at 02:52

$Poly1305_{{r,s}}(m)$ is a one-time authenticator - it can be used to authenticate only a single message with any given key $(r,s)$ without violating the security guarantees (the violation is immediate - only two authenticated messages with the same key are required to create a forgery according to the nacl docs).

There are two 128 bit key values to this function (commonly combined into a single 256 bit key):

$r$ is the basis for the polynomial evaluation (in a specific format with some bits cleared)
$s$ is a key value that is added at the end to the result of the polynomial evaluation

Both of these values are part of the key - the specification requires that the key (including $r$) is unpredictable, and the size of the set $R$ from which $r$ is sampled does factor in the security proofs, so I would presume that exposing $r$ would compromise the security guarantee to some degree. Given the $2^{106}$ bound on that security, I would treat all parts of the key as secret.
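To make the roles of $r$ and $s$ concrete, here is a minimal Python sketch of the one-time authenticator. It assumes the byte conventions of RFC 8439 (little-endian integers, a 0x01 byte appended to each chunk of up to 16 bytes, a 32-byte key split as $r$ then $s$); the function names are my own, and the code is neither constant-time nor meant for production use.

```python
P = (1 << 130) - 5                          # the prime 2^130 - 5
CLAMP = 0x0ffffffc0ffffffc0ffffffc0fffffff  # bits of r that are allowed to be set


def poly1305_hash(r: int, msg: bytes) -> int:
    """Evaluate the message polynomial at r, then reduce modulo 2^128."""
    acc = 0
    for i in range(0, len(msg), 16):
        # Each chunk of up to 16 bytes becomes one coefficient; the appended
        # 0x01 byte keeps chunks of different lengths from colliding.
        c = int.from_bytes(msg[i:i + 16] + b"\x01", "little")
        acc = (acc + c) * r % P
    return acc % (1 << 128)


def poly1305_mac(key: bytes, msg: bytes) -> bytes:
    """One-time authenticator under a 32-byte key, split as r then s."""
    r = int.from_bytes(key[:16], "little") & CLAMP  # clear the reserved bits
    s = int.from_bytes(key[16:32], "little")
    tag = (poly1305_hash(r, msg) + s) % (1 << 128)  # s is added at the very end
    return tag.to_bytes(16, "little")
```

Note that $r$ only enters through the polynomial evaluation and $s$ only through the final addition, which is exactly the split described in the list above.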

$Poly1305{-}AES_{(r,k,n)}(m)$ extends $Poly1305$ into a general purpose MAC function - i.e. it can be used to authenticate many messages with a single key $(r,k)$ as long as the nonce $n$ is not repeated.
The extension is achieved by replacing the $s$ value in the one-time authenticator with the result of $AES_k(n)$, which produces an unpredictable value for each unique nonce.

The $AES$ part of the calculation can be replaced by another secure cipher - e.g. the nacl library uses $xsalsa20$ to encrypt the nonce, but other block ciphers like Serpent/Twofish etc. will work just as well.
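A rough sketch of that extension: $r$ stays fixed, while $s$ is derived afresh from each nonce. Keyed BLAKE2b from the Python standard library stands in for $AES_k(n)$ here purely so the example runs without third-party packages. That substitution is an assumption of this sketch, not what Poly1305-AES or nacl actually do, and the names `derive_s` and `nonce_mac` are hypothetical.

```python
import hashlib

P = (1 << 130) - 5
CLAMP = 0x0ffffffc0ffffffc0ffffffc0fffffff


def poly1305_hash(r: int, msg: bytes) -> int:
    """Polynomial evaluation at r (RFC 8439 byte conventions assumed)."""
    acc = 0
    for i in range(0, len(msg), 16):
        c = int.from_bytes(msg[i:i + 16] + b"\x01", "little")
        acc = (acc + c) * r % P
    return acc % (1 << 128)


def derive_s(k: bytes, nonce: bytes) -> int:
    """Stand-in for AES_k(n): a keyed PRF producing 16 bytes per nonce."""
    digest = hashlib.blake2b(nonce, digest_size=16, key=k).digest()
    return int.from_bytes(digest, "little")


def nonce_mac(r_bytes: bytes, k: bytes, nonce: bytes, msg: bytes) -> bytes:
    """Many-message MAC: fixed (r, k), fresh s for every unique nonce."""
    r = int.from_bytes(r_bytes, "little") & CLAMP
    s = derive_s(k, nonce)
    return ((poly1305_hash(r, msg) + s) % (1 << 128)).to_bytes(16, "little")
```

The same message authenticated under two different nonces gets two unrelated tags, because the pad $s$ changes even though $r$ does not.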

These things are all true, but it doesn't really answer the question. How is the security of the MAC affected by publishing either r or s, but not both? In detail, how is the hash's resistance to forgery based on the secrecy of both r and s? — Jonathan, Dec 02 '13 at 04:36

score 1 · Answer 2 · answered Dec 02 '13 at 21:07

$Poly1305_{k,r}(N,M)$ is a Carter-Wegman nonce-based MAC, whose security crucially depends on the uniqueness of the nonce $N$ for every message $M$. It is defined as $$ Poly1305_{k,r}(N,M) = f(M,r) + AES_k(N), $$ where $f(M,r)$ is a polynomial in $r$ with coefficients derived from the binary representation of $M$, and $AES_k(N)$ is the encryption of the nonce $N$ under key $k$.

The function $f(M,r)$ alone does not provide any security. Given $f(M_1,r)$ and $f(M_2,r)$ for two distinct $M_1,M_2$, it is easy to recover $r$ and generate a forgery. This is even easier if $r$ is known.
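As an illustration of how easy the recovery is, the sketch below handles the simplest case, where $M_1$ fits in a single 16-byte block, so that $f(M_1,r) = (c \cdot r \bmod p) \bmod 2^{128}$ for a known coefficient $c$. The byte conventions are those of RFC 8439 (an assumption on my part), `recover_r` is a hypothetical helper, and `pow(c, -1, P)` needs Python 3.8 or later.

```python
P = (1 << 130) - 5
CLAMP = 0x0ffffffc0ffffffc0ffffffc0fffffff


def f(msg: bytes, r: int) -> int:
    """The bare polynomial hash, with no pad added."""
    acc = 0
    for i in range(0, len(msg), 16):
        c = int.from_bytes(msg[i:i + 16] + b"\x01", "little")
        acc = (acc + c) * r % P
    return acc % (1 << 128)


def recover_r(m1: bytes, v1: int, m2: bytes, v2: int) -> list:
    """Recover r from v1 = f(m1, r) and v2 = f(m2, r); m1 must be 1..16 bytes."""
    if not 1 <= len(m1) <= 16:
        raise ValueError("this sketch only handles a single-block m1")
    c = int.from_bytes(m1 + b"\x01", "little")
    c_inv = pow(c, -1, P)
    found = []
    # Reduction modulo 2^128 discards at most two bits, so try every lift.
    for k in range(4):
        u = v1 + (k << 128)
        if u >= P:
            continue
        cand = u * c_inv % P
        # Keep candidates that have the clamped form and also explain v2.
        if cand & CLAMP == cand and f(m2, cand) == v2:
            found.append(cand)
    return found
```

Once $r$ is recovered, $f(M',r)$ can be computed for any $M'$, so the bare hash values give no protection at all.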

Therefore, it is the uniqueness of the nonce $N$ (and hence of its ciphertext) that randomizes the MAC value and makes it unpredictable. The importance of $f$ comes from its speed, which is higher than that of AES, whereas the single AES call is independent of the message and its length.

I think you missed the point of the question ... which is "why must $r$ stay secret", not "why do we have a nonce?" — Paŭlo Ebermann, Dec 02 '13 at 21:21
$r$ must stay secret, because it is easy to construct collisions for $f$ if you know $r$. These collisions are forgeries for the MAC. — Dmitry Khovratovich, Dec 02 '13 at 21:41
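A sketch of one such collision, for a message of exactly two full blocks with coefficients $c_1, c_2$, so that $f(M,r) = c_1 r^2 + c_2 r \pmod p$. Adding $d$ to $c_1$ and subtracting $d \cdot r$ from $c_2$ leaves the value unchanged; the loop only searches for a small $d$ that keeps both coefficients in the range of valid full blocks. RFC 8439 byte conventions are assumed and `find_collision` is a hypothetical name.

```python
P = (1 << 130) - 5
CLAMP = 0x0ffffffc0ffffffc0ffffffc0fffffff
LOW, HIGH = 1 << 128, 1 << 129  # a full 16-byte block encodes to [2^128, 2^129)


def f(msg: bytes, r: int) -> int:
    """The bare polynomial hash, with no pad added."""
    acc = 0
    for i in range(0, len(msg), 16):
        c = int.from_bytes(msg[i:i + 16] + b"\x01", "little")
        acc = (acc + c) * r % P
    return acc % (1 << 128)


def find_collision(r: int, msg: bytes):
    """Given r and a 32-byte msg, return a different 32-byte msg with equal f."""
    if len(msg) != 32:
        raise ValueError("this sketch only handles two full blocks")
    c1 = int.from_bytes(msg[:16] + b"\x01", "little")
    c2 = int.from_bytes(msg[16:] + b"\x01", "little")
    for d in range(1, 1 << 16):
        n1 = c1 + d              # perturb the first coefficient
        n2 = (c2 - d * r) % P    # compensate in the second one
        if LOW <= n1 < HIGH and LOW <= n2 < HIGH:
            # Drop the 2^128 marker bit to get back the 16 message bytes.
            return ((n1 - LOW).to_bytes(16, "little")
                    + (n2 - LOW).to_bytes(16, "little"))
    return None
```

Because $f(M',r) = f(M,r)$, any tag $f(M,r) + AES_k(N)$ issued for $M$ is also a valid tag for $M'$ under the same nonce, whatever the value of $AES_k(N)$.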

score 1 · Answer 3 · edited Jan 03 '21 at 01:47

The relevant security property of $\operatorname{Poly1305}_r$ is that it has bounded difference probability—that is, for any distinct messages $x \ne y$ of up to $L$ bytes, and any difference $\delta$, $$\Pr[\operatorname{Poly1305}_r(x) - \operatorname{Poly1305}_r(y) = \delta] \leq 8\lceil L/16\rceil/2^{106},$$ under random choice of $r$. (Here the subtraction is modulo $2^{128}$; internally, Poly1305 works modulo $2^{130} - 5$ and limits $r$ to $2^{106}$ possibilities to enable cheap arithmetic, which accounts for the weird constant factor $8/2^{106}$.)

A forger, given a legitimate message/authenticator pair $(m, a)$ related by $a = \operatorname{Poly1305}_r(m) + s$ for unknown $r$ and $s$, who tries to find a forgery $(m', a')$ with $m' \ne m$ will be thwarted with high probability for any $m'$ and $a'$ because the one-time forgery probability is bounded by the difference probability:

\begin{align*} \Pr&[a' = \operatorname{Poly1305}_r(m') + s \mid a = \operatorname{Poly1305}_r(m) + s] \\ &= \Pr[a' = \operatorname{Poly1305}_r(m') + a - \operatorname{Poly1305}_r(m)] \\ &= \Pr[\operatorname{Poly1305}_r(m') - \operatorname{Poly1305}_r(m) = a' - a] \\ &\leq 8\lceil L/16\rceil/2^{106}. \end{align*}
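The root-counting idea behind this bound can be checked exhaustively in a toy setting. The sketch below uses a polynomial hash over the small prime field of size 251 (my choice, purely for illustration), with no clamping of $r$ and no reduction modulo $2^{128}$, so the toy bound is simply (number of blocks)/251 rather than $8\lceil L/16\rceil/2^{106}$. Messages are modelled as tuples of nonzero coefficients, mirroring how the padding in Poly1305 keeps the leading coefficient nonzero.

```python
TOY_P = 251  # small prime, so every key r can be enumerated


def toy_hash(r: int, coeffs: tuple) -> int:
    """c_1 r^q + c_2 r^(q-1) + ... + c_q r  (mod TOY_P), via Horner's rule."""
    acc = 0
    for c in coeffs:
        acc = (acc + c) * r % TOY_P
    return acc


def worst_case_key_count(x: tuple, y: tuple) -> int:
    """Max over all differences delta of #{r : toy_hash(r,x) - toy_hash(r,y) = delta}."""
    counts = [0] * TOY_P
    for r in range(TOY_P):
        delta = (toy_hash(r, x) - toy_hash(r, y)) % TOY_P
        counts[delta] += 1
    return max(counts)


x = (3, 1, 4)
y = (2, 7, 1)
worst = worst_case_key_count(x, y)
# For x != y the difference is a nonzero polynomial in r of degree at most 3,
# so no single delta can be hit by more than 3 of the 251 keys.
print(worst, "<=", max(len(x), len(y)))
```

A forger's guess $a' - a$ is one particular $\delta$, so in the toy setting it succeeds for at most 3 of the 251 equally likely keys.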

For NaCl crypto_secretbox_xsalsa20poly1305, the story essentially ends here—we derive an effectively independent $r$ and $s$ for each message by the PRF XSalsa20. For Poly1305-AES, the story also involves the Carter–Wegman method (paywall-free) of authenticating $n$ messages with independent random secrets $r, s_1, s_2, \dotsc, s_n$ using a universal hash family like Poly1305, and Shoup's instantiation with a block cipher like AES to derive $s_i = \operatorname{AES}_k(i)$ from a short key $k$ and unique message number $i$. (More background, history, and references.)

Why must $r$ be kept secret? With $r$ an adversary could trivially forge authenticators. For example, given the one-time authenticator $a = \operatorname{Poly1305}_r(m) + s$ on the message $m$, the adversary could compute the one-time pad $s$ used to conceal the hash $\operatorname{Poly1305}_r(m)$ by $s = a - \operatorname{Poly1305}_r(m)$, and then—with full knowledge of the authenticator keys $r$ and $s$—forge the authenticator $a' = \operatorname{Poly1305}_r(m') + s$ for any $m' \ne m$. This attack works no matter how you pick $r$ and $s$, e.g. even if $s = \operatorname{AES}_k(i)$ for some AES key $k$ and message number $i$.
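The attack is short enough to write out. This is a minimal sketch assuming the RFC 8439 byte conventions for $\operatorname{Poly1305}_r$; `recover_s` and `forge` are hypothetical helper names, and the adversary is assumed to know $r$ and to have seen one pair $(m, a)$.

```python
P = (1 << 130) - 5
CLAMP = 0x0ffffffc0ffffffc0ffffffc0fffffff
MASK128 = (1 << 128) - 1


def poly1305_hash(r: int, msg: bytes) -> int:
    """Poly1305_r(msg): polynomial evaluation at r, reduced modulo 2^128."""
    acc = 0
    for i in range(0, len(msg), 16):
        c = int.from_bytes(msg[i:i + 16] + b"\x01", "little")
        acc = (acc + c) * r % P
    return acc & MASK128


def recover_s(r: int, m: bytes, a: int) -> int:
    """Strip the hash off a known tag to expose the pad: s = a - Poly1305_r(m)."""
    return (a - poly1305_hash(r, m)) & MASK128


def forge(r: int, m: bytes, a: int, m_new: bytes) -> int:
    """Build a valid tag for m_new from r and one observed pair (m, a)."""
    s = recover_s(r, m, a)
    return (poly1305_hash(r, m_new) + s) & MASK128
```

Nothing in `forge` depends on how $s$ was generated, which is why deriving it as $\operatorname{AES}_k(i)$ does not help once $r$ is public.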
