Key Derivation Functions vs. Password Hashing Schemes

Question

Key derivation functions, such as HKDF (standardized in RFC 5869), are meant to stretch some initial keying material having enough entropy, like a Diffie-Hellman shared value, into one or more strong cryptographic secret keys.

Password-hashing schemes, such as the PHC winner Argon2, are meant to hash usually low-entropy passwords with the goal of making the hash digest inversion as costly as possible for an adversary with respect to CPU and memory consumption, as well as parallelization.

Is it accurate to say that password-hashing schemes are actually key derivation functions specialized for low-entropy inputs? Or is there some other essential theoretical difference between these two types of cryptographic schemes?

There are two kinds of KDF: the slow, strengthening kind fed by a password (e.g. PBKDF2), and the fast kind that only derives secondary keys from a master key (e.g. HMAC). The second one obviously shouldn't be used for password hashing. – CodesInChaos, Oct 11 '12 at 21:42
So an HMAC is a KDF? How is an HMAC then different from HKDF? – Finlay Weber, Nov 11 '21 at 02:07

score 17 · Answer 1 · edited Oct 02 '20 at 21:44

Key-stretching key derivation functions must produce results that have certain randomness properties, and be very difficult to reverse. Password hashes only need to satisfy the property "difficult to reverse", without randomness requirements. This is why all key-stretching key derivation functions work as password hashes but not the other way around.

Note that there are also key derivation functions that are non-stretching. Stretching functions are inherently slow, and this is necessary for password hashing. Fast key derivation functions such as HKDF are not suitable when the input has low entropy, for example a password, regardless of whether the goal is to derive key material or a password hash.

edited Oct 02 '20 at 21:44

Gilles 'SO- stop being evil'


answered Oct 11 '12 at 23:02

Vitaly Osipov


This is completely backwards. KDFs and password hashes both have randomness and irreversibility requirements. Password hashes, sometimes called password-based KDFs, additionally have parameters to scale how costly they are to evaluate. – Squeamish Ossifrage May 24 '19 at 16:01
@SqueamishOssifrage In this answer, “KDF” meant “KDF that does stretching”, because it was originally written on another question that was solely about key-stretching KDF. That merge was unfortunate, since this answer needed to be rewritten to match the newer question. – Gilles 'SO- stop being evil' Oct 02 '20 at 21:46

Squeamish Ossifrage · Accepted Answer · 2019-05-21T17:27:04.753

8

A key derivation function does a few things:

1. Turn a random bit string with high min-entropy,* the initial key material, into an effectively uniform random bit string.
2. Label the parts of the resulting uniform bit string by purpose for reproducible derivation.
3. With an optional salt, prevent multi-target attacks from saving a factor of $n$ in the cost of attacking one of $n$ targets.

Often, parts (1) and (3) are done separately from part (2) in an extract/expand form, as in, e.g., $\operatorname{HKDF-Extract}(\mathit{salt}, \mathit{ikm})$ which turns a high min-entropy initial key material $\mathit{ikm}$ into an effectively uniform random master key $\mathit{prk}$ with an optional salt, and $\operatorname{HKDF-Expand}(\mathit{prk}, \mathit{info}, \mathit{noctets})$ which derives effectively independent subkeys from a uniform random master key $\mathit{prk}$ labeled by the $\mathit{info}$ parameter. If you already have a uniform random master key to start, you can skip HKDF-Extract and pass it directly on to HKDF-Expand.
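The extract/expand split described above can be sketched with Python's standard library alone. This is a minimal illustration that follows the RFC 5869 construction with HMAC-SHA256; the `shared_secret` value and the `info` labels are placeholders invented for the example, not part of any real protocol.

```python
import hashlib
import hmac
import os

HASH = hashlib.sha256
HASH_LEN = HASH().digest_size  # 32 octets for SHA-256


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract: PRK = HMAC-Hash(salt, IKM)."""
    if not salt:
        # RFC 5869: a missing salt defaults to HashLen zero octets.
        salt = bytes(HASH_LEN)
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, noctets: int) -> bytes:
    """HKDF-Expand: derive noctets of output labeled by info."""
    if noctets > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF-Expand")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < noctets:
        # T(i) = HMAC-Hash(PRK, T(i-1) | info | i)
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:noctets]


# Placeholder standing in for a high min-entropy but nonuniform secret,
# such as a Diffie-Hellman shared value.
shared_secret = os.urandom(32)
salt = os.urandom(16)

prk = hkdf_extract(salt, shared_secret)
enc_key = hkdf_expand(prk, b"example: encryption", 32)
mac_key = hkdf_expand(prk, b"example: authentication", 32)
```

Changing only the `info` label yields an independent-looking subkey from the same `prk`, which is what makes the derivation reproducible by purpose.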

A password hash serves one additional purpose:

4. Cost a lot to evaluate: in time, memory, and parallelism.

This way, even if we can't control the expected number of guesses to find a password, we can control the cost of testing each guess to drive up the expected cost of finding a password.
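The arithmetic behind that statement can be sketched as follows. The function name, the 40-bit password, and both per-guess costs are hypothetical figures chosen only for illustration, and the estimate assumes the secret is drawn uniformly at random, so an attacker searches about half the space on average.

```python
def expected_attack_cost(min_entropy_bits: float, cost_per_guess: float) -> float:
    """Rough expected cost to find a secret drawn uniformly from 2**bits options."""
    expected_guesses = 2 ** (min_entropy_bits - 1)  # about half the space on average
    return expected_guesses * cost_per_guess


# Hypothetical figures: a 40-bit password tested with a fast hash
# (1 ns per guess) versus a tuned password hash (0.1 s per guess).
fast_seconds = expected_attack_cost(40, 1e-9)
slow_seconds = expected_attack_cost(40, 0.1)
```

The number of guesses is the same in both cases; only the per-guess cost, which the defender chooses, changes the total.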

Specifically, password hashes usually do parts (1), (3), and (4), leaving the reproducible labeled derivation of subkeys in (2) to functions like HKDF-Expand. For example, it can actually hurt to use PBKDF2 to generate more than a single block of output: each additional block costs the legitimate user another full run of the iterations, while an attacker testing guesses typically needs only one block. So you should use HKDF-Expand to turn a single master key from PBKDF2 into many subkeys. This particular pathology is fixed in Argon2, but HKDF-Expand may still be more convenient for labeling the subkeys by purpose.

Summary:

If you have a high min-entropy but nonuniform secret like a Diffie–Hellman shared secret, then use HKDF-Extract.
If you have a low min-entropy secret like a password, use Argon2.

Then pass the resulting effectively uniform master key you get out of them through HKDF-Expand to derive subkeys for labeled purposes.
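For the password branch, the whole pipeline might look like the sketch below. Argon2 is not in Python's standard library, so `hashlib.pbkdf2_hmac` is used here purely as a stand-in, restricted to a single SHA-256 block of output; the iteration count, the passphrase, and the labels are assumptions made for the example.

```python
import hashlib
import hmac
import os


def stretch_password(password: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    """Slow step: stand-in for Argon2, asking PBKDF2 for one block (32 octets) only."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)


def hkdf_expand(prk: bytes, info: bytes, noctets: int) -> bytes:
    """Fast step: HKDF-Expand (RFC 5869) with HMAC-SHA256."""
    if noctets > 255 * 32:
        raise ValueError("requested output is too long for HKDF-Expand")
    okm, block, counter = b"", b"", 1
    while len(okm) < noctets:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:noctets]


salt = os.urandom(16)  # stored alongside the derived data, not secret
master_key = stretch_password(b"correct horse battery staple", salt)

# The expensive computation runs once; each subkey costs a few HMAC calls.
enc_key = hkdf_expand(master_key, b"example: encryption", 32)
mac_key = hkdf_expand(master_key, b"example: authentication", 32)
```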

* The min-entropy of a procedure for making a choice is a measure of the highest probability of any outcome; specifically, if, among a finite space of (say) passwords chosen by some procedure, the probability of the $i^{\mathit{th}}$ password is $p_i$, the min-entropy of the procedure is $-\max_i \log_2 p_i$ bits. If the procedure is to choose uniformly at random from $n$ options, the min-entropy of this procedure is simply $\log_2 n$. For example, the diceware procedure with ten words has $\log_2 7776^{10} \approx 129.2$ bits of min-entropy.
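The definition in the footnote is easy to check numerically. In this small sketch the function name and the two toy distributions are made up for illustration; only the diceware figure comes from the text above.

```python
import math


def min_entropy_bits(probabilities):
    """Min-entropy in bits: -log2 of the most likely outcome's probability."""
    return -math.log2(max(probabilities))


# Uniform choice among 8 options: log2(8) bits.
uniform_bits = min_entropy_bits([1 / 8] * 8)

# Skewed choice among 1024 options where one option is picked half the time:
# the single likely outcome dominates, whatever the size of the space.
skewed_bits = min_entropy_bits([0.5] + [0.5 / 1023] * 1023)

# Ten diceware words, each drawn uniformly from a 7776-word list.
diceware_bits = 10 * math.log2(7776)
```

The skewed case shows why min-entropy, rather than the size of the password space, is the relevant measure: an attacker who always guesses the most likely password first succeeds half the time.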

edited May 21 '19 at 17:27

answered May 21 '19 at 15:20

Squeamish Ossifrage


So, you essentially answered my question by "yes". That said, if you look at the respective designs of HKDF and Argon2, the former requires a PRF for the extraction step, while the latter contents itself with a hash function. So, clearly, the extraction phases in both designs do not seem to target the same security requirements. This is something that had not caught my eye during the PHC period. – cryptopathe May 21 '19 at 16:53
Another point that puzzles me is that this answer suggests using Argon2 only as a low-min-entropy extractor, but not as an expander, although Argon2 offers the possibility to generate as many output bytes as necessary. – cryptopathe May 21 '19 at 16:59
HKDF-Extract relies on more than just PRF security of, e.g., HMAC-SHA256, because neither input (salt, IKM) is necessarily uniformly distributed; HKDF-Expand relies only on PRF security. And yes, you can use Argon2 to produce as many output bytes as you want, but it doesn't provide convenient labeling for fast reproducible subkey derivation as a separate step. – Squeamish Ossifrage May 21 '19 at 17:01
Your first sentence is misleading: the HKDF paper (Lemma 2) proves that a PRF is a sufficiently good randomness extractor, so it's incorrect to say that one needs "more than just PRF security", since PRF security is sufficient, but maybe not necessary. – cryptopathe May 21 '19 at 17:11
A PRF may serve as a randomness extractor, but that's not sufficient to justify the security of, e.g., HKDF-Extract(salt, DH(...)), where neither input—not the merely distinct salt, not the nonuniform DH secret—is uniformly distributed and therefore neither input satisfies the criteria of a PRF key. – Squeamish Ossifrage May 21 '19 at 17:13
Also Lemma 2 of the HKDF paper is about random oracles, not about PRFs, which is exactly the point. – Squeamish Ossifrage May 21 '19 at 17:19
According to the HKDF paper, the salt plays the role of the PRF randomness in the extraction phase and is supposed to be "random". There is even a whole section (§3.1) in the RFC that discusses this point and mentions that "even a salt value of less quality (shorter in size or with limited entropy) may still make a significant contribution to the security of the output keying material". So, you are free to claim that Krawczyk's proof of HKDF security is wrong, but I will not follow you on that track ;-) – cryptopathe May 21 '19 at 17:24
If you have a uniform random secret salt, then it satisfies the criteria of the PRF. But if you had that, you wouldn't need HKDF-Extract! So while that may technically satisfy the criteria of some theorem justifying security, it doesn't justify security of HKDF-Extract(salt='', DH(...)). The RFC doesn't discuss PRFs at all, because PRF security is not relevant to HKDF-Extract. Neither does the paper discuss PRF security for HKDF-Extract; it is relevant only to HKDF-Expand. The main practical purpose of the salt is to mitigate multi-target attacks. – Squeamish Ossifrage May 21 '19 at 17:33
OK, got your point; that is actually well discussed in Section 5 of the HKDF paper. Of course, you didn't claim that Krawczyk's proof is incorrect, and I'd like to apologize for my wording in the above comment. So, to summarize, publicly-salted HMAC is actually emulating a (computational) random oracle in the extract phase. Thank you very much for your patience and the explanations! – cryptopathe May 21 '19 at 17:46
OK, so I merged the questions, but please next time answer the old question; it doesn't make too much sense to me to close an old question in favor of the new one. – Maarten Bodewes May 26 '19 at 15:45
This explains the difference between a fast KDF and a key stretching KDF, but not the difference between a key stretching KDF and a password hashing function. – Gilles 'SO- stop being evil' Oct 02 '20 at 21:47
