7

I want to combine two or more keys to create a single encryption key that relies on all of them. What is the proper method for doing that? Simple XOR? Using hash functions? Something else?

I personally used this: k = md5( key1 || key2 ). Note: || means concatenation. I used md5 because I use 128-bit encryption and thus need a 128 bit output key.

Some other questions arise here for me:

  1. Is using MD5 secure for this specific purpose?

  2. I don't know of any other standard 128-bit output hash functions. It seems newer cryptographic hash functions all have 256-bit and more output lengths. So is using another hash function that has a 256/512-bit output and then truncating the result down to 128 bits secure to an equal or more degree than using MD5?

Note that key1 and key2 are random keys, not passwords, and thus key stretching is not relevant or applicable.

If it is relevant, I generated both keys, and I know that both keys are cryptographically random. Neither key was supplied by an untrusted party. At present both keys happen to be the same length, but I'd prefer a more general solution that does not rely upon this assumption, if possible.

D.W.
H M
  • Do the input keys have known constant length? – CodesInChaos May 02 '12 at 18:12
  • 1
    I would not use md5 for anything to be honest. There are better ways to generate a key. –  May 02 '12 at 18:50
  • 5
    There are no known MD5 weaknesses that have any practical bearing on the strength of this scheme. Still, I'd use something else just so that you don't have to keep explaining that. – David Schwartz May 02 '12 at 19:29
  • @CodeInChaos: currently yes, but maybe not in the future. And generally, it is obvious that a generic/flexible method is much much better. –  May 02 '12 at 19:29
  • @HM I don't quite understand the point... A symmetric cipher's "strength" or "security" is not just tied to the input key but to other aspects as well. The strength of the cipher assumes keys are generated randomly. Deriving a key by combining values (be it concatenating, XORing several random values, or truncating larger hashes down to a smaller size) doesn't increase the computational complexity of the cipher for a fixed key size. For example, however you choose to create a key, if you use AES in ECB mode, you're still subject to the weaknesses of ECB mode. –  May 03 '12 at 00:11
  • Actually, I used this in my PHP high-security registration and login project to encrypt the contents of session files on the server. The encryption uses a shared key generated automatically at installation time plus a random key generated per client browser session (it has HMAC too). The two keys are combined using md5 to form a single encryption key. This way, an attacker who steals the session ID on the server side still can't impersonate the user. –  May 03 '12 at 05:26
  • I forgot to mention that the second key is stored in a cookie on the client side. –  May 03 '12 at 05:33
  • related: http://security.stackexchange.com/a/4801/3120 – akira May 03 '12 at 10:08
  • 1
    Are both keys chosen by you, or can one be chosen by an untrusted party? Are key1 and key2 of sufficient length to avoid one being brute forced if the other is known? – MZB May 03 '12 at 23:33
  • Yes, both keys are chosen by the program and both have sufficient number of permutations (more than 128 bits of entropy). – H M May 04 '12 at 05:35

4 Answers

6

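If the keys have a constant, known length, I'd concatenate them and then apply SHA-256. If they have variable length, a separation mechanism such as prefixing each key with its length is useful, so that two different sets of keys cannot produce the same concatenated input.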

Truncating hash functions works well. If the original hash function is good, a truncated hash function has the same properties, albeit at a correspondingly lower security level. Truncating SHA-256 is certainly better than using MD5.

I recommend something like:

Truncate(SHA-256(output-size || number-of-keys || sizeof(key1) || key1 || sizeof(key2) || key2 ...), output-size) where output-size <= 256
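As a rough illustration, the construction above could be sketched in Python as follows. The function name `combine_keys`, the choice of 4-byte big-endian integers for every length field, and measuring `output_size` in bytes rather than bits are assumptions made for this sketch; the answer itself does not specify an encoding.

```python
import hashlib
import struct


def combine_keys(keys, output_size=16):
    """Derive output_size bytes that depend on every key in `keys`.

    Hypothetical helper: the field encoding (4-byte big-endian lengths)
    is an assumption of this sketch, not part of the original answer.
    """
    if not 1 <= output_size <= 32:
        raise ValueError("output_size must be between 1 and 32 bytes")
    if not keys:
        raise ValueError("at least one key is required")

    h = hashlib.sha256()
    # Hash the output size and key count first, so the same keys give
    # unrelated results for different output sizes.
    h.update(struct.pack(">I", output_size))
    h.update(struct.pack(">I", len(keys)))
    for key in keys:
        # The length prefix keeps (b"ab", b"c") distinct from (b"a", b"bc").
        h.update(struct.pack(">I", len(key)))
        h.update(key)
    # Truncate the 32-byte digest to the requested size.
    return h.digest()[:output_size]
```

With the default `output_size=16`, the result is a 128-bit key suitable for a 128-bit cipher, and the keys may have different lengths.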

CodesInChaos
2

If I understand your question correctly, you have $n$ keys, $K_0, \dots, K_{n-1}$, and you want to derive a key $M$ such that:

  • $M$ is 128 bits (16 bytes) in size.
  • $M$ is derived using a deterministic algorithm.
  • $M$ cannot be derived without knowledge of every $K_x$.

If every $K$ is 128 bits in size:

$M = K_0 \oplus K_1 \oplus \dots \oplus K_{n-1}$

If every $K$ is smaller than 256 bits in size:

$K_x = K_x \,\|\, [\texttt{0x00} \times (32 - \mathrm{length}(K_x))]$

$K_x = H_{\text{SHA-256d}}(K_x)$

$K_x = \mathrm{truncate}(K_x, 16)$

$M = K_0 \oplus K_1 \oplus \dots \oplus K_{n-1}$

If every $K$ is larger than 256 bits in size:

$K_x = H_{\text{SHA-256d}}(K_x)$

$K_x = \mathrm{truncate}(K_x, 16)$

$M = K_0 \oplus K_1 \oplus \dots \oplus K_{n-1}$
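A minimal Python sketch of the three cases above might look like this. It assumes SHA-256d means SHA-256 applied twice, that lengths are counted in bytes (32 bytes = 256 bits), and that the rule is applied per key; keys of exactly 32 bytes are hashed without padding, a case the answer does not cover. The names `sha256d`, `normalize` and `xor_combine` are hypothetical.

```python
import hashlib
from functools import reduce


def sha256d(data: bytes) -> bytes:
    """Double SHA-256 (assumed meaning of SHA-256d)."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def normalize(key: bytes) -> bytes:
    """Map a key of any length to exactly 16 bytes."""
    if len(key) == 16:
        # Already 128 bits: use as-is.
        return key
    if len(key) < 32:
        # Shorter than 256 bits: pad with zero bytes up to 32 bytes.
        key = key + b"\x00" * (32 - len(key))
    # Hash, then truncate the digest to 16 bytes.
    return sha256d(key)[:16]


def xor_combine(keys) -> bytes:
    """XOR the normalized keys together into one 128-bit key M."""
    if not keys:
        raise ValueError("at least one key is required")
    parts = [normalize(k) for k in keys]
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), parts)
```

Note that XOR is its own inverse, so passing the same key twice cancels it out; this is one reason the scheme is only appropriate when all keys are independent and trusted.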

Chris Smith
  • 2
    Why do you pad K if it is smaller than 256 bits? – H M May 05 '12 at 04:33
  • While XOR would work in this particular case, it is less robust: for example, it becomes vulnerable if any of the keys are supplied by an untrusted party. Therefore, I prefer @CodesInChaos's method of hashing the concatenation of the keys. – D.W. Jan 10 '13 at 02:29
0

If you have a cryptographic key (K), you can split it into any number of parts (P) using XOR. Generate parts P1 through Px-1 at random, each as long as K, then compute the final part:

Px = P1 XOR P2 XOR P3 XOR ... XOR Px-1 XOR K

Then destroy the key K, and distribute all parts P.

To retrieve K, just XOR P1 through Px together; the result is the original key K.
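The split-and-recover procedure can be sketched in Python roughly as follows. The helper names `split_key` and `recover_key` are hypothetical, and the sketch assumes the first x-1 parts are drawn from a cryptographically secure random source with the same length as K.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, n: int) -> list:
    """Split `key` into n parts; all n are needed to rebuild it."""
    if n < 2:
        raise ValueError("need at least two parts")
    # P1 .. P(n-1) are random and independent of the key.
    parts = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    # Pn = P1 XOR P2 XOR ... XOR P(n-1) XOR K
    last = key
    for part in parts:
        last = xor_bytes(last, part)
    return parts + [last]


def recover_key(parts) -> bytes:
    """XOR all parts together to recover the original key."""
    key = bytes(len(parts[0]))  # all-zero starting value
    for part in parts:
        key = xor_bytes(key, part)
    return key
```

For example, `recover_key(split_key(key, 4))` returns `key`, while any three of the four parts alone reveal nothing about it.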

Sources:

http://www.nd.edu/~cseprog/proj02/cryptogrophy/final.pdf

http://users.telenet.be/d.rijmenants/en/secretsplitting.htm

Petey B
  • I don't want to split a key into several parts. I want to combine several independent keys into one. This is not secret sharing. –  May 02 '12 at 19:31
  • @H M: replace 'parts' with 'subkeys'. Combining several subkeys (parts) into one final key is exactly what you are doing right now. Petey B proposes computing (sk1 ^ sk2 ^ sk3) instead of concatenating the (sub)keys and then applying md5 to the result ... – akira May 04 '12 at 05:53
-2

The problem with hash functions such as md5 and sha* is that they were not designed to solve the problem you have: creating a cryptographically hard key for further use. They were designed to calculate hashes very fast without too high a chance of hash collisions.

What you really want to use is a so-called key derivation function. Examples of such functions are:

What kind of secrets you pipe into that function to get the actual key is pretty much up to you. The main point of such functions is that computing the key takes a relatively long time, which slows down brute-force attacks quite a bit.

akira
  • 3
    "Note that key1 and key2 are random keys, not passwords, and thus key stretching is not a relevant and applicable thing." – CodesInChaos May 03 '12 at 09:44
  • @CodeInChaos: the keystretching part is not the point of using a better kdf than "md5" ... – akira May 03 '12 at 09:48
  • 1
    Key stretching is pretty much the only point of using slow KDFs over cheap hashes such as SHA-2. If the input has sufficient entropy, the slowdown isn't necessary, since guessing the input is already infeasible. – CodesInChaos May 03 '12 at 09:50
  • @CodeInChaos: no. – akira May 03 '12 at 09:51
  • What benefits does a slow KDF (PBKDF2 & co.) provide over a fast one (plain SHA-2 if no salt is required, HMAC-SHA-2 if a salt is required) if the input has sufficient entropy? I don't see any. – CodesInChaos May 03 '12 at 10:01
  • @CodeInChaos: read the papers. and .. my last sentence in my answer. in short: time. – akira May 03 '12 at 10:03
  • It has been a while since I read those papers, but from what I remember they talked about low entropy sources, such as passwords, where key-strengthening is required. Whereas for inputs with sufficient entropy(say 128 bit or more) deliberately slowing down the hashing offers no benefit, since the attack this protects against is already infeasible. – CodesInChaos May 03 '12 at 10:09
  • then reread them. the main point (again) is time. they make calculating / brute forcing the input hard hard hard by picking NOT fast code paths but slow ones. and mess around with the cpu cache. etc. all to slow the brute forcing down and making it thus more expensive. your argument about having totally awesome key1 and key2 parts is nil since that is irrelevant for brute-forcing md5(key1||key2). or sha2(key1||key2). – akira May 03 '12 at 10:12
  • http://crypto.stackexchange.com/questions/400/why-cant-one-implement-bcrypt-in-cuda – akira May 03 '12 at 10:25
  • 1
    @akira: What is the process for what you mean by "brute-forcing md5(key1||key2) or sha2(key1||key2)"? –  May 03 '12 at 11:10
  • The attack target here is the key that results from md5() or sha() or hash() or, in general, kdf(), because that is what is used to encrypt the content. Thus, you do not attack key1 or key2 but rather what falls out of kdf(). To slow down a brute-force attack on that output, you need a good kdf() that makes brute forcing harder / slower. For the attacker it is irrelevant whether the pair is ("foo"|"bar") or ("fo"|"obar") or any other combination. – akira May 03 '12 at 12:07
  • If one just tries each possible output (from md5() or sha() or kdf() or something else) then the time needed to compute kdf() would have no effect, since one would not be computing kdf(). –  May 03 '12 at 20:58
  • AFAIK key stretching is needed because passwords chosen by humans almost always have much lower entropy than automatically generated random keys. For automatically generated random keys with sufficient entropy (I think 128-bit strength is still enough for most ordinary purposes), the cipher's designers have already accounted for brute-force attacks when choosing the key size. Isn't that so? – H M May 04 '12 at 05:53
  • @RickyDermer: mhh. true. such things as rainbow-tables etc are "just" reordering the complete list of (128-bit-)numbers to have more likely "keys" upfront for testing the crypted content against. the attacker in OP's case then has to test against all 2^128 cases anyway. – akira May 04 '12 at 06:00
  • @HM: the point was not the key stretching part. It's obvious that a good kdf() does that. The point of the kdf() functions mentioned in my answer is that they are much, much harder to compute (in terms of memory and CPU cycles) than easy-to-implement-in-hardware functions like md5() and sha*(). So using one of the kdf() functions from my answer is not needed in your case .. but it won't hurt (and thus is not 'wrong'). – akira May 04 '12 at 06:02
  • I use key stretching for password hashing. Its cost is tolerable for a web site because register and login requests are only a (tiny) fraction of a web site's traffic. But using key stretching for other purposes may decrease overall performance considerably. Using 256-bit keys and encryption seems much better in this regard and results in comparable or indeed higher security. – H M May 04 '12 at 06:46
  • If you use 2^16 iterations for key stretching, you are adding just that number of bits to the security: 128+16=144. But if you use 256-bit keys and encryption, you achieve a much higher strength at a much lower cost, I think. – H M May 04 '12 at 06:56
  • @HM: if you do those 2^16 iterations with md5(), then you have slowed down the attacker just a tiny bit .. just because md5() and sha*() are designed to run fast. With the above-mentioned kdf() you do the key stretching as well .. but in a way that is really costly. The advantage of the mentioned kdf() is not key stretching; it is the increased cost of calculating the key. And again: in your case you would use kdf() to create 'key1' and 'key2' based upon human passwords; if you create them out of a sufficient pool of entropy, you won't need that anymore. – akira May 04 '12 at 07:18