In the BIP39 specification, are seeds generated from encoded or decoded passphrase?

Question

From mnemonic to seed

A user may decide to protect their mnemonic with a passphrase. If a passphrase is not present, an empty string "" is used instead.

To create a binary seed from the mnemonic, we use the PBKDF2 function with a mnemonic sentence (in UTF-8 NFKD) used as the password and the string "mnemonic" + passphrase (again in UTF-8 NFKD) used as the salt. The iteration count is set to 2048 and HMAC-SHA512 is used as the pseudo-random function. The length of the derived key is 512 bits (= 64 bytes).
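For illustration, the derivation quoted above can be sketched in Python using only the standard library. The function name `bip39_seed` and the sample sentence are assumptions made for this sketch, not part of the specification text, and the sketch does no wordlist or checksum validation.

```python
import hashlib
import unicodedata


def bip39_seed(mnemonic, passphrase=""):
    """Derive a 64-byte seed as the quoted passage describes (hypothetical helper)."""
    # Password: the mnemonic sentence, NFKD-normalized, then UTF-8 encoded.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    # Salt: the string "mnemonic" + passphrase, NFKD-normalized, then UTF-8 encoded.
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    # PBKDF2 with HMAC-SHA512, 2048 iterations, 512-bit (64-byte) output.
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


# Sample sentence chosen for illustration only.
words = ("abandon abandon abandon abandon abandon abandon "
         "abandon abandon abandon abandon abandon about")

seed_plain = bip39_seed(words)                 # empty passphrase
seed_protected = bip39_seed(words, "TREZOR")   # same words, different salt

print(len(seed_plain), seed_plain.hex())
print(seed_plain != seed_protected)
```

Note that the input to PBKDF2 here is the text of the sentence itself: the sketch never converts the words back into the bits they encode.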

I assume that "(in UTF-8 NFKD)" means that the encoded characters of the mnemonic sentence are hashed, rather than the original binary data that the sentence encodes. Is this correct? If so, is there a security reason why it was done this way?

One consequence of this is that, if the seed is generated from the actual encoding rather than the original bit sequence, then one cannot encode the same bit sequence into multiple languages to give access to the same private keys. This is a pretty big downside for wallets that provide a language setting. If the user changes their language setting, for example, they cannot retrieve and enter their wallet words in their new language.
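To make the concern concrete, here is a toy sketch. It assumes two made-up four-word dictionaries (`TOY_EN`, `TOY_ES`) with 2 bits per word and no checksum; real BIP39 wordlists have 2048 words and the real encoding appends a checksum. All names below are hypothetical. The same bits are encoded with both dictionaries, and each sentence is then hashed as text in the way the quoted passage describes.

```python
import hashlib
import unicodedata

# Hypothetical toy dictionaries: 4 words each, so every word encodes 2 bits.
TOY_EN = ["apple", "bird", "cloud", "door"]
TOY_ES = ["agua", "barco", "cielo", "dedo"]


def toy_encode(entropy, wordlist):
    """Map each 2-bit group of the entropy to a word (no checksum, unlike BIP39)."""
    bits = "".join("{:08b}".format(byte) for byte in entropy)
    return " ".join(wordlist[int(bits[i:i + 2], 2)] for i in range(0, len(bits), 2))


def toy_decode(sentence, wordlist):
    """Recover the original bytes from a toy sentence."""
    bits = "".join("{:02b}".format(wordlist.index(w)) for w in sentence.split())
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


def seed_from_sentence(sentence, passphrase=""):
    """Hash the sentence text itself, as in the quoted PBKDF2 description."""
    password = unicodedata.normalize("NFKD", sentence).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


entropy = bytes.fromhex("1b")           # bits 00 01 10 11
en = toy_encode(entropy, TOY_EN)        # "apple bird cloud door"
es = toy_encode(entropy, TOY_ES)        # "agua barco cielo dedo"

# Both sentences decode to the same bits...
print(toy_decode(en, TOY_EN) == toy_decode(es, TOY_ES))
# ...but hashing the text gives two unrelated seeds.
print(seed_from_sentence(en) == seed_from_sentence(es))
```

In this toy setup both sentences decode to the same bytes, yet the seeds differ because the words, not the underlying bits, are what gets hashed.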

@Bitcoin I don't mean literally translating the same words to produce the same wallet. I mean that if you have 12 English words as your wallet words, there should be 12 Spanish words that produce that same wallet as well. Those 12 words are not translations of each other at all; they are simply derived from encoding the same binary with dictionaries in different languages. – morsecoder, Aug 11 '15 at 10:51
Right, sorry, I wasn't thinking with that comment and got completely the wrong end of the stick. I don't know why that design decision would have been made; a number of things in the BIP39 design seem to be a little off base. There's a lot of discussion about the shortcomings on GitHub, though ultimately without much in the way of resolution. – Claris, Aug 11 '15 at 11:16
@Bitcoin great link, thanks. Are they imploring Slush to implement an Electrum 2.x algorithm which uses a checksum? I'm checking that my understanding is correct. – Wizard Of Ozzie, Aug 14 '15 at 07:43
I'm not sure of the timeline. I got the impression that Electrum has just deviated and BIP39 isn't going to be changed to fit. – Claris, Aug 14 '15 at 10:53
