I've been working on Cryptopals set 1, challenge 6 (breaking repeating-key XOR), and have got it working. I found this answer that explains why Hamming distance works, and after some consideration I believe I'm starting to understand it.
One thing that I stumbled on was key size estimation, step 3. I finally found a write-up that scores each candidate key size like this:
    fn score_keysize(keysize, bytes) {
        // take the first four keysize-sized blocks
        let block1 = bytes.getblock(size=keysize)
        let block2 = bytes.getblock(offset=keysize, size=keysize)
        let block3 = bytes.getblock(offset=keysize*2, size=keysize)
        let block4 = bytes.getblock(offset=keysize*3, size=keysize)
        // compare each block with its neighbour: 1-2, 2-3, 3-4
        let distance = hamming(block1, block2) + hamming(block2, block3) + hamming(block3, block4)
        return distance / 3
    }
And this works. I tried a variant, but its results were not nearly as good. In almost all of my attempts, including the one below, the score got lower as the key size grew (with some variance).
    fn score_keysize(keysize, bytes) {
        // take the first four keysize-sized blocks
        let block1 = bytes.getblock(size=keysize)
        let block2 = bytes.getblock(offset=keysize, size=keysize)
        let block3 = bytes.getblock(offset=keysize*2, size=keysize)
        let block4 = bytes.getblock(offset=keysize*3, size=keysize)
        // compare only two disjoint pairs: 1-2 and 3-4
        let distance = hamming(block1, block2) + hamming(block3, block4)
        return distance / 2
    }
Could someone explain why the second variant is so much worse? Or does it depend on the text?
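In case it helps anyone reproduce the comparison, here is a minimal runnable Python sketch of the two variants above. The names `get_block`, `score_adjacent` and `score_pairs` are made up for illustration and are not from the write-up. The `normalize` flag is also my own addition: the pseudocode above does not divide by the key size, while the challenge text asks for the distance to be normalized by KEYSIZE, so the sketch leaves both options open.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Count the differing bits between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def get_block(data: bytes, index: int, keysize: int) -> bytes:
    """Return the index-th block of keysize bytes (hypothetical helper)."""
    return data[index * keysize:(index + 1) * keysize]


def score_adjacent(keysize: int, data: bytes, normalize: bool = True) -> float:
    """First variant: average of blocks 1-2, 2-3 and 3-4."""
    b = [get_block(data, i, keysize) for i in range(4)]
    total = hamming(b[0], b[1]) + hamming(b[1], b[2]) + hamming(b[2], b[3])
    score = total / 3
    # Assumption: dividing by keysize follows the challenge text,
    # not the pseudocode in the question.
    return score / keysize if normalize else score


def score_pairs(keysize: int, data: bytes, normalize: bool = True) -> float:
    """Second variant: average of blocks 1-2 and 3-4 only."""
    b = [get_block(data, i, keysize) for i in range(4)]
    total = hamming(b[0], b[1]) + hamming(b[2], b[3])
    score = total / 2
    return score / keysize if normalize else score
```

Both scoring functions need at least `4 * keysize` bytes of input; with less, the last block comes out short and `hamming` raises a `ValueError`. Calling them with `normalize=False` should match the pseudocode as posted.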