
I'm wondering if something like a block cipher with a big block size would make a good memory-hard function.

All memory-hard key derivation functions I've seen look more complex than that, which made me question my understanding of memory-hardness.

The following function looks memory-hard to me (a memory tradeoff looks expensive):

size = 16MB / (F() word size in bytes)
rounds = 1024
state = constants ⊕ (key || salt)

for (i = 0; i < rounds * size; i++)
   state[i mod size] = state[i mod size] ⊕ F(state[(i + 1) mod size])

result = state ⊕ key

How memory-hard is this type-I generalized Feistel network key derivation function?

Grant Miller
LightBit

1 Answer


I've been toying around with your function, and I've come to the conclusion that it's not memory-hard. The amount of required memory can be reduced to at most digestsize * 3 * rounds bytes; with a 64-byte digest such as SHA-512 and your 1024 rounds, that is about 192 KiB instead of 16 MB.

The first problem is that the entropy does not avalanche throughout the state, but stays localized. For example, after 1 round the state of the 2nd block depends only on the initial state of the 2nd and 3rd blocks. After 2 rounds it (indirectly) depends only on the initial state of the 2nd, 3rd and 4th blocks. This means that an attacker could precompute the 1024 rounds on the large portion of the state that is derived from the constants alone and would not need per-password memory for that.
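
To see how slowly those dependencies spread, here is a small sketch of my own (not part of your scheme; 8 blocks and 2 rounds are chosen only to keep the output readable) that tracks which initial blocks each block depends on under your update rule:

#!/usr/bin/env python3
# Track which *initial* blocks each block depends on after repeatedly applying
# state[i % blocks] ^= F(state[(i + 1) % blocks]) in increasing order of i
blocks = 8
rounds = 2

deps = [{i} for i in range(blocks)]  # block i starts out depending only on itself
for i in range(rounds * blocks):
    deps[i % blocks] |= deps[(i + 1) % blocks]  # mixing merges the dependency sets

for i, d in enumerate(deps):
    print('block {} depends on initial blocks {}'.format(i, sorted(d)))

# Block 1 (the "2nd block" above) ends up depending only on initial blocks 1, 2
# and 3: each block gains roughly one new dependency per round, so after 1024
# rounds a block depends only on the roughly 1024 blocks that follow it.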

A possible improvement would be to reverse the mixing to state[i mod size] = state[i mod size] ⊕ F(state[(i - 1) mod size]), so that each block depends on the previous one instead of the next, but the fact that you're generating and hashing the blocks in the same order will always make tradeoffs possible. Changing that order, preferably in a data-independent way (e.g. by reversing the order of the blocks after each round), would make such tradeoffs much more difficult.
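
For what it's worth, a round with reversed mixing and data-independent reordering could look roughly like the following sketch (my own illustration, with SHA-512 standing in for F as in the code further down; improved_rounds and xor are made-up names, and this has not been analyzed):

from hashlib import sha512


def xor(a, b):
    return bytearray(x ^ y for (x, y) in zip(a, b))


def improved_rounds(state, rounds):
    # Sketch only: mix each block with the *previous* block, then reverse the
    # block order after every round (a data-independent reordering), so blocks
    # are no longer generated and consumed in the same direction on every pass
    state = list(state)
    n = len(state)
    for _ in range(rounds):
        for i in range(n):
            state[i] = xor(state[i], sha512(state[(i - 1) % n]).digest())
        state.reverse()
    return state

Even with a change like this, a real memory-hardness argument would of course still be needed.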

To illustrate my point, here are two programs: one without and one with a memory tradeoff on your function:

#!/usr/bin/env python3
from hashlib import sha512

blocks = 1000
rounds = 100
# pseudorandom constants created by hashing the literals "0", "1", ... "999"
constants = [sha512(str(i).encode()).digest() for i in range(0, blocks)]


def mix(a, b):
    # The question's F instantiated with SHA-512: returns a XOR SHA-512(b)
    global operations
    operations += 1  # statistics
    b = sha512(b).digest()
    return bytearray(a ^ b for (a, b) in zip(a, b))


def more_memory(constant):
    # Straightforward implementation: all `blocks` blocks stay in memory
    state = list(constant)

    for i in range(0, rounds * blocks):
        state[i % blocks] = mix(state[i % blocks], state[(i + 1) % blocks])

    # Make a checksum for the entire state
    state_hash = sha512()
    for s in state:
        state_hash.update(s)

    return state_hash.hexdigest()[:32]


def less_memory(constant):
    # Memory-tradeoff implementation: each round is a generator that pulls
    # blocks from the previous round on demand, so only a few blocks per
    # round (cur, after, first) exist at any one time
    def perform_round(sequence):
        first = None
        cur = next(sequence)

        try:
            while True:
                after = next(sequence)
                result = mix(cur, after)
                yield result
                if first is None:
                    first = result  # remember the updated first block for the wrap-around below
                cur = after
        except StopIteration:
            pass

        # The last block is mixed with the already-updated first block,
        # exactly as in more_memory
        yield mix(cur, first)

    current = (s for s in constant)
    # Chain one generator per round; nothing is computed until the checksum
    # below pulls values through the whole chain
    for _ in range(0, rounds):
        current = perform_round(current)

    # Make a checksum for the entire state
    state_hash = sha512()
    for s in current:
        state_hash.update(s)

    return state_hash.hexdigest()[:32]

operations = 0
print('Using a lot of memory: {} ({} mixing-operations)'.format(more_memory(constants), operations))
operations = 0
print('Using little memory: {} ({} mixing-operations)'.format(less_memory(constants), operations))

# Prints:
# Using a lot of memory: e58f5665a2c388fc1420d4d1667f83df (100000 mixing-operations)
# Using little memory: e58f5665a2c388fc1420d4d1667f83df (100000 mixing-operations)
Daan Bakker