8

I would like to know whether it is possible to determine whether a value of, for example, 256 bits is a SHA-256 hash or a random, uniformly distributed value. Is there any research on this for common hash functions? Can this property be derived from the avalanche effect?

I ask because I would like to include plaintext hash values in a steganographic medium. The stegosystem is based on turning white noise into sentences whose number of words per sentence follows the length distribution that is usual for the language. Encrypting the hash values is not possible, because at that point in the designed protocol no key exchange has been performed yet.

Thank you for all answers or questions related to this problem.

jeteon

2 Answers

18

A SHA-256 hash is, until broken by cryptanalysis, indistinguishable from 256 bits of random noise. The only way to tell the two apart is to enumerate candidate inputs until you find one whose hash matches the value.

If there isn't much entropy in the input (e.g., it's an English word, or it's a value that repeats), it will be relatively simple for an attacker to distinguish it from random noise and possibly even determine the data that was originally hashed.
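A minimal sketch of that enumeration, assuming the attacker can guess the kind of input that was hashed. The word list and the helper name `find_preimage` are made up for illustration; a real attacker would use a much larger dictionary.

```python
import hashlib
import secrets


def find_preimage(value: bytes, candidates):
    """Return the candidate whose SHA-256 digest equals value, or None."""
    for candidate in candidates:
        if hashlib.sha256(candidate.encode("utf-8")).digest() == value:
            return candidate
    return None


# Hypothetical low-entropy input space (assumption for this example).
WORDLIST = ["apple", "house", "secret", "password", "steganography"]

hashed = hashlib.sha256(b"secret").digest()  # hash of a guessable input
noise = secrets.token_bytes(32)              # 256 bits from the OS CSPRNG

# A match proves the value is a hash and reveals the input.
print(find_preimage(hashed, WORDLIST))
# No match tells the attacker nothing: it could be noise or a hash
# of something outside the list.
print(find_preimage(noise, WORDLIST))
```

If the hashed data includes enough unpredictable bits (for example a random nonce), this search space becomes too large to enumerate and the value stays indistinguishable from noise.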

Stephen Touset
  • 1
    Thank you for the fast answer! I rated it positively, but I don't have enough reputation yet. – Julian Bruckner Feb 26 '17 at 23:07
  • 3
    I have to point out that by enumerating inputs you are guaranteed to find that hash, whatever the hash value is – Gianluca Ghettini Feb 27 '17 at 09:04
  • 1
    @GianlucaGhettini I think it's important to note that it might take a while to enumerate all possible inputs. And that the value you find might not be the original value that was put in - at the very least, if you hash 2^256 values producing distinct outputs, the (2^256)+1th value will be a collision. – Matthew Feb 27 '17 at 12:41
  • @GianlucaGhettini True but to add on to Matthew's comment on average it will take you on the order of 5 years of the sun's total energy output to succeed at a brute force attack on a random 256 bit hash. So unless you are part of at least a Kardashev type II civilization you probably don't want to get your hopes up. – DRF Feb 27 '17 at 13:55
  • @DRF that's exactly what I wanted to point out: enumerating the inputs doesn't give any clue about the 256-bit string. It could be the result of a hash or it could be the output of a PRNG ;-) It could even be the password of my password manager app – Gianluca Ghettini Feb 27 '17 at 14:10
  • 1
    @GianlucaGhettini I think it is not necessarily guaranteed that each 256-bit value actually has a preimage ... it could be that some are never hit. (Related: http://crypto.stackexchange.com/q/301/58) – Paŭlo Ebermann Feb 27 '17 at 14:12
  • @PaŭloEbermann True! Another good reason to not enumerate "all" the inputs in order to find out the origin of that 256 bit string :-) – Gianluca Ghettini Feb 27 '17 at 14:17
0

The point of a hash is that the slightest change in the input causes a cascading change in the resulting hash. A hashing algorithm like SHA-256 is designed to behave like a random oracle, so its output should appear to have a high amount of entropy, similar to that of randomly generated numbers.

If you look at some of the applications of SHA-256 today, they depend on the output values being uniformly distributed.
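The cascading change can be observed directly. Here is a small sketch, assuming an arbitrary example message, that flips a single input bit and counts how many of the 256 output bits differ. For a function that behaves like a random oracle, about half of them should change. The helper name `bit_difference` is made up for this example.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Arbitrary example message (assumption for this illustration).
message = bytearray(b"The quick brown fox")

# Copy the message and flip the lowest bit of its first byte.
flipped = bytearray(message)
flipped[0] ^= 0x01

digest_a = hashlib.sha256(message).digest()
digest_b = hashlib.sha256(flipped).digest()

changed = bit_difference(digest_a, digest_b)
print(f"{changed} of 256 output bits changed")
```

This only shows the avalanche effect for one pair of inputs. It is an illustration of the behaviour, not evidence that the output is indistinguishable from random.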

Nik Roby