I know that some hashes, such as MD5 and SHA-1, that were once thought to be safe are now known to be vulnerable to collision attacks. But collisions obviously exist for every hash function, since the space of possible hashes is smaller than the space of possible contents. For example, if one considers all possible files whose size is smaller than or equal to the hash size, there must be some collisions by the pigeonhole principle: there are $2^{n+1} - 1$ bit strings of length at most $n$, but only $2^n$ possible $n$-bit hashes.
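To make the counting argument concrete, here is a minimal sketch in Python. It assumes a toy hash obtained by keeping only the first `n_bytes` bytes of SHA-256 (the helper names `truncated_sha256` and `find_collision` are mine, purely for illustration), and it enumerates every message of at most `n_bytes` bytes until two of them share a hash. It only illustrates the pigeonhole principle on a tiny output size; it says nothing about how hard collisions are to find for a full-length hash.

```python
import hashlib
from itertools import product


def truncated_sha256(data: bytes, n_bytes: int) -> bytes:
    """Toy hash (illustrative assumption): first n_bytes bytes of SHA-256(data)."""
    return hashlib.sha256(data).digest()[:n_bytes]


def find_collision(n_bytes: int):
    """Enumerate all messages of 0..n_bytes bytes and return the first colliding pair.

    There are more such messages than possible n_bytes-long outputs,
    so a collision must appear before the enumeration ends.
    """
    seen = {}  # maps truncated hash -> first message that produced it
    for length in range(n_bytes + 1):
        for symbols in product(range(256), repeat=length):
            msg = bytes(symbols)
            h = truncated_sha256(msg, n_bytes)
            if h in seen:
                return seen[h], msg
            seen[h] = msg
    return None  # not reachable, by the pigeonhole principle


if __name__ == "__main__":
    a, b = find_collision(1)
    print(a.hex(), b.hex(), truncated_sha256(a, 1).hex())
```

With `n_bytes = 1` there are 257 candidate messages (the empty one plus 256 single bytes) and only 256 possible outputs, so the search always returns a pair.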
However, I wonder if I can be sure that hashes will be different for “small enough” differences in contents. For example, for a given hash, can I assume that:
- All contents whose size equals the hash size will have different hashes (so that if $H(m_1) \neq H(m_2)$ then $H(H(m_1)) \neq H(H(m_2))$)?
- All contents smaller than $m$ bits/bytes will have different hashes?
- All contents that differ in fewer than $m$ bits/bytes will have different hashes?
- All contents that differ in fewer than $m$ consecutive bits/bytes will have different hashes?
- Inserting fewer than $m$ bits/bytes into a content will change its hash?
- Inserting fewer than $m$ bits/bytes at the end/beginning of a content will change its hash?
- Anything else?
If any of these assumptions hold, do they still hold when the hash is truncated?
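For what it is worth, the first assumption combined with truncation can be probed empirically on a toy scale. The sketch below is an experiment under my own assumptions, not a statement about any full-length hash: it truncates SHA-256 to `n_bytes` bytes, hashes every message of exactly `n_bytes` bytes, and counts how many distinct outputs appear. The names `truncated_sha256` and `count_distinct` are hypothetical helpers.

```python
import hashlib


def truncated_sha256(data: bytes, n_bytes: int) -> bytes:
    """Toy hash (illustrative assumption): first n_bytes bytes of SHA-256(data)."""
    return hashlib.sha256(data).digest()[:n_bytes]


def count_distinct(n_bytes: int) -> tuple[int, int]:
    """Hash every message of exactly n_bytes bytes with the truncated hash.

    Returns (number of inputs, number of distinct outputs). The two are equal
    only if the truncated hash is injective on inputs of its own size.
    """
    total = 256 ** n_bytes
    outputs = {
        truncated_sha256(i.to_bytes(n_bytes, "big"), n_bytes)
        for i in range(total)
    }
    return total, len(outputs)


if __name__ == "__main__":
    total, distinct = count_distinct(1)
    print(f"{distinct} distinct outputs for {total} inputs")
```

If the function behaved like a random mapping, I would expect noticeably fewer distinct outputs than inputs, which would mean the first assumption fails at this size. Whether anything comparable can be proved or disproved for untruncated SHA-2 or SHA-3 is exactly what I am asking.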
I guess the answers to these questions depend heavily on the chosen hash function. I’m most interested in answers about hashes of the SHA-2 and SHA-3 families, but answers about other hash functions (even MD5 and SHA-1) are welcome as well.