Hash based code obfuscation?

Question

The code

if(str1 == "abc") {}

can be converted to

if(hash(str1) == 0x8732e) {} // assume hash("abc") == 0x8732e

to obfuscate the code.

But the two are not equivalent when a hash collision occurs, e.g., when another string such as xyz has the same hash value as abc.

That is true theoretically. But is it an issue in real life when the method above is used to obfuscate code? Is it a well-accepted obfuscation method?

Edit: after a little research, I guess what I need is a one-way, collision-free string transformation function, as I don't care about the length of the output. Any suggestions? Any mature implementations?

Also posted on Security.SE. Please do not post the same question on multiple sites. Each community should have an honest shot at answering without anybody's time being wasted. — D.W., Aug 08 '16 at 16:32
Don't worry about collisions, if your hash value is long enough. If you use a cryptographic hash (full length), finding a collision or a second preimage is impossible for all practical purposes. — tylo, Sep 06 '16 at 16:14
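As a rough illustration of the full-length cryptographic hash suggested in the comment above, here is a minimal Python sketch. The choice of SHA-256 and the names EXPECTED_DIGEST and matches_secret are assumptions made for this example, not something from the question; the hex constant is the standard SHA-256 test vector for "abc".

```python
import hashlib
import hmac

# SHA-256 digest of the secret string, computed offline so the literal
# "abc" never appears in the shipped code. (SHA-256 is an assumption here;
# the question does not name a specific hash function.)
EXPECTED_DIGEST = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"


def matches_secret(candidate: str) -> bool:
    """Return True if candidate hashes to the embedded full-length digest."""
    digest = hashlib.sha256(candidate.encode("utf-8")).hexdigest()
    # compare_digest avoids leaking the position of the first mismatch via timing.
    return hmac.compare_digest(digest, EXPECTED_DIGEST)


if __name__ == "__main__":
    print(matches_secret("abc"))
    print(matches_secret("xyz"))
```

With a 256-bit digest, an accidental collision between two real inputs is not a practical concern. Note that this only addresses collisions, not guessing of short inputs.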

score 6 · Accepted Answer · edited Apr 13 '17 at 12:48

If the inputs you're hashing aren't long enough or don't carry enough entropy, the hashes can be inverted by an adversary who is willing to brute-force the entire range of possible inputs.

For example, hashing abc provides little barrier to anyone dedicated to deobfuscating your program, as it takes almost no time to calculate the hashes of all possible 3-character strings.

The adversary can build a table of all inputs up to a certain size and proceed to crack your obfuscated data in negligible time. If your adversary has financial motivation, this probably will not protect your code.
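To make the brute-force and table attacks concrete, here is a small Python sketch. It assumes the obfuscated program uses unsalted SHA-256 and that the secret is at most three lowercase letters; toy_hash, brute_force, and build_table are hypothetical helper names invented for this example.

```python
import hashlib
import itertools
import string


def toy_hash(s: str) -> str:
    """Stand-in for the program's hash (assumed here: unsalted SHA-256)."""
    return hashlib.sha256(s.encode("utf-8")).hexdigest()


def candidates(alphabet: str, max_len: int):
    """Yield every string over alphabet with length 1..max_len."""
    for length in range(1, max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            yield "".join(chars)


def brute_force(target: str, alphabet: str = string.ascii_lowercase, max_len: int = 3):
    """Try every candidate until one hashes to target; None if none does."""
    for candidate in candidates(alphabet, max_len):
        if toy_hash(candidate) == target:
            return candidate
    return None


def build_table(alphabet: str = string.ascii_lowercase, max_len: int = 3) -> dict:
    """Precompute hash -> input once; each later lookup is a dict access."""
    return {toy_hash(c): c for c in candidates(alphabet, max_len)}


if __name__ == "__main__":
    target = toy_hash("abc")      # the constant found in the obfuscated binary
    print(brute_force(target))    # recovers the original string
    table = build_table()         # 26 + 26**2 + 26**3 entries
    print(table.get(target))
```

The search space here is only 18,278 strings, so the strength of the hash function is irrelevant; the weakness is the size of the input space.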

You could opt for additional protection by using slow, salted hashes, which are usually employed to protect passwords. Using a salt can prevent your adversary from precomputing the table before they have access to your code. The downside is that using a slow hash effectively will cause a non-negligible performance hit to your application.
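One possible shape for such a slow, salted check, sketched in Python with PBKDF2 from the standard library. The iteration count, the salt length, and the names slow_hash and matches_secret are assumptions for illustration; in a real build the salt and the expected digest would be generated once offline and embedded as constants, rather than derived from the literal at runtime as done here to keep the example runnable.

```python
import hashlib
import hmac
import os

# Assumed work factor: higher values slow down both the attacker and
# every legitimate comparison the application makes.
ITERATIONS = 100_000


def slow_hash(value: str, salt: bytes) -> bytes:
    """Salted, deliberately slow hash (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", value.encode("utf-8"), salt, ITERATIONS)


# For this sketch only: a real build would embed fixed SALT and EXPECTED
# bytes, and the string "abc" would not appear in the shipped code.
SALT = os.urandom(16)
EXPECTED = slow_hash("abc", SALT)


def matches_secret(candidate: str) -> bool:
    """Compare the candidate's slow hash with the embedded digest."""
    return hmac.compare_digest(slow_hash(candidate, SALT), EXPECTED)


if __name__ == "__main__":
    print(matches_secret("abc"))
    print(matches_secret("abd"))
```

The salt only defeats tables built before the attacker sees the program. Once the salt is known, a three-character secret is still searchable, just at ITERATIONS times the cost per guess.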

Phill Somerville · Answer 2 · 2016-09-06T11:26:15.400

score 0

To avoid the table attack from the example Ella gave above, you would need a hash function that is also a CSPRNG, in order to 'pad' the string. In this manner, an invertible bijection may be constructed in which no two hashes are associated with the same string, i.e., a one-way trapdoor function.

edited Sep 06 '16 at 11:26

answered Sep 05 '16 at 23:20


It'd be nice if you listed or linked which functions qualify as CSPRNGs. I read that one of the older SHA algorithms qualifies as a CSPRNG. I tentatively assume that newer members of the SHA family will have the same properties, even though the underlying algorithms themselves have little in common. Specifically, does SHA3 qualify as a CSPRNG? Generally, how safe is it to assume that a new-and-improved family member (SHA2, SHA3) can be dropped in place of an older one (SHA1, SHA2)? – Mark Jun 07 '17 at 14:48
