
I need to hash some coordinates, and the algorithm I am implementing requires k 3-wise independent hash functions for its mathematical guarantees to hold. But after using k 3-wise independent hash functions (tabulation hashing, which I wrote myself for this), I got fairly bad accuracy. I then decided to use a cryptographic hash function for collision resistance and ended up using SHA-256 in this way:

  • As I need k different hash functions for my purpose, I generated k different seeds and wrote this snippet:
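    # Seed-prefixed SHA-256 of item d, as a hex string
    hex_num = hashlib.sha256((str(self.seed[i]) + str(d)).encode('utf-8')).hexdigest()
    # Keep the first 5 hex digits (20 bits) and reduce modulo the table width
    int_hex = int(hex_num[:5], 16)
    index = int_hex % self.width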

where I want the hashed values to lie in the range 0 to self.width-1. Now I have a few questions:
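For reference, here is a minimal self-contained sketch of the scheme described above. The function names `seeded_index` and `index_probabilities`, the seeds, the width, and the example item are all hypothetical and only for illustration; they are not taken from my actual code. Five hex digits give a 20-bit integer, so if that integer is assumed to be uniform over 2^20 values, the second function computes the exact probability of each index after the modulo step. Under that assumption the probability is exactly 1/width only when width divides 2^20.

```python
import hashlib
from fractions import Fraction


def seeded_index(seed, item, width, hex_digits=5):
    """Map item to [0, width) using SHA-256 with the seed prepended (sketch)."""
    digest = hashlib.sha256((str(seed) + str(item)).encode("utf-8")).hexdigest()
    return int(digest[:hex_digits], 16) % width


def index_probabilities(width, hex_digits=5):
    """Exact Pr[index = v] for each v, ASSUMING the truncated digest is
    uniform over 16**hex_digits values."""
    space = 16 ** hex_digits
    q, r = divmod(space, width)
    # Residues 0..r-1 are produced by q+1 inputs, the remaining ones by q inputs.
    return [Fraction(q + 1, space) if v < r else Fraction(q, space)
            for v in range(width)]


seeds = [11, 22, 33]   # hypothetical seeds, one per hash function
width = 1000           # hypothetical table width
indices = [seeded_index(s, (3, 4), width) for s in seeds]
probs = index_probabilities(width)
```

With these hypothetical numbers, `probs` is slightly non-uniform because 1000 does not divide 2^20: some indices are marginally more likely than others.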

  • Do you really need a hash rather than encryption, under which no two different plaintexts can map to the same ciphertext? The collision probability is given by the birthday paradox. For two items, it is almost zero. How many items do you have in total, so that one can give an almost exact number? If the hash output does not fit the target range, rejection should be applied in order to eliminate the bias. – kelalaka Aug 10 '20 at 20:50
  • Also, what is self.width? – kelalaka Aug 10 '20 at 20:57
  • @kelalaka self.width is originally the width of a table, but you can actually assume that it is the range of output. – Subha Nawer Pushpita Aug 10 '20 at 23:52
  • Also @kelalaka thanks so much, let me give you a bit more detail. The original proofs of what I am trying to do use k different 3-wise independent hash functions. For any such hash function with output range [0, m-1], the outputs are supposed to be uniformly distributed, so Pr(h(j) = some fixed value) = 1/m. I was wondering what this probability would be for SHA-256 when used in the way mentioned in the question. – Subha Nawer Pushpita Aug 11 '20 at 00:00
  • You can edit your question to clarify more. As it stands, "Birthday problem for cryptographic hashing, 101" answers your question. Just plug in the numbers. Can we call it a dupe? – kelalaka Aug 11 '20 at 09:32
  • @kelalaka I was also wondering how to account for the fact that I am using only the first five hex digits of the SHA-256 output. – Subha Nawer Pushpita Aug 13 '20 at 16:19
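
The two suggestions in the comments (rejection to remove the modulo bias, and the birthday estimate for collisions) can be sketched as follows. This is only an illustration under stated assumptions: the function names `unbiased_index` and `collision_probability`, the `|` separator, and the retry counter are hypothetical choices, not something given in the thread, and the birthday formula is the standard approximation 1 - exp(-n(n-1)/(2m)) for n items thrown uniformly into m buckets.

```python
import hashlib
import math


def unbiased_index(seed, item, width, hex_digits=5):
    """Map item to [0, width) without modulo bias, using rejection (sketch)."""
    space = 16 ** hex_digits
    if not 0 < width <= space:
        raise ValueError("width must be between 1 and 16**hex_digits")
    limit = space - (space % width)  # largest multiple of width that is <= space
    counter = 0
    while True:
        # Hypothetical retry scheme: the counter makes each attempt hash a new input.
        data = f"{seed}|{item}|{counter}".encode("utf-8")
        value = int(hashlib.sha256(data).hexdigest()[:hex_digits], 16)
        if value < limit:            # values at or above limit would skew the result
            return value % width
        counter += 1


def collision_probability(n_items, n_buckets):
    """Birthday approximation of Pr[at least one collision], assuming
    n_items values drawn uniformly from n_buckets possibilities."""
    return -math.expm1(-n_items * (n_items - 1) / (2.0 * n_buckets))
```

Note that with five hex digits the number of possible values before the modulo is only 2^20, so the effective number of buckets for the birthday estimate is at most min(2^20, self.width), not 2^256.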
