Questions tagged [hash]

Mathematical function that maps arbitrarily-sized data to fixed-size integers, often used as keys in hash tables or to help ensure data integrity

A hash function maps data structures of any size to fixed-length integers, with the intent that different data map to different integers. Some important kinds of hash functions are

335 questions
9
votes
2 answers

Hash size: Are prime numbers "near" powers of two a poor choice for the modulus?

Cormen et al.'s "Introduction to Algorithms" says the following about the division method hash function $h(k)=k \text{ mod } m$: A prime not too close to an exact power of 2 is often a good choice for $m$. For example, suppose we wish to allocate a…
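As a rough illustration of the division method quoted above, the following Python sketch compares a power-of-two modulus with a small prime. The key set and the moduli 16 and 13 are made-up values chosen for the example, not taken from the question.

```python
def division_hash(k: int, m: int) -> int:
    """Division-method hash: h(k) = k mod m."""
    return k % m


# Hypothetical keys that share their low 4 bits and differ only in higher bits.
keys = [(i << 4) | 0b1010 for i in range(8)]

# m = 16 = 2**4: only the low 4 bits survive, so every key lands in one slot.
slots_pow2 = {division_hash(k, 16) for k in keys}

# m = 13 (a prime): the higher bits also influence the result.
slots_prime = {division_hash(k, 13) for k in keys}

print(slots_pow2)         # a single slot
print(len(slots_prime))   # eight distinct slots for these keys
```

This only shows why an exact power of two is a weak modulus for patterned keys; it does not settle how close to a power of two a prime may safely be.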
7
votes
1 answer

Hashing using Horner’s Rule

When hashing a (key, value) pair where the key is a string, I have seen the following hash function in use: E.g. $c_n + 256c_{n-1} + 256^2c_{n-2} + \dots + 256^{n-1}c_1$, where this represents the string $c_1c_2 \dots c_n$. I understand the equation above can be…
huy
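A minimal Python sketch of how the polynomial in this question can be evaluated with Horner's rule. The table size `m` and the function names are assumptions for the example; the question excerpt does not specify a modulus.

```python
def polynomial_hash(s: str, m: int, base: int = 256) -> int:
    """Direct evaluation: c_1 gets weight base**(n-1), c_n gets weight 1."""
    n = len(s)
    return sum(ord(c) * base ** (n - 1 - i) for i, c in enumerate(s)) % m


def horner_hash(s: str, m: int, base: int = 256) -> int:
    """Same value via Horner's rule: one multiply and one add per character."""
    h = 0
    for c in s:
        # Reducing mod m at every step keeps the intermediate value small.
        h = (h * base + ord(c)) % m
    return h


m = 1_000_003  # hypothetical table size
print(horner_hash("hash", m) == polynomial_hash("hash", m))
```

Horner's rule avoids computing large powers of 256 explicitly, and taking the remainder inside the loop gives the same result as reducing once at the end.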
4
votes
3 answers

Check-digit algorithm that includes characters?

I have a data set that uses a very simple modulo 10 checksum algorithm which ignores alphabetic characters entirely. That wasn't a big deal, as the few alphabetic characters present weren't critical. Changes to the data format are upcoming,…
CoAstroGeek
4
votes
2 answers

Understanding of hash tables

I am currently studying hash tables in an introductory course to computer science. I was taught that a hash table is a data structure that associates a key to an index (a hash table) and then to the value associated to the key. I don't understand why…
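The key-to-index-to-value chain described in this question can be sketched as a small chained hash table in Python. The class name, the bucket count, and the use of Python's built-in `hash` are choices made for this example only.

```python
class ChainedHashTable:
    """Toy hash table: key -> bucket index -> (key, value) pair."""

    def __init__(self, num_buckets: int = 8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key) -> int:
        # The hash function turns the key into a bucket index.
        return hash(key) % len(self.buckets)

    def put(self, key, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key):
        # Only one bucket is scanned, not the whole table.
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        raise KeyError(key)


table = ChainedHashTable(num_buckets=2)
table.put("apple", 1)
table.put("pear", 2)
table.put("apple", 3)
print(table.get("apple"), table.get("pear"))
```

Storing the key next to the value in each bucket is what lets the table resolve collisions, since several keys can map to the same index.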
3
votes
2 answers

Information on Tavori and Dreizin ranged hash function?

While doing some digging around in the GNU implementation of the C++ standard library I came across a section in bits/hashtable.h that refers to a hash function "in the terminology of Tavori and Dreizin" (see below). I have tried without success to…
moarCoffee
3
votes
1 answer

Choosing a non-cryptographic hash function for language with no unsigned integers

I'm implementing a hash table in pure UnrealScript, which only has support for signed 32-bit integers. This means no 64-bit integers and no unsigned integers. I was in the middle of implementing an FNV hash function, but ran into a potential…
Colin Basnett
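For context, here is a hedged Python sketch of 32-bit FNV-1a computed once with unsigned arithmetic and once with values forced into the signed 32-bit range after every step. It assumes the target language wraps on overflow in two's complement; whether UnrealScript does so is not confirmed here. The helper names are made up for the example.

```python
FNV_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME = 0x01000193         # standard 32-bit FNV prime


def to_signed32(x: int) -> int:
    """Emulate two's-complement wraparound into [-2**31, 2**31 - 1]."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def fnv1a_unsigned(data: bytes) -> int:
    h = FNV_OFFSET_BASIS
    for byte in data:
        h = ((h ^ byte) * FNV_PRIME) & 0xFFFFFFFF
    return h


def fnv1a_signed(data: bytes) -> int:
    # Every intermediate value is kept in the signed 32-bit range (assumption:
    # the real language would wrap the same way on overflow).
    h = to_signed32(FNV_OFFSET_BASIS)
    for byte in data:
        h = to_signed32((h ^ byte) * FNV_PRIME)
    return h


data = b"a"
print(hex(fnv1a_unsigned(data)))
print(fnv1a_signed(data) & 0xFFFFFFFF == fnv1a_unsigned(data))
```

Under that wraparound assumption the signed and unsigned versions carry the same 32-bit pattern, because XOR and the low 32 bits of a product do not depend on signedness. Care is still needed when reducing a negative result to a bucket index.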
3
votes
2 answers

Why does this particular hashCode function help decrease collisions?

I just read in Effective Java about the hashCode method: Store some constant nonzero value, say, 17, in an int variable called result. For each significant field f in your object (each field taken into account by the equals method, that is), do…
Maksim Dmitriev
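The recipe this question refers to can be sketched in Python as follows. The multiplier 31 comes from the Effective Java recipe rather than from the truncated excerpt, and the wraparound helper emulates Java's 32-bit `int`; the function names are illustrative.

```python
def to_int32(x: int) -> int:
    """Emulate Java's 32-bit signed int overflow."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def combine_field_hashes(field_hashes) -> int:
    """Start from 17, then fold in each field hash c as result = 31 * result + c."""
    result = 17
    for c in field_hashes:
        result = to_int32(31 * result + c)
    return result


# The same field values in a different order give different results.
print(combine_field_hashes([1, 2]))
print(combine_field_hashes([2, 1]))
```

Multiplying the running result before adding each field makes the outcome depend on field order, which a plain sum of the field hashes would not.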
2
votes
1 answer

Surprisingly high collision rates when hashing a short list with few buckets

I'm trying to help my daughter with her CS assignment on hashing. She has an input list of about 4000 English words, each 5 letters long. The prof has limited her to 4000 output buckets (digests? -- it's been a long time since I did this stuff). And…
SaganRitual
2
votes
1 answer

Universal family of hash functions

How can I prove that a $k$-universal family of hash functions is also a $(k-1)$-universal family? I tried to prove it from the definition of a $k$-universal family of hash functions, but I didn't know how to use the definition. If I prove it, is it necessary that a…
atefsawaed
2
votes
2 answers

Simple pseudorandom split of data

I want to split my data into $n$ approximately equal parts. Which simple hash functions will ensure that the number of $x$ with $h(x)\equiv i\pmod{n}$ is approximately equal for each $i$?
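One possible approach, sketched in Python: hash each item with a well-mixed function and reduce the result modulo $n$. BLAKE2b from `hashlib` is used here only because it is readily available; it is an assumption of this example, not a recommendation drawn from the answers, and the item set is made up.

```python
import hashlib
from collections import Counter


def bucket_of(x, n: int) -> int:
    """Map x to one of n parts via h(x) mod n, with h a 64-bit BLAKE2b digest."""
    digest = hashlib.blake2b(str(x).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % n


# Hypothetical data: the integers 0..9999 split into 4 parts.
counts = Counter(bucket_of(x, 4) for x in range(10_000))
print(sorted(counts.items()))
```

The split is deterministic, so the same item always lands in the same part, and for a well-mixed hash the part sizes should stay close to one quarter each.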
2
votes
3 answers

Hash function floating point inputs for genetic algorithm

I am implementing a genetic algorithm to use as an optimisation algorithm to evolve robots. The robots have certain parameters (represented as floats) which can lie anywhere within a certain range defined for each parameter. My goal is to optimise…
texasflood
2
votes
1 answer

Why isn't a simple multiplication loop with very good avalanche enough to produce well-distributed hash values?

Modern non-cryptographic 32- and 64-bit valued hash functions, for example, lookup3, MurmurHash3 and CityHash, have quite sophisticated loops, each iteration of which includes many multiplications, XORs and rotates. Why is this needed, since there…
leventov
2
votes
1 answer

Sequential numbers to unique-looking numbers

I'm not sure how to word this because I'm not familiar with this, but I'm sure a process like this is rather common. Basically, I've got members signing up for our website, and each one is assigned a normal sequential ID (a MySQL auto_increment ID),…
M Miller
2
votes
1 answer

Proving calculating Minhash

I'm reading about the MinHash technique for estimating the similarity between two sets: given sets $A$ and $B$, $h$ is the hash function and $h_\min(S)$ is the minimum hash of set $S$, i.e. $h_\min(S) = \min_{s \in S} h(s)$. We have the equation: $$ p(h_\min(A) =…
Long Thai
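As background, the standard MinHash result is that two sets share the same minimum hash with probability equal to their Jaccard similarity $|A \cap B| / |A \cup B|$. The Python sketch below checks this empirically; the salted BLAKE2b hash, the example sets, and the number of hash functions are all assumptions made for the illustration.

```python
import hashlib


def salted_hash(item, salt: int) -> int:
    """One member of a family of hash functions, selected by the salt."""
    data = f"{salt}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes: int):
    """For each hash function, keep the minimum hash over the set."""
    return [min(salted_hash(x, salt) for x in items) for salt in range(num_hashes)]


def estimate_jaccard(a, b, num_hashes: int = 400) -> float:
    sig_a = minhash_signature(a, num_hashes)
    sig_b = minhash_signature(b, num_hashes)
    # Fraction of hash functions on which the two minima agree.
    return sum(x == y for x, y in zip(sig_a, sig_b)) / num_hashes


A = set(range(0, 100))
B = set(range(50, 150))
true_jaccard = len(A & B) / len(A | B)
estimate = estimate_jaccard(A, B)
print(true_jaccard, estimate)
```

The estimate is a sample mean over independent hash functions, so its error shrinks roughly with the square root of `num_hashes`.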
2
votes
2 answers

Universal hashing function probability

Can somebody explain the following: (source: fbcdn.net) U is a universe of keys, and H is a finite collection of hash functions mapping U to {0, 1, … , m-1}. I do not understand definition 2, and thus why the number of functions that map x and y to the…
coolchock