[llvm-dev] Default hashing function for integers (DenseMapInfo.h)

Francois Fayard via llvm-dev llvm-dev at lists.llvm.org
Tue Jan 10 01:20:55 PST 2017


> On Jan 10, 2017, at 9:36 AM, Bruce Hoult <bruce at hoult.org> wrote:

Thanks for your thoughts.

> Both are not very sophisticated.
> You should also look at the different MurmurHash versions, and descendants such as CityHash.

I’ll give a try at those MurmurHash functions.

> I've successfully used simply k (i.e. c =1) for both integers and pointers in hash tables with prime number sizes. It works fine, and they usually have non-overlapping ranges in practice. Hashing structures and strings is the more interesting challenge.

Using hash(k) = k when the size of the table is a prime number is OK. The bad news come when you start using 2^k as the size of your hash table and the keys are of the form 2^a k0 + k1.

> But if you're using power of two (or at least non-prime) table sizes then you should munch up the bits a lot more with multipliers with a lot more bits, not just something like 37.
> It may be that these hash tables don't have many entries and don't have many collisions and 37 is fine, and a more expensive hash function would slow it down. Only testing can tell you.

I’ll first try to make things worse such as using hash(k) = k for pointers which should be really bad. See how much difference it does make. 

François Fayard


More information about the llvm-dev mailing list