[llvm-dev] Default hashing function for integers (DenseMapInfo.h)

Francois Fayard via llvm-dev llvm-dev at lists.llvm.org
Mon Jan 9 02:10:05 PST 2017


I’ve been looking at Chandler Carruth talks about internal data structures for LLVM in order to implement my own library. That’s how I ended up looking at the internals of DenseMap and the default hash functions defined in DenseMapInfo.

I have been very surprised by the hash function used for integers which is hash(k) = 37 * k. This kind of hashing function does not change the lowest bits of k, and if your keys happen to all be a multiple of 8, you end up not using about 90% of the slot as a direct access in an open addressing hash table. By the way, does anyone know the rationale behind using 37?

In order to improve this, it could be nice to use a better hash function such as the Knuth hash which is defined as:

std::size_t hash_value(std::size_t val, int p) {
  assert(p >= 0 && p <= 64);

  const std::size_t knuth = 11133510565745311;
  return (val * knuth) >> (64 - p);

The big advantage of this function is that is shuffles all the bits and does not suffer from the problems given above. It is also quite cheap to compute. Unfortunately, one needs to change the signature of the hash_value function as the hash function in order to take an integer k and hash it into a integer in {0, 1, 2, …, 2^p} is no longer in the form hash(k) % 2^p. It needs to be in the form hash(k, p).

As I don’t know anything about the LLVM source code, here are a few questions:
- Is DenseMap used a lot with integers as a key in LLVM? Do you think it would benefit from a better default hash function?
- Is there any benchmark to run that makes a heavy use of those hash function so I can see if my changes would improve performance?

Best regards,

François Fayard
Founder & Consultant - Inside Loop
Applied Mathematics & High Performance Computing
Tel: +33 (0)6 01 44 06 93
Web: www.insideloop.io

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170109/499c4a39/attachment.html>

More information about the llvm-dev mailing list