[llvm] [ADT] Update hash function of uint64_t for DenseMap (PR #95734)
Fangrui Song via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 19 12:40:26 PDT 2024
MaskRay wrote:
Thanks for these pointers!
`DenseMap` extracts low bits from a 32-bit `getHashValue`.
(It would probably be nice to switch to a 64-bit hash, perhaps with a new member function.)
This limits the effectiveness of pure multiplicative hashing. We need one xorshift step, which is done in #95970.
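To make the limitation concrete, here is a minimal sketch, not the code from #95970. The multiplier is the splitmix64 constant, and the shift amount, the function names, and the `bucket` helper are assumptions chosen for illustration. With a pure multiply, the low k bits of the product depend only on the low k bits of the key, so keys that differ only in their high bits land in the same bucket. One xorshift folds the high bits down.

```cpp
#include <cstdint>

// Pure multiplicative step. Carries only propagate upward, so the low bits
// of the result ignore the high bits of x.
constexpr uint64_t mul_only(uint64_t x) { return x * 0xbf58476d1ce4e5b9ULL; }

// Multiply, then one xorshift that folds high bits into the low half.
// (Illustrative constant and shift; assumed, not taken from the patch.)
constexpr uint64_t mul_xorshift(uint64_t x) {
  x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 31;
  return x;
}

// Hypothetical stand-in for bucket selection: truncate the hash to 32 bits,
// then mask with a power-of-two bucket count.
constexpr unsigned bucket(uint64_t h, unsigned num_buckets) {
  return static_cast<unsigned>(h) & (num_buckets - 1);
}
```

With 65536 buckets, the keys `1ULL << 40` and `2ULL << 40` both map to bucket 0 under `mul_only`, but to different buckets under `mul_xorshift`.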
A lot of work can be done to both `Hashing.h` and `DenseMap`.
For example, we could still do a better job at discouraging reliance on the iteration order of DenseMap.
While `LLVM_ENABLE_REVERSE_ITERATION` helps, I had to fix 3 use cases in llvm/ and clang/ in order to change `getHashValue` for `std::pair`.
Incorporating `Hashing.h` into `DenseMap` and switching to `hash_value(42)` or `hash_combine(42, 43)` would mix bits in a better way, but would increase code size and cause some slowdown without clear benefits.
---
I have read some code of carbon-lang/common/hashing.h and absl/hash for integer types and std::pair.
For integer types <= 8 bytes,
* `llvm/include/llvm/ADT/Hashing.h` uses an mxmxm variant, `hash_16_bytes` (Murmur-inspired), that has larger latency and probably better avalanche behavior (though it is likely unnecessarily "strong").
* absl uses a multiply-xorshift `Mix` and uint128 on 64-bit pointer machines.
* Carbon uses a multiply-bswap `WeakMix` using `unsigned _BitInt(128)`.
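As a rough illustration of why a 128-bit product is attractive for mixing, the sketch below widens the multiply and XORs the two halves together, so the well-mixed high half of the product reaches the result. This is a generic folded multiply, not code copied from absl or Carbon. The name `fold_mul` and its parameters are hypothetical, and it assumes a compiler that provides `unsigned __int128` (GCC or Clang).

```cpp
#include <cstdint>

// Illustrative folded multiply: compute the full 128-bit product of x and k,
// then XOR the low and high 64-bit halves.
// Assumes unsigned __int128 is available (GCC/Clang extension).
inline uint64_t fold_mul(uint64_t x, uint64_t k) {
  unsigned __int128 m = static_cast<unsigned __int128>(x) * k;
  return static_cast<uint64_t>(m) ^ static_cast<uint64_t>(m >> 64);
}
```

Unlike a plain 64-bit multiply, the high bits of `x` influence the low bits of the result here, because they contribute to the high half that is folded back in.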
---
Waiting for `Hashing.h` and `DenseMap` improvements would take too long.
To address the immediate needs, this patch might leverage `densemap::detail::mix` for the DenseMapInfo `unsigned long` and `unsigned long long` specializations. @ChuanqiXu9
---
When we are ready to switch more stuff to Carbon style hashing, we can probably use the following multiplication fallback for non-GCC-non-Clang compilers.
```cpp
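#include <cstdint>
#include <utility>

// Portable 64x64 -> 128-bit multiply; returns {low 64 bits, high 64 bits}.
std::pair<uint64_t, uint64_t> mul64(uint64_t a, uint64_t b) {
  uint64_t a0 = a & 0xffffffff, a1 = a >> 32;
  uint64_t b0 = b & 0xffffffff, b1 = b >> 32;
  uint64_t t = a0 * b0;
  uint64_t u = t & 0xffffffff;  // bits 0..31 of the product
  t = a1 * b0 + (t >> 32);
  uint64_t v = t >> 32;  // carry into the high word
  t = (a0 * b1) + (t & 0xffffffff);
  return {(t << 32) + u, a1 * b1 + v + (t >> 32)};
}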
```
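One way such a fallback could be wired in is sketched below. The wrapper name `mul64_wide` and the choice of `__SIZEOF_INT128__` as the feature test are assumptions for illustration, not part of any patch. `mul64_portable` is the same routine as above, renamed so both paths can coexist.

```cpp
#include <cstdint>
#include <utility>

// Same 32-bit-limb routine as above; returns {low 64 bits, high 64 bits}.
inline std::pair<uint64_t, uint64_t> mul64_portable(uint64_t a, uint64_t b) {
  uint64_t a0 = a & 0xffffffff, a1 = a >> 32;
  uint64_t b0 = b & 0xffffffff, b1 = b >> 32;
  uint64_t t = a0 * b0;
  uint64_t u = t & 0xffffffff;
  t = a1 * b0 + (t >> 32);
  uint64_t v = t >> 32;
  t = (a0 * b1) + (t & 0xffffffff);
  return {(t << 32) + u, a1 * b1 + v + (t >> 32)};
}

// Hypothetical wrapper: use the native 128-bit type where the compiler
// advertises it (GCC and Clang define __SIZEOF_INT128__ on 64-bit targets),
// and fall back to the portable routine elsewhere.
inline std::pair<uint64_t, uint64_t> mul64_wide(uint64_t a, uint64_t b) {
#if defined(__SIZEOF_INT128__)
  unsigned __int128 m = static_cast<unsigned __int128>(a) * b;
  return {static_cast<uint64_t>(m), static_cast<uint64_t>(m >> 64)};
#else
  return mul64_portable(a, b);
#endif
}
```

On a compiler with `unsigned __int128`, comparing `mul64_wide` against `mul64_portable` on a few inputs is a cheap sanity check that the fallback agrees with the native product.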
https://github.com/llvm/llvm-project/pull/95734