[PATCH] D40683: Make .gnu.hash section smaller.
Rafael Avila de Espindola via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 30 15:48:51 PST 2017
Can we add a test showing the number of buckets that is selected?
Rui Ueyama via Phabricator <reviews at reviews.llvm.org> writes:
> -// Returns a number of hash buckets to accomodate given number of elements.
> -// We want to choose a moderate number that is not too small (which
> -// causes too many hash collisions) and not too large (which wastes
> -// disk space.)
> -//
> -// We return a prime number because it (is believed to) achieve good
> -// hash distribution.
> -static size_t getBucketSize(size_t NumSymbols) {
> - // List of largest prime numbers that are not greater than 2^n + 1.
> - for (size_t N : {131071, 65521, 32749, 16381, 8191, 4093, 2039, 1021, 509,
> - 251, 127, 61, 31, 13, 7, 3, 1})
> - if (N <= NumSymbols)
> - return N;
> - return 0;
> -}
It turns out that using a prime number was not worth it?
> // Add symbols to this symbol hash table. Note that this function
> // destructively sort a given vector -- which is needed because
> // GNU-style hash table places some sorting requirements.
> @@ -1813,7 +1797,12 @@
> Symbols.push_back({B, Ent.StrTabOffset, hashGnu(B->getName())});
> }
>
> - NBuckets = getBucketSize(Symbols.size());
> + // We chose load factor 4 for the on-disk hash table. For each hash
> + // collision, the dynamic linker will compare a uint32_t hash value.
> + // Since the integer comparison is quite fast, I believe we can make
> + // the load factor even larger. 4 is just a conservative choice.
> + NBuckets = std::max<size_t>(Symbols.size() / 4, 1);
Make the comment impersonal.
LGTM with that.
We can make the hashes bigger again (or use prime numbers) if we get a
benchmark showing a problem. For now I think we should take the space
saving.
Cheers,
Rafael
More information about the llvm-commits
mailing list