[llvm-dev] RFC: Improving performance of HashString

Scott Smith via llvm-dev llvm-dev at lists.llvm.org
Fri Apr 28 21:21:19 PDT 2017

I wonder if a less invasive change would be to simply make StringMap take a
hash function as a template parameter, so that different hash functions
could be plugged in for different uses.

Clang is more likely to have shorter symbols (all my loop variables are
called "i" ;-), while lldb has longer symbols
template param_2, template_param_3<has_yet_another_template_param,
or_two>>::......), so each has a different set of tradeoffs.


On Fri, Apr 28, 2017 at 3:51 PM, Bruce Hoult <bruce at hoult.org> wrote:

> According to...
> https://github.com/rurban/smhasher/blob/master/README.md
> Bernstein has quality problems (while xx is as good as you get in a
> non-crypto hash), and xx is 7x (32 bit) - 12x (64 bit) faster.
> That's on long strings. It would be worth checking the startup overhead
> for typically short identifiers in programs.
> See later on in the README:
> "When used in a hash table the instruction cache will usually beat the CPU
> and throughput measured here. In my tests the smallest FNV1A beats the
> fastest crc32_hw1 with Perl 5 hash tables. Even if those worse hash
> functions will lead to more collisions, the overall speed advantage beats
> the slightly worse quality. See e.g. A Seven-Dimensional Analysis of
> Hashing Methods and its Implications on Query Processing for a concise
> overview of the best hash table strategies, confirming that the simpliest
> Mult hashing (bernstein, FNV*, x17, sdbm) always beat "better" hash
> functions (Tabulation, Murmur, Farm, ...) when used in a hash table.
> The fast hash functions tested here are recommendable as fast for file
> digests and maybe bigger databases, but not for 32bit hash tables."
> On Sat, Apr 29, 2017 at 12:57 AM, Sean Silva via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>> IIRC when I talked with Chandler (who has a lot of background in
>> hashing), the Bernstein hash function actually ends up being pretty good
>> (as a hash function, not necessarily performance) for C/C++ identifier-like
>> strings (despite not being that good for other types of strings), so we'll
>> want to verify that we don't regress hash quality (which affects efficiency
>> of the hash tables). In particular, IIRC this is the function used for
>> Clang's identifier maps (via StringMap) and so I'd like to see some
>> measurements that ensure that these performance improvements translate over
>> to Clang (or at least don't regress).
>> If Clang doesn't regress and xxHash is measurably better for other
>> HashString workloads, then I don't see a reason not to switch to it.
>> -- Sean Silva
>> On Mon, Apr 24, 2017 at 5:37 PM, Scott Smith via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>> I've been working on improving the startup performance of lldb, and ran
>>> into an issue with llvm::HashString.  It works a character at a time, which
>>> creates a long dependency chain in the processor.  On the other hand, the
>>> function is very short, which probably works well for short identifiers.
>>> I don't know how the mix of identifier length seen by lldb compares with
>>> that seen by llvm/clang; I imagine they're pretty similar.
>>> I have to different proposals, and wanted to gauge which would be
>>> preferred:
>>> 1. Use xxhash instead.
>>> 2. Use the Intel native crc32q instruction to process 8 bytes at a time,
>>> then fall back to byte at a time.  Non sse 4.2 capable processors (either
>>> early or non Intel/AMD x86) would use the existing algorithm, or possibly
>>> #1 above.
>>> For my test, both result in approximately the same # of cycles (within
>>> 0.5%).
>>> #1 uses 3+% more instructions.
>>> #2 requires (realistically) runtime detection of cpu capabilities,
>>> because distributions would want generic x86/x86_64 compatible binaries,
>>> not separate binaries per cpu feature.
>>> I'm leaning toward #1 despite the instruction increase.  My only worry
>>> is the effect on compile times for code with lots of short identifiers.  I
>>> haven't tested that though (and I don't have a suitable benchmark suite for
>>> that) so for all I know I'm worrying about nothing.
>>> FYI the improvement is approximately 11.5% reduction in cycles for my
>>> lldb test (b main, run, quit), so IMO it's pretty significant.
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170428/2166c67f/attachment.html>

More information about the llvm-dev mailing list