<div dir="ltr"><div>Oh, and another option is to pass the hash value in as a parameter to insert/find/etc.  This would work particularly well for lldb, since its ConstString already computes the hash to determine which lock/sub hash table to use; if it computed a single 64-bit hash, used the upper bits to index the lock, and then passed the lower 32 bits to the hashtable itself, then it'd be another performance boost.<br><br></div>Hmmm, while I'm at it, I can parameterize the type of the hash value itself; 32 bits isn't great when you have 2M+ symbols.  Ok I'll go prototype this and see what kind of benefit I can get.<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Apr 28, 2017 at 9:21 PM, Scott Smith <span dir="ltr"><<a href="mailto:scott.smith@purestorage.com" target="_blank">scott.smith@purestorage.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div>I wonder if a less invasive change would be to simply make StringMap take a hash function as a template parameter, so that different hash functions could be plugged in for different uses.<br><br></div>Clang is more likely to have shorter symbols (all my loop variables are called "i" ;-), while lldb has longer symbols (my_outer_namespace::my_inner_<wbr>namespace::my_class<template_<wbr>param_1, template param_2, template_param_3<has_yet_<wbr>another_template_param, or_two>>::......), so each has a different set of tradeoffs.<br><br></div>Thoughts?<br><br></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Apr 28, 2017 at 3:51 PM, Bruce Hoult <span dir="ltr"><<a href="mailto:bruce@hoult.org" target="_blank">bruce@hoult.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>According to...</div><div><br></div><a 
href="https://github.com/rurban/smhasher/blob/master/README.md" target="_blank">https://github.com/rurban/smha<wbr>sher/blob/master/README.md</a><br><div><br></div><div>Bernstein has quality problems (while xx is as good as you get in a non-crypto hash), and xx is 7x (32 bit) - 12x (64 bit) faster.</div><div><br></div><div>That's on long strings. It would be worth checking the startup overhead for typically short identifiers in programs.</div><div><br></div><div>See later on in the README:</div><div><br></div><div><div>"When used in a hash table the instruction cache will usually beat the CPU and throughput measured here. In my tests the smallest FNV1A beats the fastest crc32_hw1 with Perl 5 hash tables. Even if those worse hash functions will lead to more collisions, the overall speed advantage beats the slightly worse quality. See e.g. A Seven-Dimensional Analysis of Hashing Methods and its Implications on Query Processing for a concise overview of the best hash table strategies, confirming that the simpliest Mult hashing (bernstein, FNV*, x17, sdbm) always beat "better" hash functions (Tabulation, Murmur, Farm, ...) 
when used in a hash table.</div><div><br></div><div>The fast hash functions tested here are recommendable as fast for file digests and maybe bigger databases, but not for 32bit hash tables."</div></div><div><br></div><div><br></div></div><div class="m_-2606508716395148997HOEnZb"><div class="m_-2606508716395148997h5"><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Apr 29, 2017 at 12:57 AM, Sean Silva via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">IIRC when I talked with Chandler (who has a lot of background in hashing), the Bernstein hash function actually ends up being pretty good (as a hash function, not necessarily performance) for C/C++ identifier-like strings (despite not being that good for other types of strings), so we'll want to verify that we don't regress hash quality (which affects efficiency of the hash tables). 
In particular, IIRC this is the function used for Clang's identifier maps (via StringMap) and so I'd like to see some measurements that ensure that these performance improvements translate over to Clang (or at least don't regress).<div><br></div><div>If Clang doesn't regress and xxHash is measurably better for other HashString workloads, then I don't see a reason not to switch to it.<div><br></div><div>-- Sean Silva</div></div></div><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="m_-2606508716395148997m_1266490902911713394h5">On Mon, Apr 24, 2017 at 5:37 PM, Scott Smith via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="m_-2606508716395148997m_1266490902911713394h5"><div dir="ltr"><div><div><div><div><div><div><div>I've been working on improving the startup performance of lldb, and ran into an issue with llvm::HashString.  It works a character at a time, which creates a long dependency chain in the processor.  On the other hand, the function is very short, which probably works well for short identifiers.<br><br></div>I don't know how the mix of identifier lengths seen by lldb compares with that seen by llvm/clang; I imagine they're pretty similar.<br><br></div>I have two different proposals, and wanted to gauge which would be preferred:<br><br></div><div>1. Use xxhash instead.<br></div><div><br></div>2. Use the Intel native crc32q instruction to process 8 bytes at a time, then fall back to byte at a time.  
Processors without SSE 4.2 (either early or non-Intel/AMD x86) would use the existing algorithm, or possibly #1 above.<br><br></div>For my test, both result in approximately the same number of cycles (within 0.5%).<br><br>#1 uses 3+% more instructions.<br></div>#2 requires (realistically) runtime detection of CPU capabilities, because distributions would want generic x86/x86_64 compatible binaries, not separate binaries per CPU feature.<br><br></div>I'm leaning toward #1 despite the instruction increase.  My only worry is the effect on compile times for code with lots of short identifiers.  I haven't tested that though (and I don't have a suitable benchmark suite for that) so for all I know I'm worrying about nothing.<br><br></div>FYI the improvement is approximately 11.5% reduction in cycles for my lldb test (b main, run, quit), so IMO it's pretty significant.<br><br></div>
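<div>To make the "long dependency chain" point concrete, here is a minimal sketch of a Bernstein-style (multiply-by-33) string hash. It is an illustration only, not a copy of llvm::HashString; the name bernstein_hash and the seed parameter are assumptions made for this example.</div>

```cpp
#include <cstdint>
#include <string>

// Hypothetical illustration of a Bernstein-style hash; not LLVM's source.
// Every iteration reads the h produced by the previous one, so the
// multiply-add steps form one serial chain as long as the string.
inline std::uint32_t bernstein_hash(const std::string &s,
                                    std::uint32_t seed = 0) {
    std::uint32_t h = seed;
    for (unsigned char c : s) {
        h = h * 33u + c;  // depends on the previous value of h
    }
    return h;
}
```

<div>For a one-character identifier the loop body runs once, which is why such a function can still do well on short names; the serial chain only starts to cost cycles on long, heavily templated symbol names.</div>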

<br></div></div><span>______________________________<wbr>_________________<br>

LLVM Developers mailing list<br>

<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

<br></span></blockquote></div><br></div>


<br></blockquote></div><br></div>

</div></div></blockquote></div><br></div>

</div></div></blockquote></div><br></div>