[llvm-dev] RFC: Improving performance of HashString
Sean Silva via llvm-dev
llvm-dev at lists.llvm.org
Fri Apr 28 14:57:44 PDT 2017
IIRC when I talked with Chandler (who has a lot of background in hashing),
the Bernstein hash function actually ends up being pretty good (as a hash
function, not necessarily performance) for C/C++ identifier-like strings
(despite not being that good for other types of strings), so we'll want to
verify that we don't regress hash quality (which affects efficiency of the
hash tables). In particular, IIRC this is the function used for Clang's
identifier maps (via StringMap) and so I'd like to see some measurements
that ensure that these performance improvements translate over to Clang (or
at least don't regress).
If Clang doesn't regress and xxHash is measurably better for other
HashString workloads, then I don't see a reason not to switch to it.
-- Sean Silva
On Mon, Apr 24, 2017 at 5:37 PM, Scott Smith via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I've been working on improving the startup performance of lldb, and ran
> into an issue with llvm::HashString. It works a character at a time, which
> creates a long dependency chain in the processor. On the other hand, the
> function is very short, which probably works well for short identifiers.
>
> I don't know how the mix of identifier length seen by lldb compares with
> that seen by llvm/clang; I imagine they're pretty similar.
>
> I have to different proposals, and wanted to gauge which would be
> preferred:
>
> 1. Use xxhash instead.
>
> 2. Use the Intel native crc32q instruction to process 8 bytes at a time,
> then fall back to byte at a time. Non sse 4.2 capable processors (either
> early or non Intel/AMD x86) would use the existing algorithm, or possibly
> #1 above.
>
> For my test, both result in approximately the same # of cycles (within
> 0.5%).
>
> #1 uses 3+% more instructions.
> #2 requires (realistically) runtime detection of cpu capabilities, because
> distributions would want generic x86/x86_64 compatible binaries, not
> separate binaries per cpu feature.
>
> I'm leaning toward #1 despite the instruction increase. My only worry is
> the effect on compile times for code with lots of short identifiers. I
> haven't tested that though (and I don't have a suitable benchmark suite for
> that) so for all I know I'm worrying about nothing.
>
> FYI the improvement is approximately 11.5% reduction in cycles for my lldb
> test (b main, run, quit), so IMO it's pretty significant.
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170428/239818ec/attachment-0001.html>
More information about the llvm-dev
mailing list