[llvm-dev] RFC: Improving performance of HashString

Davide Italiano via llvm-dev llvm-dev at lists.llvm.org
Tue Apr 25 14:47:37 PDT 2017


On Tue, Apr 25, 2017 at 1:19 PM, Scott Smith
<scott.smith at purestorage.com> wrote:
> On Tue, Apr 25, 2017 at 1:11 PM, Davide Italiano <davide at freebsd.org> wrote:
>>
>> On Tue, Apr 25, 2017 at 12:55 PM, Vedant Kumar via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>> >
>> >> On Apr 24, 2017, at 5:37 PM, Scott Smith via llvm-dev
>> >> <llvm-dev at lists.llvm.org> wrote:
>> >>
>> >> I've been working on improving the startup performance of lldb, and ran
>> >> into an issue with llvm::HashString.  It works a character at a time, which
>> >> creates a long dependency chain in the processor.  On the other hand, the
>> >> function is very short, which probably works well for short identifiers.
>> >>
>> >> I don't know how the mix of identifier length seen by lldb compares
>> >> with that seen by llvm/clang; I imagine they're pretty similar.
>> >>
>> >> I have to different proposals, and wanted to gauge which would be
>> >> preferred:
>> >
>> >
>> >> 1. Use xxhash instead.
>> >
>>
>> I'd lean towards this option as we already have a copy of xxHash in
>> lib/Support (llvm/Support/xxhash.h).
>> We use it for computing the --build-id in lld as fast/non-crytographic
>> hash variant and has proven to outperform other candidates (e.g.
>> CityHash).
>
>
> That's actually why I suggested it, though AFAIK xxHash has more collisions
> than other hash algorithms.
>
> xxHash outperforms CityHash?  https://github.com/rurban/smhasher/  So many
> choices!
>

IIRC, yes. Of course, your mileage may vary but given we have a solid
alternative in-tree already I wouldn't go for another non-crypto hash
function unless there's a good reason for it.

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare


More information about the llvm-dev mailing list