[LLVMdev] We need better hashing

Chandler Carruth chandlerc at google.com
Fri Feb 17 01:59:20 PST 2012


On Tue, Feb 14, 2012 at 2:44 AM, Chris Lattner <clattner at apple.com> wrote:

> I'm contradicting my stance above about not caring about the
> implementation :), but is MurmurHash a good hash for string data?
>  The Bernstein hash function works really well and is much cheaper to
> compute than Murmur.  It is used by HashString (and thus by StringMap).


If you want a good string hashing function, CityHash is by a fair margin
the best one out there. Look at the comparison done by Craig, Howard, and
several others when discussing what hashing function to use for libc++.

The only downside to CityHash is code size. It is heavily tuned, and that
results in several special case routines to get maximal efficiency and hash
quality for short strings (yep, not just huge ones). That said, the code
size increase was measured carefully for libc++ and it's really quite small.

That said, I have no benchmarks showing this matters for our uses of
StringMap. It reduced collisions, it didn't show up as a hot function, but
the collisions and the hashing simply didn't dominate any profiles I looked
at....
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120217/5e13e313/attachment.html>


More information about the llvm-dev mailing list