[cfe-dev] RFC: change string hash function in libc++

Howard Hinnant hhinnant at apple.com
Sun Dec 4 18:43:21 PST 2011


On Dec 4, 2011, at 9:17 PM, Chandler Carruth wrote:

> On Sun, Dec 4, 2011 at 4:12 PM, Howard Hinnant <hhinnant at apple.com> wrote:
> 
> I've been convinced to use murmur2 when I need to combine 2, 3, or 4 size_t's into a single one, except for the case of long double on x86.  My rationale is that long double patterns are not likely to accidentally occur often in a long double.  So I'm sticking with xor for long double for now.  If I see cases that could easily occur accidentally, this can always be changed in the future.
> 
> I settled on murmur2 instead of murmur3 because of the easy availability of the 32 bit and 64 bit versions.
> 
> I'm confused. What do you mean by 32-bit and 64-bit versions? If you could clarify what versions you would like, the author of the murmur hashes and some other folks here at google are offering to help build variants that target exactly what libc++ needs.
>  
>  I did not see a 64 bit murmur3, and figured murmur2 is good enough.  murmur2 won't actually ever be used when size_t is 64 bits.  It will only be activated for (unsigned) long long and double on 32 bit platforms.
> 
> Oh, except that I activated murmur2 for basic_string as well.
> 
> Why not murmur3 or city hash? murmur3 is both higher quality and significantly faster. city hash is still faster. For reference, the original thread was *primarily* focused on basic_string hashing.
> 
> The tests I ran weren't that convincing.  However I also admit that there is a lot I don't know here, and people are asking for murmur.  And what I had was operating a character at a time and murmur operates a word at a time.   So done.
> 
> At Google, we have found CityHash64 to provide very significant benefits when used to hash strings for unordered containers. If you look at the benchmark we posted (SMHasher) it was written specifically to showcase the problems hit by real-world hashing of string contents.

The libc++ code is publicly available at http://llvm.org/svn/llvm-project/libcxx/trunk/ .  Patches to this code are reviewed regularly by myself and others, and welcomed.  Please include a patch for http://llvm.org/svn/llvm-project/libcxx/trunk/CREDITS.TXT if you contribute.

Howard




More information about the cfe-dev mailing list