[cfe-dev] RFC: change string hash function in libc++

Jeffrey Yasskin jyasskin at googlers.com
Fri Dec 2 16:00:00 PST 2011


On Fri, Dec 2, 2011 at 3:52 PM, Craig Silverstein <csilvers at google.com> wrote:
> } What are the numbers behind these "performs better"?
>
> Austin Appleby (the murmurhash author) agreed to run some numbers.
> Here's his report.  The executive summary is that city is *much*
> faster for strings >512 bytes, and murmur is a bit faster for strings
> <512 bytes.
>
> --- Testing City128

Sorry to nitpick, but http://code.google.com/p/cityhash/ says "we use
variants of CityHash64() mainly in hash tables such as
hash_map<string, int>. ... CityHash128() and similar return a 128-bit
hash and are tuned for strings of at least a few hundred bytes.
Depending on your compiler and hardware, it may be faster than
CityHash64() on sufficiently long strings. It is known to be slower
than necessary on shorter strings, but we expect that case to be
relatively unimportant. Inside Google we use variants of CityHash128()
mainly for code that wants to minimize collisions."

So it's probably better to test City64 than City128 for a hash-table proposal.




More information about the cfe-dev mailing list