[cfe-dev] RFC: change string hash function in libc++

Austin Appleby aappleby at google.com
Fri Dec 2 18:56:42 PST 2011


Hi, Craig Silverstein asked me to chime in here regarding MurmurHash
and CityHash.

Regarding code size impact, here's what I dug up using "nm -aCS
--size-sort --radix=d smhasher | grep -i -e CityHash -e MurmurHash3"
(columns are address, size in bytes, symbol type, and demangled name;
all numbers are decimal):

0000000004388176 0000000000000023 T CityHash64WithSeed(char const*, unsigned long, unsigned long)
0000000004389488 0000000000000024 T CityHash64_test(void const*, int, unsigned int, void*)
0000000004389520 0000000000000038 T CityHash128_test(void const*, int, unsigned int, void*)
0000000004388096 0000000000000071 T CityHash64WithSeeds(char const*, unsigned long, unsigned long, unsigned long)
0000000004390528 0000000000000229 T MurmurHash3_x86_32(void const*, int, unsigned int, void*)
0000000004390768 0000000000000414 T MurmurHash3_x86_64(void const*, int, unsigned int, void*)
0000000004391968 0000000000000620 T MurmurHash3_x64_128(void const*, int, unsigned int, void*)
0000000004391184 0000000000000770 T MurmurHash3_x86_128(void const*, int, unsigned int, void*)
0000000004387136 0000000000000958 T CityHash64(char const*, unsigned long)
0000000004388208 0000000000001270 T CityHash128WithSeed(char const*, unsigned long, std::pair<unsigned long, unsigned long>)

The *_test functions are just adapters for SMHasher's API and can be ignored.

One more thing worth mentioning: CityHash64, CityHash128, and
MurmurHash3_x64_128 assume that 64-bit arithmetic is fast, which may
not be the case on all platforms. MurmurHash3_x86_128 uses only 32-bit
arithmetic and may be a better fit if good performance is needed on
both 32- and 64-bit platforms. If a 128-bit hash result is overkill, I
can resurrect MurmurHash3_x86_64, which I didn't publish with the rest
of the MurmurHash3 suite because it didn't seem to fit any reasonable
use case at the time.
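
To make the 32- vs 64-bit point concrete, here is a rough sketch (not
a proposal for the actual libc++ code). The two finalizers use the
constants from the published MurmurHash3 source; word_mixer and
mix_word are hypothetical names I made up for illustration. The idea
is simply to pick, at compile time, the mixer whose arithmetic width
matches size_t, so a 32-bit target never has to emulate 64-bit
multiplies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// MurmurHash3 32-bit finalizer: uses only 32-bit shifts and multiplies.
inline std::uint32_t fmix32(std::uint32_t h) {
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

// MurmurHash3 64-bit finalizer: needs 64-bit multiplies, which are
// cheap on 64-bit targets but have to be emulated on many 32-bit ones.
inline std::uint64_t fmix64(std::uint64_t k) {
    k ^= k >> 33;
    k *= 0xff51afd7ed558ccdull;
    k ^= k >> 33;
    k *= 0xc4ceb9fe1a85ec53ull;
    k ^= k >> 33;
    return k;
}

// Hypothetical helper (an assumption for this sketch, not part of
// MurmurHash3 or libc++): select a mixer by the width of size_t.
template <std::size_t Width = sizeof(std::size_t)>
struct word_mixer;

template <>
struct word_mixer<4> {
    static std::size_t mix(std::size_t v) {
        return static_cast<std::size_t>(fmix32(static_cast<std::uint32_t>(v)));
    }
};

template <>
struct word_mixer<8> {
    static std::size_t mix(std::size_t v) {
        return static_cast<std::size_t>(fmix64(static_cast<std::uint64_t>(v)));
    }
};

// Mixes one machine word using the variant that matches the platform.
inline std::size_t mix_word(std::size_t v) {
    return word_mixer<>::mix(v);
}

// Sanity check: both finalizers map 0 to 0.
inline void check_zero_maps_to_zero() {
    assert(fmix32(0u) == 0u);
    assert(fmix64(0ull) == 0ull);
}
```

This only covers the final mixing step, not the full block loop, but
the same width-based selection would apply to choosing between the
x86 and x64 variants of the complete hash.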

-Austin


