[LLVMdev] FoldingSet #collisions comparison

Wed Feb 10 16:49:57 PST 2010

On Mon, Feb 08, 2010 at 10:31:23AM -0800, Chris Lattner wrote:
> On Feb 7, 2010, at 1:03 PM, Gregory Petrosyan wrote:
> 
> >On Sat, Feb 06, 2010 at 04:51:15PM -0800, Chandler Carruth wrote:
> >>While I've not reviewed the patch in too much detail, it looks
> >>promising. Can you run some end-to-end benchmarks to make sure that
> >>cache pressure in the full program or other variables not accounted
> >>for in a micro-benchmark don't dominate performance? Specifically the
> >>nightly tester includes a number of real programs and machinery to
> >>measure total compile time.
> >
> >Ok, now with some kinda-hard numbers!
> 
> Hi Gregory,
> 
> These numbers are so noisy, that they aren't particularly useful.
> Could you try instrumenting foldingset to keep track track of the #
> collisions and # hash table resizes and compare those?  They should
> be much more stable and still correlate directly to performance.

OK, now with real numbers :-)

First, the main thing: SuperFastHash appears to be the hash with best
distribution. Use of MurmurHash instead generates 1.28% more collisions while
doing nightly test in MultiSource/Applications.

Second: I've also tested lookup3 hash, and its use generated 0.1% more
collisions, compared to SFH.

These results were a bit surprising for me!

Number of hash table resizes is independent of used hashing algorithm, because
hash table grows when 'nentries > nbuckets * 2'.

	Gregory