[llvm-commits] [llvm] r154497 - in /llvm/trunk: include/llvm/Metadata.h lib/VMCore/LLVMContextImpl.h lib/VMCore/Metadata.cpp

Wed Apr 11 08:44:52 PDT 2012

On Wed, Apr 11, 2012 at 4:32 PM, Benjamin Kramer
<benny.kra at googlemail.com> wrote:

>
> On 11.04.2012, at 17:20, Jakob Stoklund Olesen wrote:
>
> >
> > On Apr 11, 2012, at 7:06 AM, Benjamin Kramer wrote:
> >
> >> Author: d0k
> >> Date: Wed Apr 11 09:06:54 2012
> >> New Revision: 154497
> >>
> >> URL: http://llvm.org/viewvc/llvm-project?rev=154497&view=rev
> >> Log:
> >> Cache the hash value of the operands in the MDNode.
> >>
> >> FoldingSet is implemented as a chained hash table. When there is a hash
> >> collision during insertion, which is common as we fill the table until a
> >> load factor of 2.0 is hit, we walk the chained elements, comparing every
> >> operand with the new element's operands. This can be very expensive if the
> >> MDNode has many operands.
> >>
> >> We sacrifice a word of space in MDNode to cache the full hash value, reducing
> >> compares on collision to a minimum. MDNode grows from 28 to 32 bytes + operands
> >> on x86. On x86_64 the new bits fit nicely into existing padding, not growing
> >> the struct at all.
> >>
> >> The actual speedup depends a lot on the test case and is typically between
> >> 1% and 2% for C++ code with clang -c -O0 -g.
> >
> > Neat!
> >
> > I would suggest one tweak:
> >
> > All the nodes in a hash table chain are going to have identical low bits
> > in the hash value. If you compute a 64-bit hash value and store the high 32
> > bits in the node while using the low bits to index the hash table, you can
> > lower the probability of collisions even further.
>
> I thought about that, our hashing infrastructure computes a size_t hash so
> we could easily take the upper half of it on x86_64.
>
> OTOH we can't reuse the cached hash value when the FoldingSet grows if we
> store the upper bits, which would probably eat away any speedup from the
> saved collisions :/
>

Collisions across the entire 32-bit key are really quite rare. I wouldn't
stress about this. It is nice to avoid re-computing the hash as the set
grows.

I think it would be good to look into doing similar tricks for other large
FoldingSets whose keys are large and slow to compare.
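
For anyone who wants to try this elsewhere, here is a rough sketch of the
general shape of the trick. It is a toy chained table, not FoldingSet or
MDNode: Node, UniquingSet, hashOperands and NumDeepCompares are all made-up
names, and the FNV-1a hash is just a stand-in. The cached hash is checked
before any operand compare, and growing re-buckets from the cached value
instead of rehashing the operands.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for a node with many operands (not llvm::MDNode).
struct Node {
  std::vector<int> Operands;
  std::uint32_t Hash; // cached full hash of Operands
  Node *Next;         // chain link within a bucket
};

// Counts full operand comparisons so the effect of the cache is observable.
static std::size_t NumDeepCompares = 0;

// FNV-1a over the operand values; any reasonable hash would do here.
inline std::uint32_t hashOperands(const std::vector<int> &Ops) {
  std::uint32_t H = 2166136261u;
  for (int V : Ops) {
    H ^= static_cast<std::uint32_t>(V);
    H *= 16777619u;
  }
  return H;
}

class UniquingSet {
  std::vector<Node *> Buckets;                // power-of-two bucket count
  std::vector<std::unique_ptr<Node>> Storage; // owns the nodes

  void grow() {
    std::vector<Node *> NewBuckets(Buckets.size() * 2, nullptr);
    assert((NewBuckets.size() & (NewBuckets.size() - 1)) == 0);
    for (auto &N : Storage) {
      // Re-bucket from the cached hash; the operands are not rehashed.
      std::size_t Idx = N->Hash & (NewBuckets.size() - 1);
      N->Next = NewBuckets[Idx];
      NewBuckets[Idx] = N.get();
    }
    Buckets.swap(NewBuckets);
  }

public:
  UniquingSet() : Buckets(4, nullptr) {}

  std::size_t size() const { return Storage.size(); }

  Node *getOrInsert(const std::vector<int> &Ops) {
    const std::uint32_t H = hashOperands(Ops);
    for (Node *N = Buckets[H & (Buckets.size() - 1)]; N; N = N->Next) {
      if (N->Hash != H)
        continue; // cheap reject: one word compared, operands untouched
      ++NumDeepCompares;
      if (N->Operands == Ops)
        return N;
    }
    // Grow once the load factor would exceed 2.0.
    if (Storage.size() + 1 > 2 * Buckets.size())
      grow();
    Storage.push_back(std::unique_ptr<Node>(new Node{Ops, H, nullptr}));
    Node *N = Storage.back().get();
    std::size_t Idx = H & (Buckets.size() - 1);
    N->Next = Buckets[Idx];
    Buckets[Idx] = N;
    return N;
  }
};
```

Because the low bits of the same 32-bit value pick the bucket, nodes in one
chain already agree on those bits, so only the remaining high bits
discriminate. That is the cost of being able to reuse the cached value in
grow().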