[LLVMdev] On LLD performance

Chandler Carruth chandlerc at google.com
Wed Mar 18 14:17:30 PDT 2015


On Wed, Mar 18, 2015 at 1:48 PM, Benjamin Kramer <benny.kra at gmail.com>
wrote:

> > FYI we have StringMap which is specialized for strings. Also I'm sort of
> > amazed that StringMap is using HashString (bernstein) instead of the
> fairly
> > sophisticated hash functionality we have in ADT/Hashing.h
>
> I tried using the new hashing back when Hashing.h was written. The
> result was that there were fewer collisions, but the hash function was
> significantly slower. Result was a net regression in runtime.


FWIW, I did a somewhat more deep analysis of this while I was writing the
new hashing stuff.

Fundamentally, StringMap is not a generic string map. It is an source code
identifier map.

Source code identifiers tend to be *very* short and tend to fit exactly
into the non-colliding space of bernstein. As a consequence, the collisions
"aren't too bad" and the runtime is actually faster. For longer strings,
Hashing.h will be *tremendously* faster than bernstein, but we never use
StringMap for that purpose in a critical path.

We should really use a totally different approach for the identifier maps
in Clang that is completely tailored to that (weird) use case, and then
StringMap could be usefully switched to the more robust hashing.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150318/8a80c105/attachment.html>


More information about the llvm-dev mailing list