[PATCH] D33488: [ELF] - Optimization for populating stringpool when building .gdb_index.

David Blaikie via llvm-commits llvm-commits at lists.llvm.org
Thu May 25 10:58:21 PDT 2017


That should still be a performance problem, though, since it leads to more
hash collisions than necessary. Why is the hash case-insensitive anyway?
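
To make the collision concern concrete, here is a rough standalone sketch
(plain C++, no LLVM headers). hashFolded copies the hash quoted below;
hashExact, distinctHashes and checkCaseFolding are hypothetical helpers made
up for this comparison and are not part of the patch.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Same arithmetic as the quoted hash: every character is lowercased first.
static uint32_t hashFolded(const std::string &Str) {
  uint32_t R = 0;
  for (unsigned char C : Str)
    R = R * 67 + std::tolower(C) - 113;
  return R;
}

// Hypothetical case-sensitive variant, used only for comparison.
static uint32_t hashExact(const std::string &Str) {
  uint32_t R = 0;
  for (unsigned char C : Str)
    R = R * 67 + C - 113;
  return R;
}

// Returns how many distinct hash values the given names produce under H.
template <typename HashFn>
size_t distinctHashes(const std::vector<std::string> &Names, HashFn H) {
  std::set<uint32_t> Seen;
  for (const std::string &N : Names)
    Seen.insert(H(N));
  return Seen.size();
}

// Names that differ only by case always share a folded hash.
void checkCaseFolding() {
  assert(hashFolded("Foo") == hashFolded("foo"));
  assert(hashFolded("FOO") == hashFolded("foo"));
}
```

For the names "foo", "Foo", "FOO" and "bar", the folded hash yields 2 distinct
values while the case-sensitive variant yields 4, so symbols that differ only
by case are guaranteed to land on the same hash value.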

On Thu, May 25, 2017 at 1:15 AM George Rimar <grimar at accesssoftek.com>
wrote:

> >The hash function being used is
> >
> >static uint32_t hash(StringRef Str) {
> >  uint32_t R = 0;
> >  for (uint8_t C : Str)
> >    R = R * 67 + tolower(C) - 113;
> >  return R;
> >}
> >
> >What happens if we have the strings "Foo" and "foo"? Do we have a guarantee
> >that that never happens?
> >
> >Cheers,
> >Rafael
>
> It is not a problem here. llvm::StringTableBuilder implements both add()
> overloads via CachedHashStringRef:
>   size_t add(CachedHashStringRef S);
>   size_t add(StringRef S) { return add(CachedHashStringRef(S)); }
>
> So hash collisions can actually happen in both implementations, I
> believe.
> But that is not harmful, because the internal implementation uses
> DenseMap<CachedHashStringRef, size_t> StringIndexMap to store strings.
>
> And two CachedHashStringRefs are equal only when both their hashes and
> their values are equal:
>   static bool isEqual(const CachedHashStringRef &LHS,
>                       const CachedHashStringRef &RHS) {
>     return LHS.hash() == RHS.hash() &&
>            DenseMapInfo<StringRef>::isEqual(LHS.val(), RHS.val());
>   }
>
> So I mean that the following code would produce 2 different entries anyway:
>   llvm::StringTableBuilder Bar(StringTableBuilder::ELF, 1);
>   Bar.add({StringRef("foo"), 2});
>   Bar.add({StringRef("Foo"), 2});
>
> George.
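
The equality argument above can be sketched without LLVM roughly as follows.
CachedKey, CachedKeyHash, CachedKeyEq and TinyStringTable are hypothetical
stand-ins for CachedHashStringRef, its DenseMapInfo and StringTableBuilder,
and std::unordered_map stands in for DenseMap. The offsets it returns are
illustrative only and do not claim to match what StringTableBuilder emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for CachedHashStringRef: a value plus a hash that
// the caller computed ahead of time.
struct CachedKey {
  std::string Val;
  uint32_t Hash;
};

// The map reuses the cached hash instead of rehashing the string.
struct CachedKeyHash {
  size_t operator()(const CachedKey &K) const { return K.Hash; }
};

// Keys are equal only when both the hash and the value match.
struct CachedKeyEq {
  bool operator()(const CachedKey &L, const CachedKey &R) const {
    return L.Hash == R.Hash && L.Val == R.Val;
  }
};

// Hypothetical, much simplified stand-in for StringTableBuilder.
class TinyStringTable {
  std::unordered_map<CachedKey, size_t, CachedKeyHash, CachedKeyEq> Index;
  size_t Size = 0;

public:
  // Returns the offset assigned to K, adding a new entry if K is unseen.
  size_t add(const CachedKey &K) {
    auto P = Index.emplace(K, Size);
    if (P.second)
      Size += K.Val.size() + 1; // room for the string and its terminator
    return P.first->second;
  }
  size_t entries() const { return Index.size(); }
};

// Mirrors the two add() calls above: same hash (2), different values.
size_t entriesForFooAndfoo() {
  TinyStringTable T;
  size_t Lower = T.add({"foo", 2});
  size_t Upper = T.add({"Foo", 2});
  assert(Lower != Upper);
  return T.entries();
}
```

In this sketch "foo" and "Foo" collide on the hash but still get separate
entries, because the comparison falls through to the string values. The
collision costs an extra comparison, not correctness.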