[PATCH] D33488: [ELF] - Optimization for populating stringpool when building .gdb_index.
David Blaikie via llvm-commits
llvm-commits at lists.llvm.org
Thu May 25 10:58:21 PDT 2017
It should still be a performance problem, though, since it leads to more hash
collisions than necessary. Why is the hash case-insensitive anyway?
On Thu, May 25, 2017 at 1:15 AM George Rimar <grimar at accesssoftek.com>
wrote:
> >The hash function being used is
> >
> >static uint32_t hash(StringRef Str) {
> >  uint32_t R = 0;
> >  for (uint8_t C : Str)
> >    R = R * 67 + tolower(C) - 113;
> >  return R;
> >}
> >
> >What happens if we have the strings "Foo" and "foo"? Do we have a guarantee
> >that that never happens?
> >
> >Cheers,
> >Rafael
>
> It is not a problem here. llvm::StringTableBuilder implements both add()
> overloads via CachedHashStringRef:
> size_t add(CachedHashStringRef S);
> size_t add(StringRef S) { return add(CachedHashStringRef(S)); }
>
> So hash collisions can happen in both implementations, I believe.
> But that is not harmful, because the internal implementation uses
> DenseMap<CachedHashStringRef, size_t> StringIndexMap to store strings.
>
> And two CachedHashStringRefs are equal only when both their hashes and values
> are equal:
> static bool isEqual(const CachedHashStringRef &LHS,
>                     const CachedHashStringRef &RHS) {
>   return LHS.hash() == RHS.hash() &&
>          DenseMapInfo<StringRef>::isEqual(LHS.val(), RHS.val());
> }
>
> So I mean that the following code would produce 2 different entries anyway:
> llvm::StringTableBuilder Bar(StringTableBuilder::ELF, 1);
> Bar.add({StringRef("foo"), 2});
> Bar.add({StringRef("Foo"), 2});
>
> George.
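
For anyone who wants to check the argument above without building LLVM, here
is a minimal standalone sketch. It assumes std::unordered_map as a stand-in
for DenseMap and std::string as a stand-in for StringRef. The names gdbHash,
GdbHasher and countDistinct are hypothetical and are not part of LLVM or lld.
It only illustrates the general point: keys with equal hashes still end up as
separate entries when their values differ.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Same arithmetic as the hash quoted in the thread, applied to std::string
// (assumption: std::string stands in for llvm::StringRef).
static uint32_t gdbHash(const std::string &Str) {
  uint32_t R = 0;
  for (unsigned char C : Str)
    R = R * 67 + static_cast<uint32_t>(std::tolower(C)) - 113;
  return R;
}

// Hash functor so the map uses the case-insensitive hash, while key equality
// stays the default, case-sensitive std::string comparison.
struct GdbHasher {
  std::size_t operator()(const std::string &S) const { return gdbHash(S); }
};

// Hypothetical helper: insert both strings and report how many distinct
// entries the map holds afterwards.
static std::size_t countDistinct(const std::string &A, const std::string &B) {
  std::unordered_map<std::string, std::size_t, GdbHasher> Map;
  Map.emplace(A, Map.size());
  Map.emplace(B, Map.size());
  return Map.size();
}
```

Calling countDistinct("Foo", "foo") inserts two keys that share a hash but
compare unequal, so they occupy separate entries. The cost of the shared hash
is only the extra key comparison on lookup, which is the performance concern
raised at the top of this message.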