[PATCH] D33488: [ELF] - Optimization for populating stringpool when building .gdb_index.
George Rimar via llvm-commits
llvm-commits at lists.llvm.org
Thu May 25 01:15:39 PDT 2017
>The hash function being used is
>
>static uint32_t hash(StringRef Str) {
> uint32_t R = 0;
> for (uint8_t C : Str)
> R = R * 67 + tolower(C) - 113;
> return R;
>}
>
>What happens we have the string "Foo" and "foo"? Do we have a guarantee
>that that never happens?
>
>Cheers,
>Rafael
It is not a problem here. See llvm::StringTableBuilder implements both add() via CachedHashStringRef:
size_t add(CachedHashStringRef S);
size_t add(StringRef S) { return add(CachedHashStringRef(S)); }
So collision of hashes can happen in both implementations actually I believe.
But that is not harmfull, because internal implementation uses
DenseMap<CachedHashStringRef, size_t> StringIndexMap to store strings.
And 2 CachedHashStringRef are equal only when both their hash and values are equal:
static bool isEqual(const CachedHashStringRef &LHS,
const CachedHashStringRef &RHS) {
return LHS.hash() == RHS.hash() &&
DenseMapInfo<StringRef>::isEqual(LHS.val(), RHS.val());
}
So I mean that following code would produce 2 different entries anyways:
llvm::StringTableBuilder Bar(StringTableBuilder::ELF, 1);
Bar.add({StringRef("foo"), 2});
Bar.add({StringRef("Foo"), 2});
George.
More information about the llvm-commits
mailing list