<div dir="ltr">It should still be a performance problem, though, since it leads to more hash collisions than necessary. Why is the hash case-insensitive anyway?</div><br><div class="gmail_quote"><div dir="ltr">On Thu, May 25, 2017 at 1:15 AM George Rimar <<a href="mailto:grimar@accesssoftek.com">grimar@accesssoftek.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">>The hash function being used is<br>

><br>

>static uint32_t hash(StringRef Str) {<br>

>  uint32_t R = 0;<br>

>  for (uint8_t C : Str)<br>

>    R = R * 67 + tolower(C) - 113;<br>

>  return R;<br>

>}<br>

><br>

>What happens if we have the strings "Foo" and "foo"? Do we have a guarantee<br>

>that that never happens?<br>

><br>

>Cheers,<br>

>Rafael<br>

<br>
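To make the question concrete, here is a minimal sketch of the quoted hash rewritten over std::string so that it builds without LLVM headers. The names hashLower and checkCaseCollision are hypothetical and only for illustration; the arithmetic is copied from the quoted function.<br>

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>

// Same arithmetic as the quoted hash(), but taking std::string (an
// assumption made so the sketch does not depend on llvm::StringRef).
static uint32_t hashLower(const std::string &Str) {
  uint32_t R = 0;
  for (uint8_t C : Str)
    R = R * 67 + std::tolower(C) - 113;
  return R;
}

// Strings that differ only in case are lowered before mixing, so they
// always produce the same hash value.
static void checkCaseCollision() {
  assert(hashLower("Foo") == hashLower("foo"));
}
```

Because every character goes through tolower() before it is mixed in, "Foo" and "foo" necessarily hash to the same value; whether that matters depends on how the container compares keys.<br>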

It is not a problem here. Note that llvm::StringTableBuilder implements both add() overloads in terms of CachedHashStringRef:<br>

  size_t add(CachedHashStringRef S);<br>

  size_t add(StringRef S) { return add(CachedHashStringRef(S)); }<br>

<br>

So I believe hash collisions can actually happen in both implementations.<br>

But that is not harmful, because the internal implementation uses<br>

DenseMap<CachedHashStringRef, size_t> StringIndexMap to store strings.<br>

<br>

And two CachedHashStringRef objects are equal only when both their hashes and their values are equal:<br>

  static bool isEqual(const CachedHashStringRef &LHS,<br>

                      const CachedHashStringRef &RHS) {<br>

    return LHS.hash() == RHS.hash() &&<br>

           DenseMapInfo<StringRef>::isEqual(LHS.val(), RHS.val());<br>

  }<br>

<br>

So what I mean is that the following code would produce 2 different entries anyway:<br>

  llvm::StringTableBuilder Bar(StringTableBuilder::ELF, 1);<br>

  Bar.add({StringRef("foo"), 2});<br>

  Bar.add({StringRef("Foo"), 2});<br>

<br>
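The same behaviour can be reproduced without LLVM. The sketch below is only an approximation: std::unordered_map stands in for DenseMap, and CachedKey, CachedKeyHash, CachedKeyEq and countEntries are hypothetical names, not LLVM APIs. It assumes the equality rule shown above (hash and value must both match).<br>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for CachedHashStringRef: a string plus a
// precomputed hash supplied by the caller.
struct CachedKey {
  std::string Val;
  uint32_t Hash;
};

// Use the cached hash as-is, so keys with equal hashes share a bucket.
struct CachedKeyHash {
  std::size_t operator()(const CachedKey &K) const { return K.Hash; }
};

// Mirrors the quoted isEqual(): both the hash and the value must match.
struct CachedKeyEq {
  bool operator()(const CachedKey &L, const CachedKey &R) const {
    return L.Hash == R.Hash && L.Val == R.Val;
  }
};

// Inserts "foo" and "Foo" with the same hash and returns the entry count.
static std::size_t countEntries() {
  std::unordered_map<CachedKey, std::size_t, CachedKeyHash, CachedKeyEq> Map;
  Map.emplace(CachedKey{"foo", 2}, 0);
  Map.emplace(CachedKey{"Foo", 2}, 1);
  Map.emplace(CachedKey{"foo", 2}, 2); // same hash and value: not inserted
  assert(Map.count(CachedKey{"Foo", 2}) == 1);
  return Map.size();
}
```

The colliding hash only makes the two keys share a bucket; the value comparison still keeps them as separate entries, which is why the collision costs lookup time rather than correctness.<br>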

George.</blockquote></div>