<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Oct 14, 2016 at 11:23 AM, Rafael Espíndola via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I have put some effort to try to speed up StringTableBuilder. The last<br>

thing that worked was committed as r284249.<br>

<br>

The main difficulty in optimizing it is the large number of strings it<br>

has to handle. In the case of xul, one of the string tables has<br>

14_375_801 strings added, out of which only 1_726_762 are unique.<br></blockquote><div><br>Where is time being spent?<br><br>It sounds like in this case, most of the hash lookups are hits?</div><div><br></div><div>It's possible to make it so there are no hash calculations that fail, but if we aren't spending time in either:<br><br></div><div>hash misses</div><div>or</div><div>hash calculation</div><div><br></div><div>this is going to be hard to beat.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

The things I tried:<br>

<br>

* Instead of returning size_t from add, pass in a size_t* to add. This<br>

allows us to remember all StringRefs and only do the merging in<br>

finalize. This would be extra helpful for the -O2 case by not needing<br>

an extra hash lookup. The idea for -O1 was to avoid hash resizing and<br>

improve cache hit by calling StringIndexMap.reserve. Unfortunately<br>

given how many strings are duplicated this is not profitable.<br>

<br>

* Using a DenseSet with just an unsigned in it and a side std::vector<br>

with the rest of the information. The idea is that the vector doesn't<br>

contain empty keys, so it should be denser. This reduced the cache<br>

misses accessing the set, but the extra vector compensated for it.<br>

<br>

* Creating the string buffer incrementally in add. The idea is that<br>

then we don't need to store the pointer to the string. We can find out<br>

what the string is with just the offset. This was probably the most<br>

promising. It reduced the total number of cache misses reported by<br>

perf stat, but the overall time didn't improve significantly. This<br>

also makes -O2 substantially harder to implement. I have attached the<br>

patch that implements this (note that -O2 is not implemented).<br>

<br>

* Not merging constants to see if special casing them would make a<br>

difference. No test speeds ups by even 1%.<br>

<br>

In summary, it seems the string merging at -O1 is as fast as it gets<br>

short of someone knowing a crazy algorithm. At -O2 it should be<br>

possible to avoid the second hash lookup by passing a size_t* to add,<br>

but it is not clear if that would be worth the code complexity.<br>

<br>

Cheers,<br>

Rafael<br>

<br>______________________________<wbr>_________________<br>

llvm-commits mailing list<br>

<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a><br>

<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-commits</a><br>

<br></blockquote></div><br></div></div>