[llvm] r251337 - Optimize StringTableBuilder.

Sean Silva via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 26 15:24:15 PDT 2015


On Mon, Oct 26, 2015 at 2:55 PM, Rui Ueyama <ruiu at google.com> wrote:

> On Mon, Oct 26, 2015 at 2:47 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>>
>>
>> On Mon, Oct 26, 2015 at 12:58 PM, Rui Ueyama via llvm-commits <
>> llvm-commits at lists.llvm.org> wrote:
>>
>>> Author: ruiu
>>> Date: Mon Oct 26 14:58:29 2015
>>> New Revision: 251337
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=251337&view=rev
>>> Log:
>>> Optimize StringTableBuilder.
>>>
>>> This is a patch to improve StringTableBuilder's performance. That class's
>>> finalize function is very hot, particularly in LLD, because it tail-merges
>>> strings in string tables or SHF_MERGE sections.
>>>
>>> A generic std::sort-style sorter is not efficient for sorting strings.
>>> The function implemented in this patch seems to be more efficient.
>>>
>>
>> array_pod_sort uses a qsort-style comparator. Where is the performance
>> difference you are measuring coming from?
>>
>
> The new algorithm does not compare common prefixes once we know that they
> are common. https://en.wikipedia.org/wiki/Multi-key_quicksort
>

Ah, neat.
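
For readers unfamiliar with the technique, here is a minimal standalone
sketch of multi-key quicksort applied to string suffixes. It uses
std::string rather than StringRef, and the names charFromEnd, multikeySort
and sortBySuffix are made up for illustration. It is not the code from the
patch, only the same idea under those assumptions.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Character at distance Pos from the end of S, or -1 once Pos runs past
// the beginning of the string.
static int charFromEnd(const std::string &S, std::size_t Pos) {
  if (Pos >= S.size())
    return -1;
  return static_cast<unsigned char>(S[S.size() - Pos - 1]);
}

// Sorts V[Lo, Hi) in descending suffix order. Every string in the range is
// assumed to agree on its last Pos characters already.
static void multikeySort(std::vector<std::string> &V, std::size_t Lo,
                         std::size_t Hi, std::size_t Pos) {
  if (Hi - Lo <= 1)
    return;
  int Pivot = charFromEnd(V[Lo], Pos);
  std::size_t Gt = Lo; // V[Lo, Gt) hold a character greater than Pivot.
  std::size_t Lt = Hi; // V[Lt, Hi) hold a character less than Pivot.
  for (std::size_t I = Lo + 1; I < Lt;) {
    int C = charFromEnd(V[I], Pos);
    if (C > Pivot)
      std::swap(V[Gt++], V[I++]);
    else if (C < Pivot)
      std::swap(V[--Lt], V[I]);
    else
      ++I;
  }
  multikeySort(V, Lo, Gt, Pos);
  multikeySort(V, Lt, Hi, Pos);
  // Strings equal to the pivot at Pos only need to be compared from Pos + 1.
  if (Pivot != -1)
    multikeySort(V, Gt, Lt, Pos + 1);
}

// Hypothetical convenience wrapper around multikeySort.
std::vector<std::string> sortBySuffix(std::vector<std::string> V) {
  multikeySort(V, 0, V.size(), 0);
  return V;
}
```

Each level partitions on a single character position, so a suffix already
known to be shared is never rescanned, unlike a comparator that restarts
from the last character on every call.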


>
>
>>
>> Also, did you measure using std::sort? std::sort is generally faster
>> due to inlining.
>>
>
> I didn't.
>

Probably worth a shot. Writing a custom sorting algorithm is a pretty
drastic measure to get a performance boost.
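
A baseline for such a measurement could look roughly like the sketch below.
It is standalone and works on std::string instead of the StringPair pointers
used in StringTableBuilder, and sortBySuffixStd is a hypothetical name. The
comparator yields descending suffix order, which is my reading of what
compareBySuffix in the diff produces.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sorts in descending suffix order by comparing the strings back to front.
std::vector<std::string> sortBySuffixStd(std::vector<std::string> V) {
  std::sort(V.begin(), V.end(),
            [](const std::string &A, const std::string &B) {
              // A goes first when B, read backwards, is lexicographically
              // less than A read backwards.
              return std::lexicographical_compare(
                  B.rbegin(), B.rend(), A.rbegin(), A.rend(),
                  [](char X, char Y) {
                    return static_cast<unsigned char>(X) <
                           static_cast<unsigned char>(Y);
                  });
            });
  return V;
}
```

Since the lambda is visible at the call site, the compiler can inline it,
which is the advantage over a comparator passed as a function pointer.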

-- Sean Silva


>
>
>>
>> E.g.
>>
>> std::sort(&Strings[0], &Strings[0] + Strings.size(),
>>           [](StringPair *LHS, StringPair *RHS) {
>>   // Compare back to front. Descending order, like compareBySuffix.
>>   typedef std::reverse_iterator<const char *> RevIt;
>>   auto rbegin = [](StringPair *SP) { return RevIt(SP->first.end()); };
>>   auto rend = [](StringPair *SP) { return RevIt(SP->first.begin()); };
>>   return std::lexicographical_compare(rbegin(RHS), rend(RHS),
>>                                       rbegin(LHS), rend(LHS));
>> });
>>
>> -- Sean Silva
>>
>>
>>
>>>
>>> Here's a benchmark of LLD to link Clang with or without this patch.
>>> The numbers are medians of 50 runs.
>>>
>>> -O0
>>> real 0m0.455s (without this patch)
>>> real 0m0.430s (with this patch, 5.5% faster)
>>>
>>> -O3
>>> real 0m0.487s (without this patch)
>>> real 0m0.452s (with this patch, 7.2% faster)
>>>
>>> Since that is a benchmark of the whole linker, the speedup of
>>> StringTableBuilder itself is much more than that.
>>>
>>> http://reviews.llvm.org/D14053
>>>
>>>
>>> Modified:
>>>     llvm/trunk/lib/MC/StringTableBuilder.cpp
>>>
>>> Modified: llvm/trunk/lib/MC/StringTableBuilder.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/StringTableBuilder.cpp?rev=251337&r1=251336&r2=251337&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/MC/StringTableBuilder.cpp (original)
>>> +++ llvm/trunk/lib/MC/StringTableBuilder.cpp Mon Oct 26 14:58:29 2015
>>> @@ -18,20 +18,47 @@ using namespace llvm;
>>>
>>>  StringTableBuilder::StringTableBuilder(Kind K) : K(K) {}
>>>
>>> -static int compareBySuffix(std::pair<StringRef, size_t> *const *AP,
>>> -                           std::pair<StringRef, size_t> *const *BP) {
>>> -  StringRef A = (*AP)->first;
>>> -  StringRef B = (*BP)->first;
>>> -  size_t SizeA = A.size();
>>> -  size_t SizeB = B.size();
>>> -  size_t Len = std::min(SizeA, SizeB);
>>> -  for (size_t I = 0; I < Len; ++I) {
>>> -    char CA = A[SizeA - I - 1];
>>> -    char CB = B[SizeB - I - 1];
>>> -    if (CA != CB)
>>> -      return CB - CA;
>>> +typedef std::pair<StringRef, size_t> StringPair;
>>> +
>>> +// Returns the character at Pos from end of a string.
>>> +static int charTailAt(StringPair *P, size_t Pos) {
>>> +  StringRef S = P->first;
>>> +  if (Pos >= S.size())
>>> +    return -1;
>>> +  return (unsigned char)S[S.size() - Pos - 1];
>>> +}
>>> +
>>> +// Three-way radix quicksort. This is much faster than std::sort with strcmp
>>> +// because it does not compare characters that we already know the same.
>>> +static void qsort(StringPair **Begin, StringPair **End, int Pos) {
>>> +tailcall:
>>> +  if (End - Begin <= 1)
>>> +    return;
>>> +
>>> +  // Partition items. Items in [Begin, P) are greater than the pivot,
>>> +  // [P, Q) are the same as the pivot, and [Q, End) are less than the pivot.
>>> +  int Pivot = charTailAt(*Begin, Pos);
>>> +  StringPair **P = Begin;
>>> +  StringPair **Q = End;
>>> +  for (StringPair **R = Begin + 1; R < Q;) {
>>> +    int C = charTailAt(*R, Pos);
>>> +    if (C > Pivot)
>>> +      std::swap(*P++, *R++);
>>> +    else if (C < Pivot)
>>> +      std::swap(*--Q, *R);
>>> +    else
>>> +      R++;
>>> +  }
>>> +
>>> +  qsort(Begin, P, Pos);
>>> +  qsort(Q, End, Pos);
>>> +  if (Pivot != -1) {
>>> +    // qsort(P, Q, Pos + 1), but with tail call optimization.
>>> +    Begin = P;
>>> +    End = Q;
>>> +    ++Pos;
>>> +    goto tailcall;
>>>    }
>>> -  return SizeB - SizeA;
>>>  }
>>>
>>>  void StringTableBuilder::finalize() {
>>> @@ -40,7 +67,8 @@ void StringTableBuilder::finalize() {
>>>    for (std::pair<StringRef, size_t> &P : StringIndexMap)
>>>      Strings.push_back(&P);
>>>
>>> -  array_pod_sort(Strings.begin(), Strings.end(), compareBySuffix);
>>> +  if (!Strings.empty())
>>> +    qsort(&Strings[0], &Strings[0] + Strings.size(), 0);
>>>
>>>    switch (K) {
>>>    case RAW:
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>
>>
>>
>
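
For context on why the sort order matters: the commit log says finalize
tail-merges strings. Below is a minimal sketch of what tail-merging means,
assuming the input is already in descending suffix order. endsWith and
buildTable are hypothetical names, and this is not the code in
StringTableBuilder::finalize.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if S ends with Suffix.
static bool endsWith(const std::string &S, const std::string &Suffix) {
  return S.size() >= Suffix.size() &&
         S.compare(S.size() - Suffix.size(), Suffix.size(), Suffix) == 0;
}

// Sorted must be in descending suffix order, so that a string which is a
// tail of another one comes after it. Returns the NUL-separated table and
// sets Offsets[i] to the offset of Sorted[i] within it.
std::string buildTable(const std::vector<std::string> &Sorted,
                       std::vector<std::size_t> &Offsets) {
  std::string Table;
  const std::string *Previous = nullptr; // Last string actually emitted.
  Offsets.clear();
  for (const std::string &S : Sorted) {
    if (Previous && endsWith(*Previous, S)) {
      // S is a tail of the string just emitted: reuse its bytes, which
      // end right before the trailing NUL.
      Offsets.push_back(Table.size() - S.size() - 1);
      continue;
    }
    Offsets.push_back(Table.size());
    Table += S;
    Table += '\0';
    Previous = &S;
  }
  return Table;
}
```

With the input "baz", "foobar", "bar", "r", only "baz" and "foobar" are
stored. "bar" and "r" become offsets into the bytes of "foobar".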