[PATCH] D14053: Optimize StringTableBuilder

Rui Ueyama via llvm-commits llvm-commits at lists.llvm.org
Sun Oct 25 19:29:52 PDT 2015


ruiu created this revision.
ruiu added a reviewer: rafael.
ruiu added a subscriber: llvm-commits.

This patch improves the performance of StringTableBuilder. The class's
finalize function is very hot, particularly in LLD, because it
tail-merges strings for string tables and SHF_MERGE sections.
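
For readers unfamiliar with tail merging, here is a minimal standalone
sketch of the idea, not the LLVM implementation. The name tailMerge, the
use of std::string and std::map, and the NUL-terminated table layout are
assumptions made for illustration. Once the strings are ordered by their
reversed characters, a string that is a suffix of another can reuse the
tail of that string's bytes instead of being stored again.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: appends NUL-terminated strings to Table and returns
// each string's offset. A string that is a suffix of an already emitted
// string shares that string's bytes.
std::map<std::string, std::size_t>
tailMerge(std::vector<std::string> Strings, std::string &Table) {
  // Order by reversed characters, descending, with longer strings first.
  // A string then follows directly after some string it is a suffix of.
  std::sort(Strings.begin(), Strings.end(),
            [](const std::string &A, const std::string &B) {
              return std::lexicographical_compare(B.rbegin(), B.rend(),
                                                  A.rbegin(), A.rend());
            });

  std::map<std::string, std::size_t> Offsets;
  std::string Previous; // Last string physically written to Table.
  bool HavePrevious = false;
  for (const std::string &S : Strings) {
    bool IsSuffix =
        HavePrevious && Previous.size() >= S.size() &&
        Previous.compare(Previous.size() - S.size(), S.size(), S) == 0;
    if (IsSuffix) {
      // Previous ends just before its NUL terminator, which is the last
      // byte of Table, so S starts S.size() bytes before that terminator.
      assert(!Table.empty());
      Offsets[S] = Table.size() - 1 - S.size();
      continue;
    }
    Offsets[S] = Table.size();
    Table += S;
    Table += '\0';
    Previous = S;
    HavePrevious = true;
  }
  return Offsets;
}
```

With the inputs "bar", "foobar" and "baz", this sketch writes only "baz"
and "foobar" to the table, and "bar" points into the tail of "foobar".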

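A generic std::sort-style sorter is inefficient for sorting strings
because each comparison rescans characters already known to be equal.
The three-way string quicksort implemented in this patch avoids that
and is faster in the benchmark below.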
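
To try the sorting scheme outside of LLVM, the following is a standalone
adaptation that works on std::string instead of StringRef pairs. The
names charFromEnd, suffixSort and sortBySuffix are made up for this
sketch, and it omits the recursion-depth safeguard of the patch. It
sorts in descending order of the characters read from the end of each
string, so a string lands after the longer strings it is a suffix of.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Character Pos positions from the end of S, or -1 once Pos runs past the
// start of the string.
static int charFromEnd(const std::string &S, std::size_t Pos) {
  if (Pos >= S.size())
    return -1;
  return static_cast<unsigned char>(S[S.size() - Pos - 1]);
}

// Three-way partitioning on a single character position. All strings in
// [Begin, End) are assumed to agree on their last Pos characters.
static void suffixSort(std::string *Begin, std::string *End, std::size_t Pos) {
  assert(Begin <= End);
  if (End - Begin < 2)
    return;

  // After the loop, [Begin, P) is greater than the pivot, [P, Q) is equal
  // to it, and [Q, End) is less than it.
  int Pivot = charFromEnd(*Begin, Pos);
  std::string *P = Begin;
  std::string *Q = End;
  for (std::string *R = Begin + 1; R < Q;) {
    int C = charFromEnd(*R, Pos);
    if (C > Pivot)
      std::swap(*P++, *R++);
    else if (C < Pivot)
      std::swap(*--Q, *R);
    else
      ++R;
  }

  suffixSort(Begin, P, Pos);
  // Only the "equal" range moves on to the next character. A pivot of -1
  // means those strings are exhausted and therefore identical.
  if (Pivot != -1)
    suffixSort(P, Q, Pos + 1);
  suffixSort(Q, End, Pos);
}

std::vector<std::string> sortBySuffix(std::vector<std::string> Strings) {
  if (!Strings.empty())
    suffixSort(Strings.data(), Strings.data() + Strings.size(), 0);
  return Strings;
}
```

Each recursion level looks at one character position only, which is why
the shared tail of "foobar" and "bar" is never compared twice.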

Here's a benchmark of LLD linking Clang without and with this patch.
The numbers are medians of 50 runs.

-O0
real    0m0.455s (without patch)
real    0m0.430s (with patch, 5.5% faster)

-O3
real    0m0.487s (without patch)
real    0m0.452s (with patch, 7.2% faster)

http://reviews.llvm.org/D14053

Files:
  lib/MC/StringTableBuilder.cpp

Index: lib/MC/StringTableBuilder.cpp
===================================================================
--- lib/MC/StringTableBuilder.cpp
+++ lib/MC/StringTableBuilder.cpp
@@ -18,29 +18,54 @@
 
 StringTableBuilder::StringTableBuilder(Kind K) : K(K) {}
 
-static int compareBySuffix(std::pair<StringRef, size_t> *const *AP,
-                           std::pair<StringRef, size_t> *const *BP) {
-  StringRef A = (*AP)->first;
-  StringRef B = (*BP)->first;
-  size_t SizeA = A.size();
-  size_t SizeB = B.size();
-  size_t Len = std::min(SizeA, SizeB);
-  for (size_t I = 0; I < Len; ++I) {
-    char CA = A[SizeA - I - 1];
-    char CB = B[SizeB - I - 1];
-    if (CA != CB)
-      return CB - CA;
+typedef std::pair<StringRef, size_t> StringPair;
+
+// Returns the character Pos positions from the end, or -1 if out of range.
+static int charTailAt(StringPair *P, size_t Pos) {
+  StringRef S = P->first;
+  if (Pos >= S.size())
+    return -1;
+  return (unsigned char)S[S.size() - Pos - 1];
+}
+
+// Three-way string quicksort. This is much faster than std::sort with strcmp
+// because it does not compare characters that we already know are the same.
+static void qsort(StringPair **Begin, StringPair **End, int Pos) {
+  if (End - Begin < 2)
+    return;
+
+  // Safeguard for pathological input.
+  if (Pos > 10000)
+    return;
+
+  // Partition items. Items in [Begin, P) are greater than the pivot, [P, Q)
+  // are the same as the pivot, and [Q, End) are less than the pivot.
+  int Pivot = charTailAt(*Begin, Pos);
+  StringPair **P = Begin;
+  StringPair **Q = End;
+  for (StringPair **R = Begin + 1; R < Q;) {
+    int C = charTailAt(*R, Pos);
+    if (C > Pivot)
+      std::swap(*P++, *R++);
+    else if (C < Pivot)
+      std::swap(*--Q, *R);
+    else
+      R++;
   }
-  return SizeB - SizeA;
+  qsort(Begin, P, Pos);
+  if (Pivot != -1)
+    qsort(P, Q, Pos + 1);
+  qsort(Q, End, Pos);
 }
 
 void StringTableBuilder::finalize() {
   std::vector<std::pair<StringRef, size_t> *> Strings;
   Strings.reserve(StringIndexMap.size());
   for (std::pair<StringRef, size_t> &P : StringIndexMap)
     Strings.push_back(&P);
 
-  array_pod_sort(Strings.begin(), Strings.end(), compareBySuffix);
+  if (!Strings.empty())
+    qsort(&Strings[0], &Strings[0] + Strings.size(), 0);
 
   switch (K) {
   case RAW:



