[llvm-branch-commits] [llvm] e96c444 - [SymbolSize] Improve the performance of SymbolSize computation

Tobias Hieta via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Mon Aug 7 00:06:51 PDT 2023


Author: Steven Wu
Date: 2023-08-07T09:05:10+02:00
New Revision: e96c444fd725661e6273d1708dfd10f2b6c3de6b

URL: https://github.com/llvm/llvm-project/commit/e96c444fd725661e6273d1708dfd10f2b6c3de6b
DIFF: https://github.com/llvm/llvm-project/commit/e96c444fd725661e6273d1708dfd10f2b6c3de6b.diff

LOG: [SymbolSize] Improve the performance of SymbolSize computation

The current algorithm to compute the symbol size is quadratic when many
symbols share the same address, because each symbol rescans forward past
every other symbol at that address. This happens in a debug build when
lots of debug symbols get emitted in the symtab.

This patch improves the performance of tools such as `llvm-symbolizer`
that rely on the symbol size computation. Symbolizing a release+assert
clang with DebugInfo improves significantly, from 3:40 min to less than
1 s.
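
As a rough illustration of the idea, here is a minimal standalone sketch,
not the LLVM code itself. It assumes a plain sorted vector of addresses whose
last entry acts as an end sentinel; the names `gapSizes` and `Sorted` are
hypothetical. The cached `NextI` is only recomputed once `I` catches up with
it, so a run of symbols at one address is scanned once rather than once per
symbol.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of LLVM): Sorted holds addresses in
// ascending order, and its last element is treated as an end sentinel.
// Returns one size per non-sentinel entry.
std::vector<uint64_t> gapSizes(const std::vector<uint64_t> &Sorted) {
  std::vector<uint64_t> Sizes;
  if (Sorted.size() < 2)
    return Sizes;

  const std::size_t N = Sorted.size() - 1; // index of the sentinel
  Sizes.reserve(N);

  std::size_t NextI = 0; // cached index of the next distinct address
  for (std::size_t I = 0; I < N; ++I) {
    // Rescan only when the cached index has fallen behind I.
    if (NextI <= I) {
      NextI = I + 1;
      while (NextI < N && Sorted[NextI] == Sorted[I])
        ++NextI;
    }
    // Every entry in a run of equal addresses gets the same size.
    Sizes.push_back(Sorted[NextI] - Sorted[I]);
  }
  return Sizes;
}
```

For the input {0x10, 0x10, 0x10, 0x20, 0x28}, the three entries at 0x10 each
get size 0x10 and the entry at 0x20 gets size 0x8.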

Reviewed By: pete, mehdi_amini, arsenm, MaskRay

Differential Revision: https://reviews.llvm.org/D156603

(cherry picked from commit f5974e80653db977913bceffca7e900e818ef872)

Added: 
    

Modified: 
    llvm/lib/Object/SymbolSize.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Object/SymbolSize.cpp b/llvm/lib/Object/SymbolSize.cpp
index e42dbe6f47ab29..eee5505b8c1414 100644
--- a/llvm/lib/Object/SymbolSize.cpp
+++ b/llvm/lib/Object/SymbolSize.cpp
@@ -84,16 +84,21 @@ llvm::object::computeSymbolSizes(const ObjectFile &O) {
 
   array_pod_sort(Addresses.begin(), Addresses.end(), compareAddress);
 
-  // Compute the size as the gap to the next symbol
-  for (unsigned I = 0, N = Addresses.size() - 1; I < N; ++I) {
+  // Compute the size as the gap to the next symbol. If multiple symbols have
+  // the same address, give both the same size. Because Addresses is sorted,
+  // using two pointers to keep track of the current symbol vs. the next symbol
+  // that doesn't have the same address for size computation.
+  for (unsigned I = 0, NextI = 0, N = Addresses.size() - 1; I < N; ++I) {
     auto &P = Addresses[I];
     if (P.I == O.symbol_end())
       continue;
 
-    // If multiple symbol have the same address, give both the same size.
-    unsigned NextI = I + 1;
-    while (NextI < N && Addresses[NextI].Address == P.Address)
-      ++NextI;
+    // If the next pointer is behind, update it to the next symbol.
+    if (NextI <= I) {
+      NextI = I + 1;
+      while (NextI < N && Addresses[NextI].Address == P.Address)
+        ++NextI;
+    }
 
     uint64_t Size = Addresses[NextI].Address - P.Address;
     P.Address = Size;


        

