[llvm] [DebugNames] Compare TableEntry names more efficiently (PR #79759)

Felipe de Azevedo Piovezan via llvm-commits llvm-commits at lists.llvm.org
Sun Jan 28 09:03:32 PST 2024


https://github.com/felipepiovezan updated https://github.com/llvm/llvm-project/pull/79759

>From 996ccb6f387796b3bb29606d8f09c7721c059711 Mon Sep 17 00:00:00 2001
From: Felipe de Azevedo Piovezan <fpiovezan at apple.com>
Date: Tue, 23 Jan 2024 16:26:34 -0800
Subject: [PATCH] [DebugNames] Compare TableEntry names more efficiently

TableEntry names are pointers into the string table section, and accessing their
length requires a search for `\0`. However, 99% of the time we only need to
compare the name against some other other, and such a comparison will fail as
early as the first character.

This commit adds a method to the interface of TableEntry so that such a
comparison can be done without extracting the full name. It saves 10% in the
time (1250ms -> 1100 ms) to evaluate the following expression.

```
lldb \
  --batch \
  -o "b CodeGenFunction::GenerateCode" \
  -o run \
  -o "expr Fn" \
  -- \
  clang++ -c -g test.cpp -o /dev/null &> output
```

Why not use strcmp? This function requires both operands to be null terminated.
This is true for the strp entry -- and LLDB seems to always assume this -- but
could lead to buffer overruns in corrupt data. This is also true for the
"Target" string in the two uses of this function introduced here, but may not
necessarily be true in the future; we would need to change the API from
`StringRef` to `std::string` to emphasize this point.

Why not use strncmp? We can't use the "N" argument of strnmp to be
hwstd::min(<debug_str_starting_at_name>.size(), Target.size())`, as this would
return "equal" when Target is a prefix of Name. To work around this, we would
need to require `Target` to be null-terminated.
---
 .../DebugInfo/DWARF/DWARFAcceleratorTable.h   | 21 +++++++++++++++++++
 .../DebugInfo/DWARF/DWARFAcceleratorTable.cpp |  4 ++--
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
index 44a19c7b13f9a7b..ffb013e59cbe509 100644
--- a/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
@@ -543,6 +543,27 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
       return StrData.getCStr(&Off);
     }
 
+    // Compares the name of this entry against Target, returning true if they
+    // are equal. This is helpful is hot code paths that do not need the length
+    // of the name.
+    bool compareNameAgainst(StringRef Target) const {
+      // Note: this is not the name, but the rest of debug_names starting from
+      // name. This handles corrupt data (non-null terminated) without
+      // overruning the buffer.
+      auto Data = StrData.getData().substr(StringOffset);
+      auto DataSize = Data.size();
+      auto TargetSize = Target.size();
+
+      // Invariant: at the start of the loop, we have non-null characters to
+      // read from Data.
+      size_t Idx = 0;
+      for (; Idx < DataSize && Data[Idx]; Idx++) {
+        if (Idx >= TargetSize || Data[Idx] != Target[Idx])
+          return false;
+      }
+      return Idx == TargetSize;
+    }
+
     /// Returns the offset of the first Entry in the list.
     uint64_t getEntryOffset() const { return EntryOffset; }
   };
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp b/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
index 03ad5d133caddf4..30ac0c0d1507aba 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
@@ -921,7 +921,7 @@ DWARFDebugNames::ValueIterator::findEntryOffsetInCurrentIndex() {
   if (Hdr.BucketCount == 0) {
     // No Hash Table, We need to search through all names in the Name Index.
     for (const NameTableEntry &NTE : *CurrentIndex) {
-      if (NTE.getString() == Key)
+      if (NTE.compareNameAgainst(Key))
         return NTE.getEntryOffset();
     }
     return std::nullopt;
@@ -942,7 +942,7 @@ DWARFDebugNames::ValueIterator::findEntryOffsetInCurrentIndex() {
       return std::nullopt; // End of bucket
 
     NameTableEntry NTE = CurrentIndex->getNameTableEntry(Index);
-    if (NTE.getString() == Key)
+    if (NTE.compareNameAgainst(Key))
       return NTE.getEntryOffset();
   }
   return std::nullopt;



More information about the llvm-commits mailing list