[Lldb-commits] [lldb] [lldb][Mach-O] Bound export-trie symbol name length (PR #202947)

via lldb-commits lldb-commits at lists.llvm.org
Wed Jun 10 05:17:24 PDT 2026


llvmorg-github-actions[bot] wrote:



@llvm/pr-subscribers-lldb

Author: Yao Qi (qiyao)

<details>
<summary>Changes</summary>

`ParseTrieEntries` assembles a symbol name by appending every edge label
along a trie path into a `std::string`. A corrupt export trie can encode an
edge label whose terminator is far away in the trie data, making a single
label many megabytes long. Appending it requests an unbounded allocation,
which can crash lldb while parsing the symbol table.

Reject a trie whose assembled name exceeds a sane bound (1 MiB) as corrupt
data, the same way an unterminated edge label is already handled. Add a
unit test covering an oversized edge label.
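
As an illustration of the check described above, here is a minimal standalone sketch of a bounded append. It is not the code from the patch: `AppendEdgeLabel` and `kMaxNameLength` are hypothetical names, and the sketch uses `std::strlen` on a plain C string rather than lldb's `DataExtractor` or `llvm::StringRef`.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the 1 MiB bound described in the summary.
constexpr std::size_t kMaxNameLength = std::size_t{1} << 20;

// Appends `label` to `prefix` only if the combined length stays within
// `max_len`. On failure, returns false and leaves `prefix` untouched so the
// caller can treat the input as corrupt data.
bool AppendEdgeLabel(std::string &prefix, const char *label,
                     std::size_t max_len = kMaxNameLength) {
  if (label == nullptr)
    return false; // Missing or unterminated label.
  const std::size_t label_len = std::strlen(label);
  // Check before appending so an oversized allocation is never requested.
  // Written as a subtraction to avoid overflow in the size arithmetic.
  if (prefix.size() > max_len || label_len > max_len - prefix.size())
    return false;
  prefix.append(label, label_len);
  return true;
}
```

The point of the sketch is that the length test runs before the append, so a rejected label costs a scan for its terminator but never a large allocation.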

Assisted-by: Claude


---
Full diff: https://github.com/llvm/llvm-project/pull/202947.diff


2 Files Affected:

- (modified) lldb/source/Plugins/ObjectFile/Mach-O/MachOTrie.cpp (+9) 
- (modified) lldb/unittests/ObjectFile/MachO/MachOTrieTest.cpp (+15) 


``````````diff
diff --git a/lldb/source/Plugins/ObjectFile/Mach-O/MachOTrie.cpp b/lldb/source/Plugins/ObjectFile/Mach-O/MachOTrie.cpp
index bac3ae4b49abb..a0b42080d6725 100644
--- a/lldb/source/Plugins/ObjectFile/Mach-O/MachOTrie.cpp
+++ b/lldb/source/Plugins/ObjectFile/Mach-O/MachOTrie.cpp
@@ -20,6 +20,13 @@ using namespace lldb;
 using namespace lldb_private;
 using namespace llvm::MachO;
 
+// Upper bound on the length of a symbol name assembled from export-trie edge
+// labels. A corrupt trie can encode an edge label whose terminator is far away
+// in the trie data, so a single label is many megabytes long; appending it to
+// the running name would otherwise request an unbounded allocation. No
+// legitimate symbol name comes close to this size.
+static constexpr size_t kMaxTrieSymbolNameLength = 1 << 20; // 1 MiB
+
 void TrieEntry::Dump() const {
   printf("0x%16.16llx 0x%16.16llx 0x%16.16llx \"%s\"",
          static_cast<unsigned long long>(address),
@@ -118,6 +125,8 @@ bool ParseTrieEntriesImpl(DataExtractor &data, lldb::offset_t offset,
     const char *cstr = data.GetCStr(&children_offset);
     if (!cstr)
       return false; // Corrupt data
+    if (prefix.size() + llvm::StringRef(cstr).size() > kMaxTrieSymbolNameLength)
+      return false; // Corrupt data: implausibly long symbol name.
     const size_t prevSize = prefix.size();
     prefix.append(cstr);
     lldb::offset_t childNodeOffset = data.GetULEB128(&children_offset);
diff --git a/lldb/unittests/ObjectFile/MachO/MachOTrieTest.cpp b/lldb/unittests/ObjectFile/MachO/MachOTrieTest.cpp
index 31f6dee913a76..767f515d1e505 100644
--- a/lldb/unittests/ObjectFile/MachO/MachOTrieTest.cpp
+++ b/lldb/unittests/ObjectFile/MachO/MachOTrieTest.cpp
@@ -335,3 +335,18 @@ TEST(MachOTrieTest, MalformedBackEdgeCycle) {
   ParseResult result = Parse(t);
   EXPECT_FALSE(result.ok);
 }
+
+TEST(MachOTrieTest, OversizedEdgeLabelIsRejected) {
+  // A corrupt export trie can encode an edge label far longer than any real
+  // symbol name. ParseTrieEntries appends every edge label onto the running
+  // symbol name without a length bound, so such a label drives an unbounded
+  // allocation. The parser must treat an implausibly long name as corrupt data
+  // and bail instead of accepting it.
+  constexpr size_t kOversizedLabelLen = 8 * 1024 * 1024;
+  TrieBuilder b;
+  b.AddExport(b.Root(), std::string(kOversizedLabelLen, 'A'), 0x1000);
+
+  ParseResult result = Parse(b.Build());
+  EXPECT_FALSE(result.ok);
+  EXPECT_TRUE(result.ext_symbols.empty());
+}

``````````

</details>


https://github.com/llvm/llvm-project/pull/202947

