[llvm] [llvm-symbolizer] Have --obj participate in markup resolution (PR #183444)

Joshua Seaton via llvm-commits llvm-commits at lists.llvm.org
Sun Mar 8 18:25:45 PDT 2026


https://github.com/joshuaseaton updated https://github.com/llvm/llvm-project/pull/183444

>From 4dbb209133aa7c422a6cfdfa7fbff97bf42d3cbd Mon Sep 17 00:00:00 2001
From: Joshua Seaton <josh.a.seaton at gmail.com>
Date: Wed, 25 Feb 2026 21:36:05 -0800
Subject: [PATCH 1/2] [llvm-symbolizer] Have --obj participate in markup
 resolution

Sometimes in development one is looking to symbolize the backtrace from
an executable that's *right there*. It is ergonomic to be able to simply
pass it to llvm-symbolizer via --obj; installing it in a build ID
directory and passing it through indirection is less so.
In this change we update --obj and --debug-file-directory so that both
can independently contribute to the set of binaries participating in
markup resolution.
---
 llvm/lib/DebugInfo/Symbolize/Symbolize.cpp     |  6 ++++++
 .../DebugInfo/symbolize-filter-markup-obj.test | 16 ++++++++++++++++
 llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp | 18 +++++++++++-------
 3 files changed, 33 insertions(+), 7 deletions(-)
 create mode 100644 llvm/test/DebugInfo/symbolize-filter-markup-obj.test

diff --git a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
index 3821f53d26b98..8276d50fe0158 100644
--- a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
+++ b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
@@ -769,6 +769,12 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) {
   }
   ObjectPair Objects = ObjectsOrErr.get();
 
+  // Register the build ID so that build-ID-based lookups (e.g., from
+  // --filter-markup) can resolve to this path.
+  BuildIDRef BID = getBuildID(Objects.first);
+  if (!BID.empty())
+    BuildIDPaths[getBuildIDStr(BID)] = std::string(BinaryName);
+
   std::unique_ptr<DIContext> Context;
   // If this is a COFF object containing PDB info and not containing DWARF
   // section, use a PDBContext to symbolize. Otherwise, use DWARF.
diff --git a/llvm/test/DebugInfo/symbolize-filter-markup-obj.test b/llvm/test/DebugInfo/symbolize-filter-markup-obj.test
new file mode 100644
index 0000000000000..017295d400bd5
--- /dev/null
+++ b/llvm/test/DebugInfo/symbolize-filter-markup-obj.test
@@ -0,0 +1,16 @@
+## Test that --obj populates the build ID cache for --filter-markup.
+## The debug binary is reused from symbolize-build-id.test.
+
+# RUN: split-file %s %t
+# RUN: llvm-symbolizer --obj=%p/Inputs/.build-id/ab/b50d82b6bdc861.debug \
+# RUN:   --filter-markup < %t/input > %t.output 2>&1
+# RUN: FileCheck %s --input-file=%t.output --match-full-lines \
+# RUN:   --implicit-check-not {{.}}
+
+# CHECK: [[BEGIN:\[{3}]]ELF module #0x0 "a.o"; BuildID=abb50d82b6bdc861 [0x0-0x2fffff](r)[[END:\]{3}]]
+# CHECK: main[/tmp/dbginfo[[SEP:[/\\]]]dwarfdump-test.cc:16]
+
+;--- input
+{{{module:0:a.o:elf:abb50d82b6bdc861}}}
+{{{mmap:0:0x300000:load:0:r:0}}}
+{{{pc:0x20112f}}}
diff --git a/llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp b/llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
index 4784dafeb2948..dfad7f991ac64 100644
--- a/llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
+++ b/llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
@@ -463,7 +463,8 @@ static object::BuildID parseBuildIDArg(const opt::InputArgList &Args, int ID) {
 }
 
 // Symbolize markup from stdin and write the result to stdout.
-static void filterMarkup(const opt::InputArgList &Args, LLVMSymbolizer &Symbolizer) {
+static void filterMarkup(const opt::InputArgList &Args,
+                         LLVMSymbolizer &Symbolizer) {
   MarkupFilter Filter(outs(), Symbolizer, parseColorArg(Args));
   std::string InputString;
   while (std::getline(std::cin, InputString)) {
@@ -540,11 +541,6 @@ int llvm_symbolizer_main(int argc, char **argv, const llvm::ToolContext &) {
   if (Args.hasFlag(OPT_debuginfod, OPT_no_debuginfod, canUseDebuginfod()))
     enableDebuginfod(Symbolizer, Args);
 
-  if (Args.hasArg(OPT_filter_markup)) {
-    filterMarkup(Args, Symbolizer);
-    return 0;
-  }
-
   auto Style = IsAddr2Line ? OutputStyle::GNU : OutputStyle::LLVM;
   if (const opt::Arg *A = Args.getLastArg(OPT_output_style_EQ)) {
     if (strcmp(A->getValue(), "GNU") == 0)
@@ -559,7 +555,6 @@ int llvm_symbolizer_main(int argc, char **argv, const llvm::ToolContext &) {
     errs() << "error: cannot specify both --build-id and --obj\n";
     return EXIT_FAILURE;
   }
-  object::BuildID BuildID = parseBuildIDArg(Args, OPT_build_id_EQ);
 
   std::unique_ptr<DIPrinter> Printer;
   if (Style == OutputStyle::GNU)
@@ -583,6 +578,15 @@ int llvm_symbolizer_main(int argc, char **argv, const llvm::ToolContext &) {
     }
   }
 
+  // This is sequenced after object loading so that the build ID cache can
+  // participate in the resolution of modules referenced in the markup.
+  if (Args.hasArg(OPT_filter_markup)) {
+    filterMarkup(Args, Symbolizer);
+    return 0;
+  }
+
+  object::BuildID BuildID = parseBuildIDArg(Args, OPT_build_id_EQ);
+
   std::vector<std::string> InputAddresses = Args.getAllArgValues(OPT_INPUT);
   if (InputAddresses.empty()) {
     const int kMaxInputStringLength = 1024;

>From 05a73f1d48f087daa8ee25236441027328aa2fc7 Mon Sep 17 00:00:00 2001
From: Joshua Seaton <josh.a.seaton at gmail.com>
Date: Sun, 8 Mar 2026 18:17:46 -0700
Subject: [PATCH 2/2] [llvm-symbolizer] Have --obj participate in markup
 resolution

This change concerns the interplay between --filter-markup and --obj.
Today the --filter-markup path does not consult any of the --obj
specifications when it comes to build ID lookup. We update things so
that it does.

Two reasons for doing so:

* Sometimes in development one is looking to symbolize the backtrace
from an executable that's *right there*. It is ergonomic to be able to
simply pass it to llvm-symbolizer via --obj; installing it in a build
ID directory and passing it through indirection is less so.

* In the current command-line interface, it feels natural for --obj and
--debug-file-directory to both be able to independently contribute to
the set of binaries participating in markup resolution.
---
 llvm/lib/DebugInfo/Symbolize/Symbolize.cpp | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
index 8276d50fe0158..84da834cef183 100644
--- a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
+++ b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
@@ -555,8 +555,17 @@ LLVMSymbolizer::getOrCreateObjectPair(const std::string &Path,
     DbgObj = lookUpBuildIDObject(Path, ELFObj, ArchName);
   if (!DbgObj)
     DbgObj = lookUpDebuglinkObject(Path, Obj, ArchName);
-  if (!DbgObj)
+  if (!DbgObj) {
+    // No previously cached debug binary found. As a best-effort, we use the
+    // provided object instead, and also register its build ID (if any) so that
+    // build-ID-based lookups (e.g., from --filter-markup) can resolve to this
+    // path.
     DbgObj = Obj;
+
+    BuildIDRef BID = getBuildID(Obj);
+    if (!BID.empty())
+      BuildIDPaths[getBuildIDStr(BID)] = std::string(Path);
+  }
   ObjectPair Res = std::make_pair(Obj, DbgObj);
   auto Pair =
       ObjectPairForPathArch.emplace(std::make_pair(Path, ArchName), Res);
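
Editorial note, not part of the patch: the registration step added above can be
sketched in isolation as follows. This is a minimal illustration under stated
assumptions, not the LLVM implementation. BuildIDCache, registerObject, lookup
and toHex are hypothetical names, and keying the map by a lowercase hex string
is chosen here for readability only; the patch itself keys BuildIDPaths by
whatever getBuildIDStr returns.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the symbolizer's build ID cache. None of these
// names are LLVM API; they only illustrate the register-then-lookup flow.
class BuildIDCache {
public:
  // Record Path under the hex form of ID. An empty ID is ignored, mirroring
  // the `if (!BID.empty())` guard in the patch.
  void registerObject(const std::vector<uint8_t> &ID, const std::string &Path) {
    if (ID.empty())
      return;
    Paths[toHex(ID)] = Path;
  }

  // Resolve a hex build ID (as it would appear in a markup module element)
  // to a previously registered path, if there is one.
  std::optional<std::string> lookup(const std::string &HexID) const {
    auto It = Paths.find(HexID);
    if (It == Paths.end())
      return std::nullopt;
    return It->second;
  }

private:
  // Lowercase hex encoding; an assumption of this sketch, see the note above.
  static std::string toHex(const std::vector<uint8_t> &ID) {
    static const char Digits[] = "0123456789abcdef";
    std::string Out;
    for (uint8_t B : ID) {
      Out.push_back(Digits[B >> 4]);
      Out.push_back(Digits[B & 0xf]);
    }
    return Out;
  }

  std::map<std::string, std::string> Paths;
};
```

In this sketch a later registration for the same build ID overwrites the
earlier path, which matches the plain `operator[]` assignment used in the
patch.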


