[Lldb-commits] [lldb] [DRAFT] [lldb][Expression] Encode Module and DIE UIDs into function AsmLabels (PR #148877)

Pavel Labath via lldb-commits lldb-commits at lists.llvm.org
Wed Jul 16 00:11:09 PDT 2025


================
@@ -249,6 +250,41 @@ static unsigned GetCXXMethodCVQuals(const DWARFDIE &subprogram,
   return cv_quals;
 }
 
+// TODO:
+// 0. Adjust FindInSymbols
+// 1. log failure paths
+// 2. What happens for functions without a linkage name? Previously we didn't
+// attach a label for those but now we would
+// 3. Unit-test
+// 4. API test (which checks expr and AST dump)
+static std::optional<std::string> MakeLLDBFuncAsmLabel(const DWARFDIE &die) {
+  std::optional<std::string> label;
+  char const *mangled = die.GetMangledName(/*substitute_name_allowed=*/false);
+  if (mangled)
+    label.emplace(mangled);
+
+  auto module_sp = die.GetModule();
+  if (!module_sp)
+    return label;
+
+  // Module UID is only a Darwin concept (?)
+  // If UUID is not available encode as pointer.
+  // Maybe add character to signal whether this is a pointer
+  // or UUID. Or maybe if it's not hex that implies a UUID?
+  auto module_id = module_sp->GetUUID();
+  Module * module_ptr = nullptr;
+  if (!module_id.IsValid())
+    module_ptr = module_sp.get();
----------------
labath wrote:

> (e.g., on Linux IIUC?)

It's more common there, but even on Darwin you can create binaries without a UUID (`ld --no_uuid`). And even if there is one, there's no guarantee that it's going to be truly unique.

> was your original suggestion to simply print out the pointer? If so, how would we defend against the module getting unloaded under our feet? I might've misunderstood your idea though.

Yes, that was the idea, though I can't say I have given much thought to this aspect. Intuitively, I would expect this to be fine. Since the module pointer is only encoded in the AST inside that module, the fact that we got that pointer here means that the module was present at the beginning of the compilation process. The pointer is consumed after clang is done (which happens midway into compilation), and I'd hope that no one can unload a module between those two points.

That said, I admit this is risky, and I'm not pushing for this particular encoding. In particular, it may enable someone to crash lldb by creating a function with a funny name.

There are plenty of other ways to implement this detail though, the most obvious one being assigning a surrogate ID (an increasing integer) to each module upon creation (just like we do with debuggers).
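
To make the surrogate-ID idea concrete, here is a minimal sketch. It assumes a standalone registry: `FakeModule` and `ModuleIdRegistry` are hypothetical names made up for illustration and are not existing LLDB classes. Each registered module gets the next integer, and a lookup goes through a `weak_ptr`, so an unknown or stale ID resolves to null instead of dereferencing whatever was encoded in the label.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for lldb_private::Module; not the real class.
struct FakeModule {
  std::string name;
};

// Hypothetical registry: hands out increasing integer IDs and maps them back
// to modules without trusting the ID to be a valid address.
class ModuleIdRegistry {
public:
  // Assigns the next ID to the module. IDs start at 1 and are never reused.
  uint64_t Register(const std::shared_ptr<FakeModule> &module_sp) {
    assert(module_sp && "cannot register a null module");
    std::lock_guard<std::mutex> guard(m_mutex);
    uint64_t id = m_next_id++;
    m_modules.emplace(id, module_sp);
    return id;
  }

  // Returns the module for `id`, or nullptr if the ID is unknown or the
  // module has been destroyed since it was registered.
  std::shared_ptr<FakeModule> Lookup(uint64_t id) {
    std::lock_guard<std::mutex> guard(m_mutex);
    auto it = m_modules.find(id);
    if (it == m_modules.end())
      return nullptr;
    std::shared_ptr<FakeModule> module_sp = it->second.lock();
    if (!module_sp)
      m_modules.erase(it); // drop the stale entry
    return module_sp;
  }

private:
  std::mutex m_mutex;
  uint64_t m_next_id = 1;
  // weak_ptr so that the registry does not keep unloaded modules alive.
  std::unordered_map<uint64_t, std::weak_ptr<FakeModule>> m_modules;
};
```

With something along these lines, a label containing a made-up ID would simply fail the lookup, and a module that went away mid-compilation would show up as a null result rather than a dangling pointer.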

https://github.com/llvm/llvm-project/pull/148877

