[clang] [llvm] [DLCov 2/5] Implement DebugLoc coverage tracking (PR #107279)
Stephen Tozer via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 17 09:17:03 PDT 2025
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/107279
>From 55115af37710a80abbc2ec2ca5d96d1eea5827b4 Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Wed, 4 Sep 2024 12:23:52 +0100
Subject: [PATCH 1/4] Add conditionally-enabled DebugLocKinds
---
clang/lib/CodeGen/BackendUtil.cpp | 16 ++++
llvm/docs/HowToUpdateDebugInfo.rst | 36 +++++++++
llvm/include/llvm/IR/DebugLoc.h | 88 +++++++++++++++++++++-
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp | 5 ++
llvm/lib/IR/DebugInfo.cpp | 8 +-
llvm/lib/IR/DebugLoc.cpp | 6 ++
llvm/lib/Transforms/Utils/Debugify.cpp | 19 +++--
7 files changed, 167 insertions(+), 11 deletions(-)
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 7557cb8408921..20504b0e0868e 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -961,6 +961,22 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
Debugify.setOrigDIVerifyBugsReportFilePath(
CodeGenOpts.DIBugsReportFilePath);
Debugify.registerCallbacks(PIC, MAM);
+
+#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
+ // If we're using debug location coverage tracking, mark all the
+ // instructions coming out of the frontend without a DebugLoc as being
+ // compiler-generated, to prevent both those instructions and new
+ // instructions that inherit their location from being treated as
+ // incorrectly empty locations.
+ for (Function &F : *TheModule) {
+ if (!F.getSubprogram())
+ continue;
+ for (BasicBlock &BB : F)
+ for (Instruction &I : BB)
+ if (!I.getDebugLoc())
+ I.setDebugLoc(DebugLoc::getCompilerGenerated());
+ }
+#endif
}
// Attempt to load pass plugins and register their callbacks with PB.
for (auto &PluginFN : CodeGenOpts.PassPlugins) {
diff --git a/llvm/docs/HowToUpdateDebugInfo.rst b/llvm/docs/HowToUpdateDebugInfo.rst
index 3088f59c1066a..0e17813ea6534 100644
--- a/llvm/docs/HowToUpdateDebugInfo.rst
+++ b/llvm/docs/HowToUpdateDebugInfo.rst
@@ -169,6 +169,42 @@ See the discussion in the section about
:ref:`merging locations<WhenToMergeLocation>` for examples of when the rule for
dropping locations applies.
+.. _NewInstLocations:
+
+Setting locations for new instructions
+------------------------------------
+
+Whenever a new instruction is created and there is no suitable location for that
+instruction, that location should be annotated accordingly. There are a set of
+special ``DebugLoc`` values that can be used to indicate the reason that a new
+instruction does not have a valid location. These are as follows:
+
+- ``DebugLoc::getCompilerGenerated()``: This indicates that the instruction is a
+compiler-generated instruction, i.e. it is not associated with any user source
+code.
+- ``DebugLoc::getDropped()``: This indicates that the instruction has
+intentionally had its source location removed, according to the rules for
+dropping locations; this is set automatically by
+``Instruction::dropLocation()``.
+- ``DebugLoc::getUnknown()``: This indicates that the instruction does not have
+a known or currently knowable source location, e.g. the attribution is ambiguous
+in a way that can't currently be represented in LLVM, or that it is otherwise
+infeasible to determine or track the correct source location.
+- ``DebugLoc::getTemporary()``: This is used for instructions that we don't
+expect to be emitted (e.g. ``UnreachableInst``), and so should not need a valid
+location; if we ever try to emit a temporary location into an object file, this
+indicates that something has gone wrong.
+
+Where applicable, these should be used instead of leaving an instruction without
+an assigned location or explicitly setting the location as ``DebugLoc()``.
+Ordinarily these special locations are ignored by the compiler, but some testing
+builds will track their use in order to detect missing locations; for this
+reason, the most important rule is to *not* apply any of these if it isn't clear
+which, if any, is appropriate - an absent location can be detected and fixed,
+while an incorrectly annotated instruction is much harder to detect. On the
+other hand, if any of these clearly apply, then they should be used to prevent
+false positives from being flagged up.
+
Rules for updating debug values
===============================
diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h
index c22d3e9b10d27..fb205aa93a443 100644
--- a/llvm/include/llvm/IR/DebugLoc.h
+++ b/llvm/include/llvm/IR/DebugLoc.h
@@ -14,6 +14,7 @@
#ifndef LLVM_IR_DEBUGLOC_H
#define LLVM_IR_DEBUGLOC_H
+#include "llvm/Config/config.h"
#include "llvm/IR/TrackingMDRef.h"
#include "llvm/Support/DataTypes.h"
@@ -22,6 +23,73 @@ namespace llvm {
class LLVMContext;
class raw_ostream;
class DILocation;
+ class Function;
+
+#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
+ // Used to represent different "kinds" of DebugLoc, expressing that the
+ // instruction it is part of is either normal and should contain a valid
+ // DILocation, or otherwise describing the reason why the instruction does
+ // not contain a valid DILocation.
+ enum class DebugLocKind : uint8_t {
+ // The instruction is expected to contain a valid DILocation.
+ Normal,
+ // The instruction is compiler-generated, i.e. it is not associated with any
+ // line in the original source.
+ CompilerGenerated,
+ // The instruction has intentionally had its source location removed,
+ // typically because it was moved outside of its original control-flow and
+ // presenting the prior source location would be misleading for debuggers
+ // or profilers.
+ Dropped,
+ // The instruction does not have a known or currently knowable source
+ // location, e.g. the attribution is ambiguous in a way that can't be
+ // represented, or determining the correct location is complicated and
+ // requires future developer effort.
+ Unknown,
+ // DebugLoc is attached to an instruction that we don't expect to be
+ // emitted, and so can omit a valid DILocation; we don't expect to ever try
+ // and emit these into the line table, and trying to do so is a sign that
+ // something has gone wrong (most likely a DebugLoc leaking from a transient
+ // compiler-generated instruction).
+ Temporary
+ };
+
+ // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify
+ // to ignore intentionally-empty DebugLocs.
+ class DILocAndCoverageTracking : public TrackingMDNodeRef {
+ public:
+ DebugLocKind Kind;
+ // Default constructor for empty DebugLocs.
+ DILocAndCoverageTracking()
+ : TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {}
+ // Valid or nullptr MDNode*, normal DebugLocKind.
+ DILocAndCoverageTracking(const MDNode *Loc)
+ : TrackingMDNodeRef(const_cast<MDNode *>(Loc)),
+ Kind(DebugLocKind::Normal) {}
+ DILocAndCoverageTracking(const DILocation *Loc);
+ // Explicit DebugLocKind, which always means a nullptr MDNode*.
+ DILocAndCoverageTracking(DebugLocKind Kind)
+ : TrackingMDNodeRef(nullptr), Kind(Kind) {}
+ };
+ template <> struct simplify_type<DILocAndCoverageTracking> {
+ using SimpleType = MDNode *;
+
+ static MDNode *getSimplifiedValue(DILocAndCoverageTracking &MD) {
+ return MD.get();
+ }
+ };
+ template <> struct simplify_type<const DILocAndCoverageTracking> {
+ using SimpleType = MDNode *;
+
+ static MDNode *getSimplifiedValue(const DILocAndCoverageTracking &MD) {
+ return MD.get();
+ }
+ };
+
+ using DebugLocTrackingRef = DILocAndCoverageTracking;
+#else
+ using DebugLocTrackingRef = TrackingMDNodeRef;
+#endif // ENABLE_DEBUGLOC_COVERAGE_TRACKING
/// A debug info location.
///
@@ -31,7 +99,8 @@ namespace llvm {
/// To avoid extra includes, \a DebugLoc doubles the \a DILocation API with a
/// one based on relatively opaque \a MDNode pointers.
class DebugLoc {
- TrackingMDNodeRef Loc;
+
+ DebugLocTrackingRef Loc;
public:
DebugLoc() = default;
@@ -47,6 +116,23 @@ namespace llvm {
/// IR.
explicit DebugLoc(const MDNode *N);
+#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
+ DebugLoc(DebugLocKind Kind) : Loc(Kind) {}
+ DebugLocKind getKind() const { return Loc.Kind; }
+#endif
+
+#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
+ static inline DebugLoc getTemporary() { return DebugLoc(DebugLocKind::Temporary); }
+ static inline DebugLoc getUnknown() { return DebugLoc(DebugLocKind::Unknown); }
+ static inline DebugLoc getCompilerGenerated() { return DebugLoc(DebugLocKind::CompilerGenerated); }
+ static inline DebugLoc getDropped() { return DebugLoc(DebugLocKind::Dropped); }
+#else
+ static inline DebugLoc getTemporary() { return DebugLoc(); }
+ static inline DebugLoc getUnknown() { return DebugLoc(); }
+ static inline DebugLoc getCompilerGenerated() { return DebugLoc(); }
+ static inline DebugLoc getDropped() { return DebugLoc(); }
+#endif // ENABLE_DEBUGLOC_COVERAGE_TRACKING
+
/// Get the underlying \a DILocation.
///
/// \pre !*this or \c isa<DILocation>(getAsMDNode()).
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index 39f1299a24e81..44aeb06780f1d 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -31,6 +31,7 @@
#include "llvm/CodeGen/TargetLowering.h"
#include "llvm/CodeGen/TargetRegisterInfo.h"
#include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include "llvm/Config/config.h"
#include "llvm/DebugInfo/DWARF/DWARFDataExtractor.h"
#include "llvm/DebugInfo/DWARF/DWARFExpression.h"
#include "llvm/IR/Constants.h"
@@ -2097,6 +2098,10 @@ void DwarfDebug::beginInstruction(const MachineInstr *MI) {
}
if (!DL) {
+ // FIXME: We could assert that `DL.getKind() != DebugLocKind::Temporary`
+ // here, or otherwise record any temporary DebugLocs seen to ensure that
+ // transient compiler-generated instructions aren't leaking their DLs to
+ // other instructions.
// We have an unspecified location, which might want to be line 0.
// If we have already emitted a line-0 record, don't repeat it.
if (LastAsmLine == 0)
diff --git a/llvm/lib/IR/DebugInfo.cpp b/llvm/lib/IR/DebugInfo.cpp
index cefd6fe14a3ac..404569275e34f 100644
--- a/llvm/lib/IR/DebugInfo.cpp
+++ b/llvm/lib/IR/DebugInfo.cpp
@@ -987,8 +987,10 @@ void Instruction::updateLocationAfterHoist() { dropLocation(); }
void Instruction::dropLocation() {
const DebugLoc &DL = getDebugLoc();
- if (!DL)
+ if (!DL) {
+ setDebugLoc(DebugLoc::getDropped());
return;
+ }
// If this isn't a call, drop the location to allow a location from a
// preceding instruction to propagate.
@@ -1000,7 +1002,7 @@ void Instruction::dropLocation() {
}
if (!MayLowerToCall) {
- setDebugLoc(DebugLoc());
+ setDebugLoc(DebugLoc::getDropped());
return;
}
@@ -1019,7 +1021,7 @@ void Instruction::dropLocation() {
//
// One alternative is to set a line 0 location with the existing scope and
// inlinedAt info. The location might be sensitive to when inlining occurs.
- setDebugLoc(DebugLoc());
+ setDebugLoc(DebugLoc::getDropped());
}
//===----------------------------------------------------------------------===//
diff --git a/llvm/lib/IR/DebugLoc.cpp b/llvm/lib/IR/DebugLoc.cpp
index bdea52180f74a..b05d0f6096bda 100644
--- a/llvm/lib/IR/DebugLoc.cpp
+++ b/llvm/lib/IR/DebugLoc.cpp
@@ -11,6 +11,12 @@
#include "llvm/IR/DebugInfo.h"
using namespace llvm;
+#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
+DILocAndCoverageTracking::DILocAndCoverageTracking(const DILocation *L)
+ : TrackingMDNodeRef(const_cast<DILocation *>(L)),
+ Kind(DebugLocKind::Normal) {}
+#endif // ENABLE_DEBUGLOC_COVERAGE_TRACKING
+
//===----------------------------------------------------------------------===//
// DebugLoc Implementation
//===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Transforms/Utils/Debugify.cpp b/llvm/lib/Transforms/Utils/Debugify.cpp
index e6b5e267d192b..e5a7415431f56 100644
--- a/llvm/lib/Transforms/Utils/Debugify.cpp
+++ b/llvm/lib/Transforms/Utils/Debugify.cpp
@@ -290,6 +290,16 @@ bool llvm::stripDebugifyMetadata(Module &M) {
return Changed;
}
+bool hasLoc(const Instruction &I) {
+ const DILocation *Loc = I.getDebugLoc().get();
+#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
+ DebugLocKind Kind = I.getDebugLoc().getKind();
+ return Loc || Kind != DebugLocKind::Normal;
+#else
+ return Loc;
+#endif
+}
+
bool llvm::collectDebugInfoMetadata(Module &M,
iterator_range<Module::iterator> Functions,
DebugInfoPerPass &DebugInfoBeforePass,
@@ -362,9 +372,7 @@ bool llvm::collectDebugInfoMetadata(Module &M,
LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n');
DebugInfoBeforePass.InstToDelete.insert({&I, &I});
- const DILocation *Loc = I.getDebugLoc().get();
- bool HasLoc = Loc != nullptr;
- DebugInfoBeforePass.DILocations.insert({&I, HasLoc});
+ DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)});
}
}
}
@@ -607,10 +615,7 @@ bool llvm::checkDebugInfoMetadata(Module &M,
LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n');
- const DILocation *Loc = I.getDebugLoc().get();
- bool HasLoc = Loc != nullptr;
-
- DebugInfoAfterPass.DILocations.insert({&I, HasLoc});
+ DebugInfoAfterPass.DILocations.insert({&I, hasLoc(I)});
}
}
}
>From 057bbad662e4bdc55847a776a0ac0298472ed892 Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Fri, 11 Apr 2025 11:20:18 +0100
Subject: [PATCH 2/4] clang-format
---
llvm/include/llvm/IR/DebugLoc.h | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h
index fb205aa93a443..9f1dafa8b71d9 100644
--- a/llvm/include/llvm/IR/DebugLoc.h
+++ b/llvm/include/llvm/IR/DebugLoc.h
@@ -122,10 +122,18 @@ namespace llvm {
#endif
#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
- static inline DebugLoc getTemporary() { return DebugLoc(DebugLocKind::Temporary); }
- static inline DebugLoc getUnknown() { return DebugLoc(DebugLocKind::Unknown); }
- static inline DebugLoc getCompilerGenerated() { return DebugLoc(DebugLocKind::CompilerGenerated); }
- static inline DebugLoc getDropped() { return DebugLoc(DebugLocKind::Dropped); }
+ static inline DebugLoc getTemporary() {
+ return DebugLoc(DebugLocKind::Temporary);
+ }
+ static inline DebugLoc getUnknown() {
+ return DebugLoc(DebugLocKind::Unknown);
+ }
+ static inline DebugLoc getCompilerGenerated() {
+ return DebugLoc(DebugLocKind::CompilerGenerated);
+ }
+ static inline DebugLoc getDropped() {
+ return DebugLoc(DebugLocKind::Dropped);
+ }
#else
static inline DebugLoc getTemporary() { return DebugLoc(); }
static inline DebugLoc getUnknown() { return DebugLoc(); }
>From 6489ef292310394a8ed80de64219a8cff6dc9ff8 Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Thu, 17 Apr 2025 16:49:43 +0100
Subject: [PATCH 3/4] Fix underline length
---
llvm/docs/HowToUpdateDebugInfo.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/llvm/docs/HowToUpdateDebugInfo.rst b/llvm/docs/HowToUpdateDebugInfo.rst
index 0e17813ea6534..0a77fdfd82612 100644
--- a/llvm/docs/HowToUpdateDebugInfo.rst
+++ b/llvm/docs/HowToUpdateDebugInfo.rst
@@ -172,7 +172,7 @@ dropping locations applies.
.. _NewInstLocations:
Setting locations for new instructions
-------------------------------------
+--------------------------------------
Whenever a new instruction is created and there is no suitable location for that
instruction, that location should be annotated accordingly. There are a set of
>From 7f87542f4351dcf80d40101015a63450c5b84613 Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Thu, 17 Apr 2025 17:16:47 +0100
Subject: [PATCH 4/4] Fix bullet list formatting
---
llvm/docs/HowToUpdateDebugInfo.rst | 33 ++++++++++++++++--------------
1 file changed, 18 insertions(+), 15 deletions(-)
diff --git a/llvm/docs/HowToUpdateDebugInfo.rst b/llvm/docs/HowToUpdateDebugInfo.rst
index 0a77fdfd82612..75a88193a1f85 100644
--- a/llvm/docs/HowToUpdateDebugInfo.rst
+++ b/llvm/docs/HowToUpdateDebugInfo.rst
@@ -179,21 +179,24 @@ instruction, that location should be annotated accordingly. There are a set of
special ``DebugLoc`` values that can be used to indicate the reason that a new
instruction does not have a valid location. These are as follows:
-- ``DebugLoc::getCompilerGenerated()``: This indicates that the instruction is a
-compiler-generated instruction, i.e. it is not associated with any user source
-code.
-- ``DebugLoc::getDropped()``: This indicates that the instruction has
-intentionally had its source location removed, according to the rules for
-dropping locations; this is set automatically by
-``Instruction::dropLocation()``.
-- ``DebugLoc::getUnknown()``: This indicates that the instruction does not have
-a known or currently knowable source location, e.g. the attribution is ambiguous
-in a way that can't currently be represented in LLVM, or that it is otherwise
-infeasible to determine or track the correct source location.
-- ``DebugLoc::getTemporary()``: This is used for instructions that we don't
-expect to be emitted (e.g. ``UnreachableInst``), and so should not need a valid
-location; if we ever try to emit a temporary location into an object file, this
-indicates that something has gone wrong.
+* ``DebugLoc::getCompilerGenerated()``: This indicates that the instruction is a
+ compiler-generated instruction, i.e. it is not associated with any user source
+ code.
+
+* ``DebugLoc::getDropped()``: This indicates that the instruction has
+ intentionally had its source location removed, according to the rules for
+ dropping locations; this is set automatically by
+ ``Instruction::dropLocation()``.
+
+* ``DebugLoc::getUnknown()``: This indicates that the instruction does not have
+ a known or currently knowable source location, e.g. the attribution is ambiguous
+ in a way that can't currently be represented in LLVM, or that it is otherwise
+ infeasible to determine or track the correct source location.
+
+* ``DebugLoc::getTemporary()``: This is used for instructions that we don't
+ expect to be emitted (e.g. ``UnreachableInst``), and so should not need a valid
+ location; if we ever try to emit a temporary location into an object file, this
+ indicates that something has gone wrong.
Where applicable, these should be used instead of leaving an instruction without
an assigned location or explicitly setting the location as ``DebugLoc()``.
More information about the llvm-commits
mailing list