[llvm] [RemoveDIs] Update all docs to use debug records (PR #91768)

Fri May 10 09:49:46 PDT 2024

llvmbot wrote:




@llvm/pr-subscribers-debuginfo

Author: Stephen Tozer (SLTozer)

<details>
<summary>Changes</summary>

Following the landing of the [patch](https://github.com/llvm/llvm-project/pull/91724) that enables printing debug records by default, the documentation should be updated to refer to debug records as the primary debug info representation, with debug intrinsics being relegated to an optional alternative.

This patch performs a few updates:

- Replace references to intrinsics with references to records across all the documentation.
- Replace intrinsics with records in code examples.
- Move the debug records section ahead of the debug intrinsics section in the SourceLevelDebugging document, and change the text to refer to records as the primary representation.
- Add release notes describing the change.
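
As a rough illustration of the textual change applied to the code examples, the sketch below rewrites a single printed debug-intrinsic call into the record form used in the diff. It is not part of the patch and not how LLVM performs the conversion. The helper name `intrinsic_to_record` is hypothetical, and the sketch assumes the simple one-line `call void @llvm.dbg.*(metadata ...), !dbg !N` shape seen in the examples.

```python
import re

# Assumed shape: `call void @llvm.dbg.<kind>(metadata A, metadata B, ...), !dbg !N`
# on a single line. Anything else is returned unchanged.
_INTRINSIC = re.compile(
    r"^(?P<indent>\s*)(?:tail )?call void @llvm\.dbg\.(?P<kind>declare|value|assign)"
    r"\((?P<args>.*)\), !dbg (?P<loc>!\d+)\s*$"
)


def intrinsic_to_record(line: str) -> str:
    """Hypothetical helper: rewrite one intrinsic call line as a debug record line."""
    m = _INTRINSIC.match(line)
    if m is None:
        return line
    args = m.group("args")
    if not args.startswith("metadata "):
        return line
    # Every intrinsic operand is wrapped as `metadata <operand>`; records drop the wrapper.
    operands = args[len("metadata "):].split(", metadata ")
    # The `!dbg` attachment becomes the record's trailing operand.
    operands.append(m.group("loc"))
    # Records in the updated examples are indented two spaces deeper than instructions.
    return f"{m.group('indent')}  #dbg_{m.group('kind')}({', '.join(operands)})"


if __name__ == "__main__":
    before = ("  call void @llvm.dbg.value(metadata i32 %c, metadata !9, "
              "metadata !DIExpression()), !dbg !12")
    print(intrinsic_to_record(before))
```

The sketch only mirrors the printed syntax; operands that themselves contain the text `, metadata ` and intrinsics split across lines are outside what it handles.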

---

Patch is 62.16 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/91768.diff


9 Files Affected:

- (modified) llvm/docs/AssignmentTracking.md (+53-60) 
- (modified) llvm/docs/HowToUpdateDebugInfo.rst (+8-8) 
- (modified) llvm/docs/InstrRefDebugInfo.md (+1-1) 
- (modified) llvm/docs/LangRef.rst (+44-39) 
- (modified) llvm/docs/MIRLangRef.rst (+4-4) 
- (modified) llvm/docs/Passes.rst (+3-2) 
- (modified) llvm/docs/ReleaseNotes.rst (+4) 
- (modified) llvm/docs/SourceLevelDebugging.rst (+192-181) 
- (modified) llvm/docs/tutorial/MyFirstLanguageFrontend/LangImpl09.rst (+1-1) 


``````````diff

diff --git a/llvm/docs/AssignmentTracking.md b/llvm/docs/AssignmentTracking.md
index 5a8bc5844eef6..a24a8b0d797f8 100644
--- a/llvm/docs/AssignmentTracking.md
+++ b/llvm/docs/AssignmentTracking.md
@@ -11,7 +11,7 @@ The core idea is to track more information about source assignments in order
 and preserve enough information to be able to defer decisions about whether to
 use non-memory locations (register, constant) or memory locations until after
 middle end optimisations have run. This is in opposition to using
-`llvm.dbg.declare` and `llvm.dbg.value`, which is to make the decision for most
+`#dbg_declare` and `#dbg_value`, which is to make the decision for most
 variables early on, which can result in suboptimal variable locations that may
 be either incorrect or incomplete.
 
@@ -26,19 +26,18 @@ except for development and testing.
 **Enable in Clang**: `-Xclang -fexperimental-assignment-tracking`
 
 That causes Clang to get LLVM to run the pass `declare-to-assign`. The pass
-converts conventional debug intrinsics to assignment tracking metadata and sets
+converts conventional debug records to assignment tracking metadata and sets
 the module flag `debug-info-assignment-tracking` to the value `i1 true`. To
 check whether assignment tracking is enabled for a module call
 `isAssignmentTrackingEnabled(const Module &M)` (from `llvm/IR/DebugInfo.h`).
 
 ## Design and implementation
 
-### Assignment markers: `llvm.dbg.assign`
+### Assignment markers: `#dbg_assign`
 
-`llvm.dbg.value`, a conventional debug intrinsic, marks out a position in the
+`#dbg_value`, a conventional debug record, marks out a position in the
 IR where a variable takes a particular value. Similarly, Assignment Tracking
-marks out the position of assignments with a new intrinsic called
-`llvm.dbg.assign`.
+marks out the position of assignments with a record called `#dbg_assign`.
 
 In order to know where in IR it is appropriate to use a memory location for a
 variable, each assignment marker must in some way refer to the store, if any
@@ -48,24 +47,23 @@ important benefit of referring to the store is that we can then build a two-way
 mapping of stores<->markers that can be used to find markers that need to be
 updated when stores are modified.
 
-An `llvm.dbg.assign` marker that is not linked to any instruction signals that
+A `#dbg_assign` marker that is not linked to any instruction signals that
 the store that performed the assignment has been optimised out, and therefore
 the memory location will not be valid for at least some part of the program.
 
-Here's the `llvm.dbg.assign` signature. Each parameter is wrapped in
-`MetadataAsValue`, and `Value *` type parameters are first wrapped in
-`ValueAsMetadata`:
+Here's the `#dbg_assign` signature. `Value *` type parameters are first wrapped
+in `ValueAsMetadata`:
 
 ```
-void @llvm.dbg.assign(Value *Value,
-                      DIExpression *ValueExpression,
-                      DILocalVariable *Variable,
-                      DIAssignID *ID,
-                      Value *Address,
-                      DIExpression *AddressExpression)
+  #dbg_assign(Value *Value,
+              DIExpression *ValueExpression,
+              DILocalVariable *Variable,
+              DIAssignID *ID,
+              Value *Address,
+              DIExpression *AddressExpression)
 ```
 
-The first three parameters look and behave like an `llvm.dbg.value`. `ID` is a
+The first three parameters look and behave like a `#dbg_value`. `ID` is a
 reference to a store (see next section). `Address` is the destination address
 of the store and it is modified by `AddressExpression`. An empty/undef/poison
 address means the address component has been killed (the memory address is no
@@ -73,18 +71,13 @@ longer a valid location). LLVM currently encodes variable fragment information
 in `DIExpression`s, so as an implementation quirk the `FragmentInfo` for
 `Variable` is contained within `ValueExpression` only.
 
-The formal LLVM-IR signature is:
-```
-void @llvm.dbg.assign(metadata, metadata, metadata, metadata, metadata, metadata)
-```
-
 ### Instruction link: `DIAssignID`
 
 `DIAssignID` metadata is the mechanism that is currently used to encode the
 store<->marker link. The metadata node has no operands and all instances are
 `distinct`; equality is checked for by comparing addresses.
 
-`llvm.dbg.assign` intrinsics use a `DIAssignID` metadata node instance as an
+`#dbg_assign` records use a `DIAssignID` metadata node instance as an
 operand. This way it refers to any store-like instruction that has the same
 `DIAssignID` attachment. E.g. For this test.cpp,
 
@@ -102,9 +95,9 @@ we get:
 define dso_local noundef i32 @_Z3funi(i32 noundef %a) #0 !dbg !8 {
 entry:
   %a.addr = alloca i32, align 4, !DIAssignID !13
-  call void @llvm.dbg.assign(metadata i1 undef, metadata !14, metadata !DIExpression(), metadata !13, metadata i32* %a.addr, metadata !DIExpression()), !dbg !15
+    #dbg_assign(i1 undef, !14, !DIExpression(), !13, i32* %a.addr, !DIExpression(), !15)
   store i32 %a, i32* %a.addr, align 4, !DIAssignID !16
-  call void @llvm.dbg.assign(metadata i32 %a, metadata !14, metadata !DIExpression(), metadata !16, metadata i32* %a.addr, metadata !DIExpression()), !dbg !15
+    #dbg_assign(i32 %a, !14, !DIExpression(), !16, i32* %a.addr, !DIExpression(), !15)
   %0 = load i32, i32* %a.addr, align 4, !dbg !17
   ret i32 %0, !dbg !18
 }
@@ -116,16 +109,16 @@ entry:
 !16 = distinct !DIAssignID()
 ```
 
-The first `llvm.dbg.assign` refers to the `alloca` through `!DIAssignID !13`,
+The first `#dbg_assign` refers to the `alloca` through `!DIAssignID !13`,
 and the second refers to the `store` through `!DIAssignID !16`.
 
 ### Store-like instructions
 
-In the absence of a linked `llvm.dbg.assign`, a store to an address that is
+In the absence of a linked `#dbg_assign`, a store to an address that is
 known to be the backing storage for a variable is considered to represent an
 assignment to that variable.
 
-This gives us a safe fall-back in cases where `llvm.dbg.assign` intrinsics have
+This gives us a safe fall-back in cases where `#dbg_assign` records have
 been deleted, the `DIAssignID` attachment on the store has been dropped, or the
 optimiser has made a once-indirect store (not tracked with Assignment Tracking)
 direct.
@@ -139,61 +132,61 @@ direct.
 instruction. In this case, the assignment is considered to take place in
 multiple positions in the program.
 
-**Moving** a non-debug instruction: nothing new to do. Instructions linked to an
-`llvm.dbg.assign` have their initial IR position marked by the position of the
-`llvm.dbg.assign`.
+**Moving** a non-debug instruction: nothing new to do. Instructions linked to a
+`#dbg_assign` have their initial IR position marked by the position of the
+`#dbg_assign`.
 
 **Deleting** a non-debug instruction: nothing new to do. Simple DSE does not
 require any change; it’s safe to delete an instruction with a `DIAssignID`
-attachment. An `llvm.dbg.assign` that uses a `DIAssignID` that is not attached
+attachment. A `#dbg_assign` that uses a `DIAssignID` that is not attached
 to any instruction indicates that the memory location isn’t valid.
 
 **Merging** stores: In many cases no change is required as `DIAssignID`
 attachments are automatically merged if `combineMetadata` is called. One way or
 another, the `DIAssignID` attachments must be merged such that new store
-becomes linked to all the `llvm.dbg.assign` intrinsics that the merged stores
+becomes linked to all the `#dbg_assign` records that the merged stores
 were linked to. This can be achieved simply by calling a helper function
 `Instruction::mergeDIAssignID`.
 
-**Inlining** stores: As stores are inlined we generate `llvm.dbg.assign`
-intrinsics and `DIAssignID` attachments as if the stores represent source
+**Inlining** stores: As stores are inlined we generate `#dbg_assign`
+records and `DIAssignID` attachments as if the stores represent source
 assignments, just like in the frontend. This isn’t perfect, as stores may have
 been moved, modified or deleted before inlining, but it does at least keep the
 information about the variable correct within the non-inlined scope.
 
-**Splitting** stores: SROA and passes that split stores treat `llvm.dbg.assign`
-intrinsics similarly to `llvm.dbg.declare` intrinsics. Clone the
-`llvm.dbg.assign` intrinsics linked to the store, update the FragmentInfo in
-the `ValueExpression`, and give the split stores (and cloned intrinsics) new
+**Splitting** stores: SROA and passes that split stores treat `#dbg_assign`
+records similarly to `#dbg_declare` records. Clone the
+`#dbg_assign` records linked to the store, update the FragmentInfo in
+the `ValueExpression`, and give the split stores (and cloned records) new
 `DIAssignID` attachments each. In other words, treat the split stores as
 separate assignments. For partial DSE (e.g. shortening a memset), we do the
-same except that `llvm.dbg.assign` for the dead fragment gets an `Undef`
+same except that `#dbg_assign` for the dead fragment gets an `Undef`
 `Address`.
 
-**Promoting** allocas and store/loads: `llvm.dbg.assign` intrinsics implicitly
+**Promoting** allocas and store/loads: `#dbg_assign` records implicitly
 describe joined values in memory locations at CFG joins, but this is not
 necessarily the case after promoting (or partially promoting) the
 variable. Passes that promote variables are responsible for inserting
-`llvm.dbg.assign` intrinsics after the resultant PHIs generated during
-promotion. `mem2reg` already has to do this (with `llvm.dbg.value`) for
-`llvm.dbg.declare`s. Where a store has no linked intrinsic, the store is
+`#dbg_assign` records after the resultant PHIs generated during
+promotion. `mem2reg` already has to do this (with `#dbg_value`) for
+`#dbg_declare`s. Where a store has no linked record, the store is
 assumed to represent an assignment for variables stored at the destination
 address.
 
-#### Debug intrinsic updates
+#### Debug record updates
 
-**Moving** a debug intrinsic: avoid moving `llvm.dbg.assign` intrinsics where
+**Moving** a debug record: avoid moving `#dbg_assign` records where
 possible, as they represent a source-level assignment, whose position in the
 program should not be affected by optimization passes.
 
-**Deleting** a debug intrinsic: Nothing new to do. Just like for conventional
-debug intrinsics, unless it is unreachable, it’s almost always incorrect to
-delete a `llvm.dbg.assign` intrinsic.
+**Deleting** a debug record: Nothing new to do. Just like for conventional
+debug records, unless it is unreachable, it’s almost always incorrect to
+delete a `#dbg_assign` record.
 
-### Lowering `llvm.dbg.assign` to MIR
+### Lowering `#dbg_assign` to MIR
 
-To begin with only SelectionDAG ISel will be supported. `llvm.dbg.assign`
-intrinsics are lowered to MIR `DBG_INSTR_REF` instructions. Before this happens
+To begin with only SelectionDAG ISel will be supported. `#dbg_assign`
+records are lowered to MIR `DBG_INSTR_REF` instructions. Before this happens
 we need to decide where it is appropriate to use memory locations and where we
 must use a non-memory location (or no location) for each variable. In order to
 make those decisions we run a standard fixed-point dataflow analysis that makes
@@ -214,9 +207,9 @@ to tackle:
   clang/test/CodeGen/assignment-tracking/assignment-tracking.cpp for examples.
 
 * `trackAssignments` doesn't yet work for variables that have their
-  `llvm.dbg.declare` location modified by a `DIExpression`, e.g. when the
+  `#dbg_declare` location modified by a `DIExpression`, e.g. when the
   address of the variable is itself stored in an `alloca` with the
-  `llvm.dbg.declare` using `DIExpression(DW_OP_deref)`. See `indirectReturn` in
+  `#dbg_declare` using `DIExpression(DW_OP_deref)`. See `indirectReturn` in
   llvm/test/DebugInfo/Generic/assignment-tracking/track-assignments.ll and in
   clang/test/CodeGen/assignment-tracking/assignment-tracking.cpp for an
   example.
@@ -225,13 +218,13 @@ to tackle:
   memory location is available without using a `DIAssignID`. This is because
   the storage address is not computed by an instruction (it's an argument
   value) and therefore we have nowhere to put the metadata attachment. To solve
-  this we probably need another marker intrinsic to denote "the variable's
-  stack home is X address" - similar to `llvm.dbg.declare` except that it needs
-  to compose with `llvm.dbg.assign` intrinsics such that the stack home address
-  is only selected as a location for the variable when the `llvm.dbg.assign`
-  intrinsics agree it should be.
+  this we probably need another marker record to denote "the variable's
+  stack home is X address" - similar to `#dbg_declare` except that it needs
+  to compose with `#dbg_assign` records such that the stack home address
+  is only selected as a location for the variable when the `#dbg_assign`
+  records agree it should be.
 
-* Given the above (a special "the stack home is X" intrinsic), and the fact
+* Given the above (a special "the stack home is X" record), and the fact
   that we can only track assignments with fixed offsets and sizes, I think we
   can probably get rid of the address and address-expression part, since it
   will always be computable with the info we have.
diff --git a/llvm/docs/HowToUpdateDebugInfo.rst b/llvm/docs/HowToUpdateDebugInfo.rst
index c64b5d1d0d98b..0236e76c3a3e2 100644
--- a/llvm/docs/HowToUpdateDebugInfo.rst
+++ b/llvm/docs/HowToUpdateDebugInfo.rst
@@ -151,7 +151,7 @@ Deleting an IR-level Instruction
 
 When an ``Instruction`` is deleted, its debug uses change to ``undef``. This is
 a loss of debug info: the value of one or more source variables becomes
-unavailable, starting with the ``llvm.dbg.value(undef, ...)``. When there is no
+unavailable, starting with the ``#dbg_value(undef, ...)``. When there is no
 way to reconstitute the value of the lost instruction, this is the best
 possible outcome. However, it's often possible to do better:
 
@@ -172,7 +172,7 @@ possible outcome. However, it's often possible to do better:
   define i16 @foo(i16 %a) {
     %b = sext i16 %a to i32
     %c = and i32 %b, 15
-    call void @llvm.dbg.value(metadata i32 %c, ...)
+      #dbg_value(i32 %c, ...)
     %d = trunc i32 %c to i16
     ret i16 %d
   }
@@ -183,7 +183,7 @@ replaced with a simplified instruction:
 .. code-block:: llvm
 
   define i16 @foo(i16 %a) {
-    call void @llvm.dbg.value(metadata i32 undef, ...)
+      #dbg_value(i32 undef, ...)
     %simplified = and i16 %a, 15
     ret i16 %simplified
   }
@@ -204,7 +204,7 @@ This results in better debug info because the debug use of ``%c`` is preserved:
 
   define i16 @foo(i16 %a) {
     %simplified = and i16 %a, 15
-    call void @llvm.dbg.value(metadata i16 %simplified, ...)
+      #dbg_value(i16 %simplified, ...)
     ret i16 %simplified
   }
 
@@ -249,7 +249,7 @@ module, and the second checks that this DI is still available after an
 optimization has occurred, reporting any errors/warnings while doing so.
 
 The instructions are assigned sequentially increasing line locations, and are
-immediately used by debug value intrinsics everywhere possible.
+immediately used by debug value records everywhere possible.
 
 For example, here is a module before:
 
@@ -271,10 +271,10 @@ and after running ``opt -debugify``:
    define void @f(i32* %x) !dbg !6 {
    entry:
      %x.addr = alloca i32*, align 8, !dbg !12
-     call void @llvm.dbg.value(metadata i32** %x.addr, metadata !9, metadata !DIExpression()), !dbg !12
+       #dbg_value(i32** %x.addr, !9, !DIExpression(), !12)
      store i32* %x, i32** %x.addr, align 8, !dbg !13
      %0 = load i32*, i32** %x.addr, align 8, !dbg !14
-     call void @llvm.dbg.value(metadata i32* %0, metadata !11, metadata !DIExpression()), !dbg !14
+       #dbg_value(i32* %0, !11, !DIExpression(), !14)
      store i32 10, i32* %0, align 4, !dbg !15
      ret void, !dbg !16
    }
@@ -409,7 +409,7 @@ as follows:
   $ clang -Xclang -fverify-debuginfo-preserve -Xclang -fverify-debuginfo-preserve-export=sample.json -g -O2 sample.c
 
 Please do note that there are some known false positives, for source locations
-and debug intrinsic checking, so that will be addressed as a future work.
+and debug record checking, so that will be addressed as a future work.
 
 Mutation testing for MIR-level transformations
 ----------------------------------------------
diff --git a/llvm/docs/InstrRefDebugInfo.md b/llvm/docs/InstrRefDebugInfo.md
index 3917989e4026d..eb7a0464b90a0 100644
--- a/llvm/docs/InstrRefDebugInfo.md
+++ b/llvm/docs/InstrRefDebugInfo.md
@@ -24,7 +24,7 @@ referring to instruction values:
 
 ```llvm
 %2 = add i32 %0, %1
-call void @llvm.dbg.value(metadata i32 %2,
+  #dbg_value(i32 %2,
 ```
 
 In LLVM IR, the IR Value is synonymous with the instruction that computes the
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 6f5a4644ffc2b..846be85c4390d 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -6244,11 +6244,11 @@ DIExpression
 """"""""""""
 
 ``DIExpression`` nodes represent expressions that are inspired by the DWARF
-expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
-(such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
+expression language. They are used in :ref:`debug records<debug_records>`
+(such as ``#dbg_declare`` and ``#dbg_value``) to describe how the
 referenced LLVM variable relates to the source language variable. Debug
-intrinsics are interpreted left-to-right: start by pushing the value/address
-operand of the intrinsic onto a stack, then repeatedly push and evaluate
+expressions are interpreted left-to-right: start by pushing the value/address
+operand of the record onto a stack, then repeatedly push and evaluate
 opcodes from the DIExpression until the final variable description is produced.
 
 The current supported opcode vocabulary is limited:
@@ -6336,23 +6336,24 @@ The current supported opcode vocabulary is limited:
 
     IR for "*ptr = 4;"
     --------------
-    call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !20)
+      #dbg_value(i32 4, !17, !DIExpression(DW_OP_LLVM_implicit_pointer), !20)
     !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
                            type: !18)
     !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
     !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
-    !20 = !DIExpression(DW_OP_LLVM_implicit_pointer))
+    !20 = !DILocation(line: 10, scope: !12)
 
     IR for "**ptr = 4;"
     --------------
-    call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !21)
+      #dbg_value(i32 4, !17,
+        !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer),
+        !21)
     !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
                            type: !18)
     !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
     !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
     !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
-    !21 = !DIExpression(DW_OP_LLVM_implicit_pointer,
-                        DW_OP_LLVM_implicit_pointer))
+    !21 = !DILocation(line: 10, scope: !12)
 
 DWARF specifies three kinds of simple location descriptions: Register, memory,
 and implicit location descriptions.  Note that a location description is
@@ -6363,45 +6364,48 @@ sense that a debugger might modify its value), whereas *implicit locations*
 describe merely the actual *value* of a source variable which might not exist
 in registers or in memory (see ``DW_OP_stack_value``).
 
-A ``llvm.dbg.declare`` intrinsic describes an indirect value (the address) of a
-source variable. The first operand of the intrinsic must be an address of some
-kind. A DIExpression attached to the intrinsic refines this address to produce a
+A ``#dbg_declare`` record describes an indirect value (the address) of a
+source variable. The first operand of the record must be an address of some
+kind. A DIExpression operand to the record refines this address to produce a
 concrete location for the source variable.
 
-A ``llvm.dbg.value`` intrinsic describes the direct value of a source variable.
-The first operand of the intrinsic may be a direct or indirect value. A
-DIExpression attached to...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/91768