[llvm] [LangRef] Rework DIExpression docs (PR #153072)
Scott Linder via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 21 11:24:11 PDT 2025
https://github.com/slinder1 updated https://github.com/llvm/llvm-project/pull/153072
From d55553257be5f3c67f7f0f63216fbe4d61c47a9f Mon Sep 17 00:00:00 2001
From: Scott Linder <Scott.Linder at amd.com>
Date: Mon, 11 Aug 2025 19:08:11 +0000
Subject: [PATCH 1/4] [LangRef] Rework DIExpression docs
Factor out most of the DIExpression docs from LangRef.rst into
SourceLevelDebugging.rst.

What remains in LangRef is just enough context to make sense of how
DIExpression-as-metadata fits into the IR, including some examples of
the DIExpression syntax.

The rest now lives in the SourceLevelDebugging document, which gives
more context to make sense of DIExpression-as-semantic-entity.

Use sections to clearly separate DWARF opcodes from LLVM internal-only
opcodes, where before the distinction was only explicit in the source
code. This was the original reason for the patch, and if the rest isn't
accepted I can also just split up the list in LangRef instead.

Also make some other changes, such as fixing typos and using :ref: rather
than unchecked links.
---
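As a reading aid for the evaluation model described in the new DIExpression section (start with the record's operand on a stack, then process opcodes left to right), here is a minimal Python sketch of such a stack machine. It is not LLVM code. The function name `evaluate`, the flat-list encoding of the expression, the `memory` dictionary, and the rule that the stack starts empty when `DW_OP_LLVM_arg` is present are all assumptions made for illustration.

```python
def evaluate(ops, args, memory=None):
    """Evaluate a simplified DIExpression (illustrative only).

    ops    -- flat list of opcode names and their integer operands
    args   -- location operands of the record (hypothetical encoding)
    memory -- dict mapping addresses to values, used by DW_OP_deref
    Returns (value, is_implicit).
    """
    memory = memory or {}
    # Assumption: without DW_OP_LLVM_arg, the single operand is pushed first.
    stack = [] if "DW_OP_LLVM_arg" in ops else [args[0]]
    is_implicit = False
    i = 0
    while i < len(ops):
        op = ops[i]
        i += 1
        if op == "DW_OP_LLVM_arg":
            stack.append(args[ops[i]])  # push the N-th listed value
            i += 1
        elif op == "DW_OP_deref":
            stack.append(memory[stack.pop()])
        elif op == "DW_OP_plus":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op == "DW_OP_minus":
            # Subtract the last entry from the second-last entry.
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs - rhs)
        elif op == "DW_OP_plus_uconst":
            stack.append(stack.pop() + ops[i])
            i += 1
        elif op == "DW_OP_swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "DW_OP_over":
            stack.append(stack[-2])
        elif op == "DW_OP_stack_value":
            is_implicit = True  # result is a value, not a location
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return stack[-1], is_implicit
```

With the list `(10, 3)` standing in for `(%reg1, %reg2)`, the `DW_OP_LLVM_arg` example from the docs, `evaluate(["DW_OP_LLVM_arg", 0, "DW_OP_LLVM_arg", 1, "DW_OP_minus", "DW_OP_stack_value"], [10, 3])`, yields the value `7` flagged as implicit.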
llvm/docs/LangRef.rst | 150 +----------------
llvm/docs/SourceLevelDebugging.rst | 256 +++++++++++++++++++++++++----
2 files changed, 227 insertions(+), 179 deletions(-)
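The docs state that `DW_OP_LLVM_extract_bits_sext` maps to a `DW_OP_shl` followed by `DW_OP_shra`, and the zero-extending variant to `DW_OP_shl` followed by `DW_OP_shr`. The Python sketch below shows why a shift pair is enough to extract a bit field. The 64-bit stack entry width, the offset being counted from the least-significant bit, and the exact shift amounts are assumptions chosen for illustration; the patch does not specify what LLVM emits.

```python
WIDTH = 64  # assumed width of a stack entry, not taken from the patch
MASK = (1 << WIDTH) - 1


def shl(value, amount):
    """Logical shift left, truncated to WIDTH bits."""
    return (value << amount) & MASK


def shr(value, amount):
    """Logical (zero-filling) shift right."""
    return (value & MASK) >> amount


def shra(value, amount):
    """Arithmetic (sign-filling) shift right, result as unsigned WIDTH bits."""
    value &= MASK
    if value >> (WIDTH - 1):
        value -= 1 << WIDTH  # reinterpret as a negative number
    return (value >> amount) & MASK


def extract_bits(value, offset, size, signed):
    """Extract `size` bits starting at bit `offset` using a shift pair."""
    left = WIDTH - (offset + size)  # discard the bits above the field
    right = WIDTH - size            # move the field down to bit 0
    shifted = shl(value, left)
    return shra(shifted, right) if signed else shr(shifted, right)
```

For the value `0xABCD` with offset 8 and size 8, the extracted field is `0xAB`. The zero-extended result is `0xAB`, while the sign-extended result has every upper bit set because the top bit of the field is 1.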
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 6ba3759080cc3..8b9a939eda955 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -6748,161 +6748,23 @@ parameter, and it will be included in the ``retainedNodes:`` field of its
type: !3)
!2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)
-.. _DIExpression:
-
DIExpression
""""""""""""
``DIExpression`` nodes represent expressions that are inspired by the DWARF
-expression language. They are used in :ref:`debug records <debugrecords>`
-(such as ``#dbg_declare`` and ``#dbg_value``) to describe how the
-referenced LLVM variable relates to the source language variable. Debug
-expressions are interpreted left-to-right: start by pushing the value/address
-operand of the record onto a stack, then repeatedly push and evaluate
-opcodes from the DIExpression until the final variable description is produced.
-
-The current supported opcode vocabulary is limited:
-
-- ``DW_OP_deref`` dereferences the top of the expression stack.
-- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
- them together and appends the result to the expression stack.
-- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
- the last entry from the second last entry and appends the result to the
- expression stack.
-- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
-- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
- here, respectively) of the variable fragment from the working expression. Note
- that contrary to DW_OP_bit_piece, the offset is describing the location
- within the described source variable.
-- ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
- (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
- expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
- that references a base type constructed from the supplied values.
-- ``DW_OP_LLVM_extract_bits_sext, 16, 8,`` specifies the offset and size
- (``16`` and ``8`` here, respectively) of bits that are to be extracted and
- sign-extended from the value at the top of the expression stack. If the top of
- the expression stack is a memory location then these bits are extracted from
- the value pointed to by that memory location. Maps into a ``DW_OP_shl``
- followed by ``DW_OP_shra``.
-- ``DW_OP_LLVM_extract_bits_zext`` behaves similarly to
- ``DW_OP_LLVM_extract_bits_sext``, but zero-extends instead of sign-extending.
- Maps into a ``DW_OP_shl`` followed by ``DW_OP_shr``.
-- ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
- optionally applied to the pointer. The memory tag is derived from the
- given tag offset in an implementation-defined manner.
-- ``DW_OP_swap`` swaps top two stack entries.
-- ``DW_OP_xderef`` provides extended dereference mechanism. The entry at the top
- of the stack is treated as an address. The second stack entry is treated as an
- address space identifier.
-- ``DW_OP_stack_value`` marks a constant value.
-- ``DW_OP_LLVM_entry_value, N`` refers to the value a register had upon
- function entry. When targeting DWARF, a ``DBG_VALUE(reg, ...,
- DIExpression(DW_OP_LLVM_entry_value, 1, ...)`` is lowered to
- ``DW_OP_entry_value [reg], ...``, which pushes the value ``reg`` had upon
- function entry onto the DWARF expression stack.
-
- The next ``(N - 1)`` operations will be part of the ``DW_OP_entry_value``
- block argument. For example, ``!DIExpression(DW_OP_LLVM_entry_value, 1,
- DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an expression where
- the entry value of ``reg`` is pushed onto the stack, and is added with 123.
- Due to framework limitations ``N`` must be 1, in other words,
- ``DW_OP_entry_value`` always refers to the value/address operand of the
- instruction.
-
- Because ``DW_OP_LLVM_entry_value`` is defined in terms of registers, it is
- usually used in MIR, but it is also allowed in LLVM IR when targeting a
- :ref:`swiftasync <swiftasync>` argument. The operation is introduced by:
-
- - ``LiveDebugValues`` pass, which applies it to function parameters that
- are unmodified throughout the function. Support is limited to simple
- register location descriptions, or as indirect locations (e.g.,
- parameters passed-by-value to a callee via a pointer to a temporary copy
- made in the caller).
- - ``AsmPrinter`` pass when a call site parameter value
- (``DW_AT_call_site_parameter_value``) is represented as entry value of
- the parameter.
- - ``CoroSplit`` pass, which may move variables from allocas into a
- coroutine frame. If the coroutine frame is a
- :ref:`swiftasync <swiftasync>` argument, the variable is described with
- an ``DW_OP_LLVM_entry_value`` operation.
-
-- ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
- value, such as one that calculates the sum of two registers. This is always
- used in combination with an ordered list of values, such that
- ``DW_OP_LLVM_arg, N`` refers to the ``N``\ :sup:`th` element in that list. For
- example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
- DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
- ``%reg1 - reg2``. This list of values should be provided by the containing
- intrinsic/instruction.
-- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents a content on the provided
- signed offset of the specified register. The opcode is only generated by the
- ``AsmPrinter`` pass to describe call site parameter value which requires an
- expression over two registers.
-- ``DW_OP_push_object_address`` pushes the address of the object which can then
- serve as a descriptor in subsequent calculation. This opcode can be used to
- calculate bounds of fortran allocatable array which has array descriptors.
-- ``DW_OP_over`` duplicates the entry currently second in the stack at the top
- of the stack. This opcode can be used to calculate bounds of fortran assumed
- rank array which has rank known at run time and current dimension number is
- implicitly first element of the stack.
-- ``DW_OP_LLVM_implicit_pointer`` It specifies the dereferenced value. It can
- be used to represent pointer variables which are optimized out but the value
- it points to is known. This operator is required as it is different than DWARF
- operator DW_OP_implicit_pointer in representation and specification (number
- and types of operands) and later can not be used as multiple level.
-
-.. code-block:: text
+expression language. They are used in :ref:`debug records <debug_records>`
+(such as ``#dbg_declare`` and ``#dbg_value``) to describe how the referenced
+LLVM variable relates to the source language variable.
- IR for "*ptr = 4;"
- --------------
- #dbg_value(i32 4, !17, !DIExpression(DW_OP_LLVM_implicit_pointer), !20)
- !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
- type: !18)
- !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
- !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
- !20 = !DILocation(line: 10, scope: !12)
-
- IR for "**ptr = 4;"
- --------------
- #dbg_value(i32 4, !17,
- !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer),
- !21)
- !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
- type: !18)
- !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
- !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
- !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
- !21 = !DILocation(line: 10, scope: !12)
-
-DWARF specifies three kinds of simple location descriptions: Register, memory,
-and implicit location descriptions. Note that a location description is
-defined over certain ranges of a program, i.e the location of a variable may
-change over the course of the program. Register and memory location
-descriptions describe the *concrete location* of a source variable (in the
-sense that a debugger might modify its value), whereas *implicit locations*
-describe merely the actual *value* of a source variable which might not exist
-in registers or in memory (see ``DW_OP_stack_value``).
-
-A ``#dbg_declare`` record describes an indirect value (the address) of a
-source variable. The first operand of the record must be an address of some
-kind. A DIExpression operand to the record refines this address to produce a
-concrete location for the source variable.
-
-A ``#dbg_value`` record describes the direct value of a source variable.
-The first operand of the record may be a direct or indirect value. A
-DIExpression operand to the record refines the first operand to produce a
-direct value. For example, if the first operand is an indirect value, it may be
-necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
-valid debug record.
+See :ref:`diexpression` for details.
.. note::
- A DIExpression is interpreted in the same way regardless of which kind of
- debug record it's attached to.
-
DIExpressions are always printed and parsed inline; they can never be
referenced by an ID (e.g. ``!1``).
+Some examples of expressions:
+
.. code-block:: text
!DIExpression(DW_OP_deref)
diff --git a/llvm/docs/SourceLevelDebugging.rst b/llvm/docs/SourceLevelDebugging.rst
index ea27ee5b8fb1b..96d5de7341d21 100644
--- a/llvm/docs/SourceLevelDebugging.rst
+++ b/llvm/docs/SourceLevelDebugging.rst
@@ -160,15 +160,15 @@ which can have a value, including at least:
There is no special provision for "true" constants in LLVM today, and
they are instead treated as local or global variables.
-A variable is represented by a `local variable <LangRef.html#dilocalvariable>`_
-or `global variable <LangRef.html#diglobalvariable>`_ metadata node.
+A variable is represented by a :ref:`local variable <dilocalvariable>` or
+:ref:`global variable <diglobalvariable>` metadata node.
A "variable fragment" (or just "fragment") is a contiguous span of bits of a
variable.
-A :ref:`debug record <debug_records>` which refers to a ``DIExpression`` ending
-with a ``DW_OP_LLVM_fragment`` operation describes a fragment of the variable
-it refers to.
+A :ref:`debug record <debug_records>` which refers to a :ref:`diexpression`
+ending with a ``DW_OP_LLVM_fragment`` operation describes a fragment of the
+variable it refers to.
The operands of the ``DW_OP_LLVM_fragment`` operation encode the bit offset of
the fragment relative to the start of the variable, and the size of the
@@ -205,16 +205,16 @@ debugger to interpret the information.
To provide basic functionality, the LLVM debugger does have to make some
assumptions about the source-level language being debugged, though it keeps
these to a minimum. The only common features that the LLVM debugger assumes
-exist are `source files <LangRef.html#difile>`_, and `program objects
-<LangRef.html#diglobalvariable>`_. These abstract objects are used by a
-debugger to form stack traces, show information about local variables, etc.
+exist are :ref:`source files <difile>`, and :ref:`program objects
+<diglobalvariable>`. These abstract objects are used by a debugger to form
+stack traces, show information about local variables, etc.
This section of the documentation first describes the representation aspects
common to any source-language. :ref:`ccxx_frontend` describes the data layout
conventions used by the C and C++ front-ends.
-Debug information descriptors are `specialized metadata nodes
-<LangRef.html#specialized-metadata>`_, first-class subclasses of ``Metadata``.
+Debug information descriptors are :ref:`specialized metadata nodes
+<specialized-metadata>`, first-class subclasses of ``Metadata``.
There are two models for defining the values of source variables at different
states of the program and tracking these values through optimization and code
@@ -229,7 +229,7 @@ document.
.. _debug_records:
Debug Records
-----------------------------
+-------------
Debug records define the value that a source variable has during execution of
the program; they appear interleaved with instructions, although they are not
@@ -256,14 +256,13 @@ comma-separated arguments in parentheses, as with a `call`.
#dbg_declare([Value|MDNode], DILocalVariable, DIExpression, DILocation)
-This record provides information about a local element (e.g., variable).
-The first argument is an SSA ``ptr`` value corresponding to a variable address,
-and is typically a static alloca in the function entry block. The second
-argument is a `local variable <LangRef.html#dilocalvariable>`_ containing a
-description of the variable. The third argument is a `complex expression
-<LangRef.html#diexpression>`_. The fourth argument is a `source location
-<LangRef.html#dilocation>`_. A ``#dbg_declare`` record describes the
-*address* of a source variable.
+This record provides information about a local element (e.g., variable). The
+first argument is an SSA ``ptr`` value corresponding to a variable address, and
+is typically a static alloca in the function entry block. The second argument
+is a :ref:`local variable <dilocalvariable>` containing a description of the
+variable. The third argument is a :ref:`complex expression <diexpression>`.
+The fourth argument is a :ref:`source location <dilocation>`. A
+``#dbg_declare`` record describes the *address* of a source variable.
.. code-block:: llvm
@@ -299,11 +298,10 @@ must agree on the memory location.
#dbg_value([Value|DIArgList|MDNode], DILocalVariable, DIExpression, DILocation)
This record provides information when a user source variable is set to a new
-value. The first argument is the new value. The second argument is a `local
-variable <LangRef.html#dilocalvariable>`_ containing a description of the
-variable. The third argument is a `complex expression
-<LangRef.html#diexpression>`_. The fourth argument is a `source location
-<LangRef.html#dilocation>`_.
+value. The first argument is the new value. The second argument is a
+:ref:`local variable <dilocalvariable>` containing a description of the
+variable. The third argument is a :ref:`complex expression <diexpression>`.
+The fourth argument is a :ref:`source location <dilocation>`.
A ``#dbg_value`` record describes the *value* of a source variable
directly, not its address. Note that the value operand of this intrinsic may
@@ -311,7 +309,7 @@ be indirect (i.e, a pointer to the source variable), provided that interpreting
the complex expression derives the direct value.
``#dbg_assign``
-^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^
.. toctree::
:hidden:
@@ -333,15 +331,21 @@ performs the assignment, and the destination address.
The first three arguments are the same as for a ``#dbg_value``. The fourth
argument is a ``DIAssignID`` used to reference a store. The fifth is the
-destination of the store, the sixth is a `complex
-expression <LangRef.html#diexpression>`_ that modifies it, and the seventh is a
-`source location <LangRef.html#dilocation>`_.
+destination of the store, the sixth is a :ref:`complex expression
+<diexpression>` that modifies it, and the seventh is a :ref:`source location
+<dilocation>`.
See :doc:`AssignmentTracking` for more info.
Debugger intrinsic functions
----------------------------
+.. warning::
+
+ These intrinsics are deprecated; please use :ref:`debug records
+ <debug_records>` instead. For more details see `RemoveDIs
+ <RemoveDIsDebugInfo.html>`_.
+
.. _format_common_intrinsics:
In intrinsic-mode, LLVM uses several intrinsic functions (name prefixed with "``llvm.dbg``") to
@@ -400,6 +404,189 @@ This intrinsic is equivalent to ``#dbg_assign``:
metadata i32 %i, metadata !1, metadata !DIExpression(), metadata !2,
metadata ptr %i.addr, metadata !DIExpression(), metadata !3), !dbg !3
+.. _diexpression:
+
+DIExpression
+------------
+
+Debug expressions are represented as :ref:`specialized-metadata`.
+
+Debug expressions are interpreted left-to-right: start by pushing the
+value/address operand of the record onto a stack, then repeatedly push and
+evaluate opcodes from the DIExpression until the final variable description is
+produced.
+
+The opcodes available in these expressions are described in
+:ref:`dwarf-opcodes` and :ref:`internal-opcodes`.
+
+DWARF specifies three kinds of simple location descriptions: register, memory,
+and implicit location descriptions. Note that a location description is
+defined over certain ranges of a program, i.e., the location of a variable may
+change over the course of the program. Register and memory location
+descriptions describe the *concrete location* of a source variable (in the
+sense that a debugger might modify its value), whereas *implicit locations*
+describe merely the actual *value* of a source variable which might not exist
+in registers or in memory (see ``DW_OP_stack_value``).
+
+A ``#dbg_declare`` record describes an indirect value (the address) of a
+source variable. The first operand of the record must be an address of some
+kind. A DIExpression operand to the record refines this address to produce a
+concrete location for the source variable.
+
+A ``#dbg_value`` record describes the direct value of a source variable.
+The first operand of the record may be a direct or indirect value. A
+DIExpression operand to the record refines the first operand to produce a
+direct value. For example, if the first operand is an indirect value, it may be
+necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
+valid debug record.
+
+.. note::
+
+ A DIExpression is interpreted in the same way regardless of which kind of
+ debug record it's attached to.
+
+ DIExpressions are always printed and parsed inline; they can never be
+ referenced by an ID (e.g. ``!1``).
+
+Examples using ``DW_OP_LLVM_implicit_pointer``:
+
+.. code-block:: text
+
+ IR for "*ptr = 4;"
+ --------------
+ #dbg_value(i32 4, !17, !DIExpression(DW_OP_LLVM_implicit_pointer), !20)
+ !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
+ type: !18)
+ !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
+ !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+ !20 = !DILocation(line: 10, scope: !12)
+
+ IR for "**ptr = 4;"
+ --------------
+ #dbg_value(i32 4, !17,
+ !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer),
+ !21)
+ !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
+ type: !18)
+ !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
+ !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
+ !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+ !21 = !DILocation(line: 10, scope: !12)
+
+
+.. _dwarf-opcodes:
+
+DWARF Opcodes
+^^^^^^^^^^^^^
+
+When possible, LLVM reuses DWARF opcodes and gives them the same semantics in
+LLVM expressions as in DWARF expressions. The currently supported opcode
+vocabulary is limited, but includes at least:
+
+- ``DW_OP_deref`` dereferences the top of the expression stack.
+- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
+ them together and appends the result to the expression stack.
+- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
+ the last entry from the second last entry and appends the result to the
+ expression stack.
+- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
+- ``DW_OP_swap`` swaps top two stack entries.
+- ``DW_OP_xderef`` provides extended dereference mechanism. The entry at the top
+ of the stack is treated as an address. The second stack entry is treated as an
+ address space identifier.
+- ``DW_OP_stack_value`` marks a constant value.
+- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents a content on the provided
+ signed offset of the specified register. The opcode is only generated by the
+ ``AsmPrinter`` pass to describe call site parameter value which requires an
+ expression over two registers.
+- ``DW_OP_push_object_address`` pushes the address of the object which can then
+ serve as a descriptor in subsequent calculation. This opcode can be used to
+ calculate the bounds of a Fortran allocatable array, which has an array
+ descriptor.
+- ``DW_OP_over`` duplicates the entry currently second in the stack at the top
+ of the stack. This opcode can be used to calculate the bounds of a Fortran
+ assumed-rank array, whose rank is known at run time and for which the current
+ dimension number is implicitly the first element of the stack.
+
+.. _internal-opcodes:
+
+Internal Opcodes
+^^^^^^^^^^^^^^^^
+
+Where the DWARF equivalent is not suitable, or no DWARF equivalent exists, LLVM
+defines internal-only opcodes which have no direct analog in DWARF.
+
+.. note::
+
+ Some opcodes do not influence the final DWARF expression directly, instead
+ encoding information logically belonging to the debug records which use
+ them.
+
+- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and
+ ``8`` here, respectively) of the variable fragment from the working
+ expression. Note that contrary to DW_OP_bit_piece, the offset is describing
+ the location within the described source variable. This does not affect the
+ semantics of the expression.
+- ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
+ (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
+ expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
+ that references a base type constructed from the supplied values.
+- ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
+ optionally applied to the pointer. The memory tag is derived from the
+ given tag offset in an implementation-defined manner. This does not affect
+ the semantics of the expression.
+- ``DW_OP_LLVM_entry_value, N`` refers to the value a register had upon
+ function entry. When targeting DWARF, a ``DBG_VALUE(reg, ...,
+ DIExpression(DW_OP_LLVM_entry_value, 1, ...)`` is lowered to
+ ``DW_OP_entry_value [reg], ...``, which pushes the value ``reg`` had upon
+ function entry onto the DWARF expression stack.
+
+ The next ``(N - 1)`` operations will be part of the ``DW_OP_entry_value``
+ block argument. For example, ``!DIExpression(DW_OP_LLVM_entry_value, 1,
+ DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an expression where
+ the entry value of ``reg`` is pushed onto the stack and 123 is added to it.
+ Due to framework limitations, ``N`` must be 1; in other words,
+ ``DW_OP_entry_value`` always refers to the value/address operand of the
+ instruction.
+
+ Because ``DW_OP_LLVM_entry_value`` is defined in terms of registers, it is
+ usually used in MIR, but it is also allowed in LLVM IR when targeting a
+ :ref:`swiftasync <swiftasync>` argument. The operation is introduced by:
+
+ - ``LiveDebugValues`` pass, which applies it to function parameters that
+ are unmodified throughout the function. Support is limited to simple
+ register location descriptions, or as indirect locations (e.g.,
+ parameters passed-by-value to a callee via a pointer to a temporary copy
+ made in the caller).
+ - ``AsmPrinter`` pass when a call site parameter value
+ (``DW_AT_call_site_parameter_value``) is represented as entry value of
+ the parameter.
+ - ``CoroSplit`` pass, which may move variables from allocas into a
+ coroutine frame. If the coroutine frame is a
+ :ref:`swiftasync <swiftasync>` argument, the variable is described with
+ a ``DW_OP_LLVM_entry_value`` operation.
+
+- ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. It can
+ be used to represent pointer variables which are optimized out but the value
+ it points to is known. This operator is required because it differs from the
+ DWARF operator DW_OP_implicit_pointer in representation and specification
+ (number and types of operands), and the latter cannot be used for multiple
+ levels of indirection.
+- ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
+ value, such as one that calculates the sum of two registers. This is always
+ used in combination with an ordered list of values, such that
+ ``DW_OP_LLVM_arg, N`` refers to the ``N``\ :sup:`th` element in that list. For
+ example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
+ DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
+ ``%reg1 - %reg2``. This list of values should be provided by the containing
+ intrinsic/instruction.
+- ``DW_OP_LLVM_extract_bits_sext, 16, 8`` specifies the offset and size
+ (``16`` and ``8`` here, respectively) of bits that are to be extracted and
+ sign-extended from the value at the top of the expression stack. If the top of
+ the expression stack is a memory location then these bits are extracted from
+ the value pointed to by that memory location. Maps into a ``DW_OP_shl``
+ followed by ``DW_OP_shra``.
+- ``DW_OP_LLVM_extract_bits_zext`` behaves similarly to
+ ``DW_OP_LLVM_extract_bits_sext``, but zero-extends instead of sign-extending.
+ Maps into a ``DW_OP_shl`` followed by ``DW_OP_shr``.
Object lifetimes and scoping
============================
@@ -506,11 +693,11 @@ scope information for the variable ``X``.
isLocal: false, isDefinition: true, scopeLine: 1,
isOptimized: false, retainedNodes: !2)
-Here ``!13`` is metadata providing `location information
-<LangRef.html#dilocation>`_. In this example, scope is encoded by ``!4``, a
-`subprogram descriptor <LangRef.html#disubprogram>`_. This way the location
-information parameter to the records indicates that the variable ``X`` is
-declared at line number 2 at a function level scope in function ``foo``.
+Here ``!13`` is metadata providing :ref:`location information <dilocation>`.
+In this example, scope is encoded by ``!4``, a :ref:`subprogram descriptor
+<disubprogram>`. This way the location information parameter to the records
+indicates that the variable ``X`` is declared at line number 2 at a function
+level scope in function ``foo``.
Now, let's take another example.
@@ -782,8 +969,7 @@ And has the following operands:
location operands, which may take any of the same values as the first
operand of the ``DBG_VALUE`` instruction above. These variable location
operands are inserted into the final DWARF Expression in positions indicated
- by the ``DW_OP_LLVM_arg`` operator in the `DIExpression
- <LangRef.html#diexpression>`_.
+ by the ``DW_OP_LLVM_arg`` operator in the :ref:`diexpression`.
The position at which the DBG_VALUEs are inserted should correspond to the
positions of their matching ``#dbg_value`` records in the IR block. As
From 6546ce7a1feeeb0237f199b6551f041ebc9f2855 Mon Sep 17 00:00:00 2001
From: Scott Linder <Scott.Linder at amd.com>
Date: Wed, 13 Aug 2025 22:46:22 +0000
Subject: [PATCH 2/4] Address review feedback
---
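This revision adds placement rules: `DW_OP_LLVM_fragment` may appear at most once and must be the last opcode, and `DW_OP_stack_value` may appear at most once and must be last, or second-to-last when a fragment is present. The Python sketch below checks those rules as a reading aid. It is not LLVM's verifier. The function name `check_placement` and the representation of an expression as a list of `(opcode, operands...)` tuples are hypothetical.

```python
FRAGMENT = "DW_OP_LLVM_fragment"
STACK_VALUE = "DW_OP_stack_value"


def check_placement(operations):
    """Return a list of rule violations for a list of (opcode, *operands) tuples."""
    names = [op[0] for op in operations]
    errors = []

    if names.count(FRAGMENT) > 1:
        errors.append(f"{FRAGMENT} appears more than once")
    if FRAGMENT in names and names[-1] != FRAGMENT:
        errors.append(f"{FRAGMENT} must be the last opcode")

    if names.count(STACK_VALUE) > 1:
        errors.append(f"{STACK_VALUE} appears more than once")
    if STACK_VALUE in names:
        # Last position normally, second-to-last when a fragment is present.
        expected = len(names) - (2 if FRAGMENT in names else 1)
        if names.index(STACK_VALUE) != expected:
            errors.append(f"{STACK_VALUE} is not in its required position")

    return errors
```

An expression such as `[("DW_OP_plus_uconst", 8), ("DW_OP_stack_value",), ("DW_OP_LLVM_fragment", 16, 8)]` passes with no violations, whereas placing `DW_OP_stack_value` before `DW_OP_plus_uconst` is reported.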
llvm/docs/SourceLevelDebugging.rst | 93 ++++++++++++++++--------------
1 file changed, 51 insertions(+), 42 deletions(-)
diff --git a/llvm/docs/SourceLevelDebugging.rst b/llvm/docs/SourceLevelDebugging.rst
index 96d5de7341d21..bcd64fda00c8b 100644
--- a/llvm/docs/SourceLevelDebugging.rst
+++ b/llvm/docs/SourceLevelDebugging.rst
@@ -398,7 +398,7 @@ This intrinsic is equivalent to ``#dbg_assign``:
.. code-block:: llvm
- #dbg_assign(i32 %i, !1, !DIExpression(), !2,
+ #dbg_assign(i32 %i, !1, !DIExpression(), !2,
ptr %i.addr, !DIExpression(), !3)
call void @llvm.dbg.assign(
metadata i32 %i, metadata !1, metadata !DIExpression(), metadata !2,
@@ -448,32 +448,6 @@ valid debug record.
DIExpressions are always printed and parsed inline; they can never be
referenced by an ID (e.g. ``!1``).
-Examples using ``DW_OP_LLVM_implicit_pointer``:
-
-.. code-block:: text
-
- IR for "*ptr = 4;"
- --------------
- #dbg_value(i32 4, !17, !DIExpression(DW_OP_LLVM_implicit_pointer), !20)
- !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
- type: !18)
- !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
- !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
- !20 = !DILocation(line: 10, scope: !12)
-
- IR for "**ptr = 4;"
- --------------
- #dbg_value(i32 4, !17,
- !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer),
- !21)
- !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
- type: !18)
- !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
- !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
- !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
- !21 = !DILocation(line: 10, scope: !12)
-
-
.. _dwarf-opcodes:
DWARF Opcodes
@@ -485,16 +459,20 @@ vocabulary is limited, but includes at least:
- ``DW_OP_deref`` dereferences the top of the expression stack.
- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
- them together and appends the result to the expression stack.
+ them together and pushes the result to the expression stack.
- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
the last entry from the second last entry and appends the result to the
expression stack.
-- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
+- ``DW_OP_plus_uconst, 93`` adds ``93`` to the value on top of the stack.
- ``DW_OP_swap`` swaps top two stack entries.
- ``DW_OP_xderef`` provides extended dereference mechanism. The entry at the top
of the stack is treated as an address. The second stack entry is treated as an
- address space identifier.
-- ``DW_OP_stack_value`` marks a constant value.
+ address space identifier. The two entries are popped and then an
+ implementation-defined value is pushed on the stack.
+- ``DW_OP_stack_value`` may appear at most once in an expression, and must be
+ the last opcode if ``DW_OP_LLVM_fragment`` is not present, or the second last
+ opcode if ``DW_OP_LLVM_fragment`` is present. It pops the top value of the
+ expression stack and makes an implicit value location with that value.
- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents a content on the provided
signed offset of the specified register. The opcode is only generated by the
``AsmPrinter`` pass to describe call site parameter value which requires an
@@ -521,19 +499,23 @@ defines internal-only opcodes which have no direct analog in DWARF.
encoding information logically belonging to the debug records which use
them.
-- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and
- ``8`` here, respectively) of the variable fragment from the working
- expression. Note that contrary to DW_OP_bit_piece, the offset is describing
- the location within the described source variable. This does not affect the
- semantics of the expression.
+- ``DW_OP_LLVM_fragment, <offset>, <size>`` may appear at most once in an
+ expression, and must be the last opcode. It specifies the bit offset and bit
+ size of the variable fragment being described by the record or intrinsic
+ using the expression. Note that contrary to DW_OP_bit_piece, the offset is
+ describing the location within the described source variable. At DWARF
+ generation time all fragments for the same variable are collected together
+ and DWARF DW_OP_piece and DW_OP_bit_piece opcodes are used to describe a
+ composite with pieces corresponding to the fragments. (This does not affect
+ the semantics of the expression containing it.)
- ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
(``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
that references a base type constructed from the supplied values.
- ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
- optionally applied to the pointer. The memory tag is derived from the
- given tag offset in an implementation-defined manner. This does not affect
- the semantics of the expression.
+ optionally applied to the pointer. The memory tag is derived from the given
+ tag offset in an implementation-defined manner. (This does not affect the
+ semantics of the expression containing it.)
- ``DW_OP_LLVM_entry_value, N`` refers to the value a register had upon
function entry. When targeting DWARF, a ``DBG_VALUE(reg, ...,
DIExpression(DW_OP_LLVM_entry_value, 1, ...)`` is lowered to
@@ -570,12 +552,39 @@ defines internal-only opcodes which have no direct analog in DWARF.
it points to is known. This operator is required as it is different than DWARF
operator DW_OP_implicit_pointer in representation and specification (number
and types of operands) and later can not be used as multiple level.
+
+ Examples using ``DW_OP_LLVM_implicit_pointer``:
+
+ .. code-block:: text
+
+ IR for "*ptr = 4;"
+ --------------
+ #dbg_value(i32 4, !17, !DIExpression(DW_OP_LLVM_implicit_pointer), !20)
+ !17 = !DILocalVariable(name: "ptr", scope: !12, file: !3, line: 5,
+ type: !18)
+ !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
+ !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+ !20 = !DILocation(line: 10, scope: !12)
+
+ IR for "**ptr = 4;"
+ --------------
+ #dbg_value(i32 4, !17,
+ !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer),
+ !21)
+ !17 = !DILocalVariable(name: "ptr", scope: !12, file: !3, line: 5,
+ type: !18)
+ !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
+ !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
+ !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+ !21 = !DILocation(line: 10, scope: !12)
+
- ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
value, such as one that calculates the sum of two registers. This is always
used in combination with an ordered list of values, such that
- ``DW_OP_LLVM_arg, N`` refers to the ``N``\ :sup:`th` element in that list. For
- example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
- DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
+ ``DW_OP_LLVM_arg, N`` refers to the ``N``\ :sup:`th` element in that list.
+ For example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1,
+ DW_OP_minus, DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would
+ evaluate to an implicit value location that has the value of
``%reg1 - reg2``. This list of values should be provided by the containing
intrinsic/instruction.
- ``DW_OP_LLVM_extract_bits_sext, 16, 8,`` specifies the offset and size
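Reviewer-style illustration (not part of the patch): the stack semantics described in the hunks above for ``DW_OP_plus``, ``DW_OP_minus``, ``DW_OP_plus_uconst``, ``DW_OP_swap``, ``DW_OP_LLVM_arg`` and ``DW_OP_stack_value`` can be sketched as a toy evaluator. This is a minimal sketch under stated assumptions, not LLVM's implementation: the ``evaluate`` function, the flat-list encoding of the expression, and the ``("implicit_value", v)`` result tuple are all hypothetical, ``DW_OP_LLVM_fragment`` is ignored, and Python's unbounded integers stand in for target-sized values.

```python
# Toy model of the DIExpression stack operations described in the docs above.
# Hypothetical helper for illustration only; not an LLVM API.

def evaluate(ops, args):
    """Evaluate a flat list such as ["DW_OP_LLVM_arg", 0, "DW_OP_plus_uconst", 93].

    `args` plays the role of the ordered value list that DW_OP_LLVM_arg indexes.
    Returns (kind, value), where kind is "implicit_value" if the expression ends
    in DW_OP_stack_value and "stack_top" otherwise.
    """
    stack = []
    implicit = False
    i = 0
    while i < len(ops):
        op = ops[i]
        if op == "DW_OP_LLVM_arg":
            # Push the N-th element of the value list.
            stack.append(args[ops[i + 1]])
            i += 2
        elif op == "DW_OP_plus_uconst":
            # Add the operand to the value on top of the stack.
            stack.append(stack.pop() + ops[i + 1])
            i += 2
        elif op == "DW_OP_plus":
            last, second_last = stack.pop(), stack.pop()
            stack.append(second_last + last)
            i += 1
        elif op == "DW_OP_minus":
            # Subtract the last entry from the second last entry.
            last, second_last = stack.pop(), stack.pop()
            stack.append(second_last - last)
            i += 1
        elif op == "DW_OP_swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
            i += 1
        elif op == "DW_OP_stack_value":
            # In this sketch (no fragment support) it must be the last opcode.
            if i != len(ops) - 1:
                raise ValueError("DW_OP_stack_value must be the last opcode")
            implicit = True
            i += 1
        else:
            raise ValueError(f"unsupported opcode in this sketch: {op!r}")
    return ("implicit_value" if implicit else "stack_top", stack[-1])


expr = ["DW_OP_LLVM_arg", 0, "DW_OP_LLVM_arg", 1, "DW_OP_minus", "DW_OP_stack_value"]
print(evaluate(expr, [10, 3]))  # ('implicit_value', 7)
```

With the value list ``(10, 3)`` standing in for ``(%reg1, %reg2)``, the expression from the ``DW_OP_LLVM_arg`` paragraph yields an implicit value of ``10 - 3``.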
>From b86562a50dccc19a8fc2bcbc0923a88edc2651bd Mon Sep 17 00:00:00 2001
From: Scott Linder <Scott.Linder at amd.com>
Date: Tue, 19 Aug 2025 22:54:27 +0000
Subject: [PATCH 3/4] Try a new reword of DW_OP_LLVM_entry_value
---
llvm/docs/SourceLevelDebugging.rst | 51 +++++++++++++++++++++++-------
1 file changed, 39 insertions(+), 12 deletions(-)
diff --git a/llvm/docs/SourceLevelDebugging.rst b/llvm/docs/SourceLevelDebugging.rst
index bcd64fda00c8b..8941ced74c91e 100644
--- a/llvm/docs/SourceLevelDebugging.rst
+++ b/llvm/docs/SourceLevelDebugging.rst
@@ -516,21 +516,48 @@ defines internal-only opcodes which have no direct analog in DWARF.
optionally applied to the pointer. The memory tag is derived from the given
tag offset in an implementation-defined manner. (This does not affect the
semantics of the expression containing it.)
-- ``DW_OP_LLVM_entry_value, N`` refers to the value a register had upon
- function entry. When targeting DWARF, a ``DBG_VALUE(reg, ...,
+- ``DW_OP_LLVM_entry_value, N`` evaluates a sub-expression as-if it were
+ evaluated upon entry to the current call frame.
+
+ The sub-expression replaces the operations which comprise it, i.e. all such
+ operations are evaluated only in the frame entry context.
+
+ The sub-expression begins with the operation which immediately precedes
+ ``DW_OP_LLVM_entry_value, N`` in the ``DIExpression``. If no such operation
+ exists (i.e. the expression begins with ``DW_OP_LLVM_entry_value, N``), the
+ implicit operation which pushes the first debug argument of the containing
+ marker/pseudo is used instead. The value ``N`` must always be at least ``1``,
+ as this first operation cannot be omitted and is counted in ``N``.
+
+ The rest of the sub-expression comprises the ``(N - 1)`` operations following
+ ``DW_OP_LLVM_entry_value, N`` in the ``DIExpression``.
+
+ Due to framework limitations:
+
+ - ``N`` must not be greater than ``1``. In other words, ``N`` must equal
+ ``1``, and the sub-expression comprises only the operation immediately
+ preceding ``DW_OP_LLVM_entry_value, N``.
+ - ``DW_OP_LLVM_entry_value, N`` must be either the first operation of a
+ ``DIExpression`` or the second operation if the expression begins with
+ ``DW_OP_LLVM_arg, 0``.
+ - The first operation must refer to a register value.
+
+ Taken together, these limitations mean that ``DW_OP_LLVM_entry_value`` can
+ only currently be used to push the value a single register had on entry to
+ the current stack frame.
+
+ For example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_entry_value, 1,
+ DW_OP_LLVM_arg, 1, DW_OP_plus, DW_OP_stack_value)`` specifies an expression
+ where the entry value of the first argument to the ``DIExpression`` is added
+ to the non-entry value of the second argument, and the result is used as the
+ value for an implicit value location.
+
+ When targeting DWARF, a ``DBG_VALUE(reg, ...,
DIExpression(DW_OP_LLVM_entry_value, 1, ...)`` is lowered to
``DW_OP_entry_value [reg], ...``, which pushes the value ``reg`` had upon
- function entry onto the DWARF expression stack.
+ frame entry onto the DWARF expression stack.
- The next ``(N - 1)`` operations will be part of the ``DW_OP_entry_value``
- block argument. For example, ``!DIExpression(DW_OP_LLVM_entry_value, 1,
- DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an expression where
- the entry value of ``reg`` is pushed onto the stack, and is added with 123.
- Due to framework limitations ``N`` must be 1, in other words,
- ``DW_OP_entry_value`` always refers to the value/address operand of the
- instruction.
-
- Because ``DW_OP_LLVM_entry_value`` is defined in terms of registers, it is
+ Because ``DW_OP_LLVM_entry_value`` is currently limited registers, it is
usually used in MIR, but it is also allowed in LLVM IR when targeting a
:ref:`swiftasync <swiftasync>` argument. The operation is introduced by:
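Reviewer-style illustration (not part of the patch): the syntactic limitations listed above for ``DW_OP_LLVM_entry_value`` can be sketched as a small checker. This is only a sketch under assumptions: the ``check_entry_value`` name and the tuple encoding of operations are hypothetical, and the third limitation (the first operation must refer to a register value) is not checked because it depends on the debug argument rather than on the expression text.

```python
# Sketch of the positional limits on DW_OP_LLVM_entry_value described above.
# Hypothetical helper for illustration only; not LLVM's verifier.

def check_entry_value(ops):
    """Return a list of problems found; an empty list means the limits are met.

    `ops` is a list of tuples, one per operation, e.g.
    [("DW_OP_LLVM_arg", 0), ("DW_OP_LLVM_entry_value", 1), ("DW_OP_plus",)].
    """
    problems = []
    for index, op in enumerate(ops):
        if op[0] != "DW_OP_LLVM_entry_value":
            continue
        n = op[1]
        # N counts the first operation of the sub-expression, and the
        # framework currently allows nothing beyond that first operation.
        if n != 1:
            problems.append(f"operation {index}: N must equal 1, got {n}")
        # Must be first, or second when the expression begins with
        # DW_OP_LLVM_arg, 0.
        is_first = index == 0
        is_second_after_arg0 = index == 1 and ops[0] == ("DW_OP_LLVM_arg", 0)
        if not (is_first or is_second_after_arg0):
            problems.append(
                f"operation {index}: must be first, or second after DW_OP_LLVM_arg, 0"
            )
    return problems


doc_example = [
    ("DW_OP_LLVM_arg", 0),
    ("DW_OP_LLVM_entry_value", 1),
    ("DW_OP_LLVM_arg", 1),
    ("DW_OP_plus",),
    ("DW_OP_stack_value",),
]
print(check_entry_value(doc_example))  # []
```

The example expression from the reworded text passes both checks; an expression with ``N`` of ``2``, or with ``DW_OP_LLVM_entry_value`` after ``DW_OP_LLVM_arg, 1``, would be reported.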
>From c00dce53dbad3ac850e8b4528287afe1e7f46bd7 Mon Sep 17 00:00:00 2001
From: Scott Linder <Scott.Linder at amd.com>
Date: Thu, 21 Aug 2025 18:23:53 +0000
Subject: [PATCH 4/4] fix typo
---
llvm/docs/SourceLevelDebugging.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/llvm/docs/SourceLevelDebugging.rst b/llvm/docs/SourceLevelDebugging.rst
index 8941ced74c91e..8145ba0b9eec1 100644
--- a/llvm/docs/SourceLevelDebugging.rst
+++ b/llvm/docs/SourceLevelDebugging.rst
@@ -557,7 +557,7 @@ defines internal-only opcodes which have no direct analog in DWARF.
``DW_OP_entry_value [reg], ...``, which pushes the value ``reg`` had upon
frame entry onto the DWARF expression stack.
- Because ``DW_OP_LLVM_entry_value`` is currently limited registers, it is
+ Because ``DW_OP_LLVM_entry_value`` is currently limited to registers, it is
usually used in MIR, but it is also allowed in LLVM IR when targeting a
:ref:`swiftasync <swiftasync>` argument. The operation is introduced by: