[llvm] [LangRef] Rework DIExpression docs (PR #153072)
Tony Tye via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 20 17:54:53 PDT 2025
================
@@ -394,12 +398,231 @@ This intrinsic is equivalent to ``#dbg_assign``:
.. code-block:: llvm
- #dbg_assign(i32 %i, !1, !DIExpression(), !2,
+ #dbg_assign(i32 %i, !1, !DIExpression(), !2,
ptr %i.addr, !DIExpression(), !3)
call void @llvm.dbg.assign(
metadata i32 %i, metadata !1, metadata !DIExpression(), metadata !2,
metadata ptr %i.addr, metadata !DIExpression(), metadata !3), !dbg !3
+.. _diexpression:
+
+DIExpression
+------------
+
+Debug expressions are represented as :ref:`specialized-metadata`.
+
+Debug expressions are interpreted left-to-right: start by pushing the
+value/address operand of the record onto a stack, then repeatedly push and
+evaluate opcodes from the DIExpression until the final variable description is
+produced.
+
+The opcodes available in these expressions are described in
+:ref:`dwarf-opcodes` and :ref:`internal-opcodes`.
+
+DWARF specifies three kinds of simple location descriptions: register, memory,
+and implicit location descriptions. Note that a location description is
+defined over certain ranges of a program, i.e. the location of a variable may
+change over the course of the program. Register and memory location
+descriptions describe the *concrete location* of a source variable (in the
+sense that a debugger might modify its value), whereas *implicit locations*
+describe merely the actual *value* of a source variable which might not exist
+in registers or in memory (see ``DW_OP_stack_value``).
+
+A ``#dbg_declare`` record describes an indirect value (the address) of a
+source variable. The first operand of the record must be an address of some
+kind. A DIExpression operand to the record refines this address to produce a
+concrete location for the source variable.
+
+A ``#dbg_value`` record describes the direct value of a source variable.
+The first operand of the record may be a direct or indirect value. A
+DIExpression operand to the record refines the first operand to produce a
+direct value. For example, if the first operand is an indirect value, it may be
+necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
+valid debug record.
+
+.. note::
+
+ A DIExpression is interpreted in the same way regardless of which kind of
+ debug record it's attached to.
+
+ DIExpressions are always printed and parsed inline; they can never be
+ referenced by an ID (e.g. ``!1``).
+
+.. _dwarf-opcodes:
+
+DWARF Opcodes
+^^^^^^^^^^^^^
+
+When possible, LLVM reuses DWARF opcodes and gives them the same semantics in
+LLVM expressions as in DWARF expressions. The currently supported opcode
+vocabulary is limited, but includes at least:
+
+- ``DW_OP_deref`` dereferences the top of the expression stack.
+- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
+ them together and pushes the result to the expression stack.
+- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
+ the last entry from the second last entry and pushes the result to the
+ expression stack.
+- ``DW_OP_plus_uconst, 93`` adds ``93`` to the value on top of the stack.
+- ``DW_OP_swap`` swaps the top two stack entries.
+- ``DW_OP_xderef`` provides an extended dereference mechanism. The entry at the top
+ of the stack is treated as an address. The second stack entry is treated as an
+ address space identifier. The two entries are popped and then an
+ implementation-defined value is pushed on the stack.
+- ``DW_OP_stack_value`` may appear at most once in an expression, and must be
+ the last opcode if ``DW_OP_LLVM_fragment`` is not present, or the second last
+ opcode if ``DW_OP_LLVM_fragment`` is present. It pops the top value of the
+ expression stack and makes an implicit value location with that value.
+- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the content at the provided
+ signed offset from the specified register. The opcode is only generated by the
+ ``AsmPrinter`` pass to describe a call site parameter value that requires an
+ expression over two registers.
+- ``DW_OP_push_object_address`` pushes the address of the object, which can then
+ serve as a descriptor in subsequent calculations. This opcode can be used to
+ calculate the bounds of a Fortran allocatable array, which has an array
+ descriptor.
+- ``DW_OP_over`` duplicates the second entry of the stack and pushes the copy
+ onto the top of the stack. This opcode can be used to calculate the bounds of
+ a Fortran assumed-rank array, whose rank is known only at run time and whose
+ current dimension number is implicitly the first element of the stack.
+
+.. _internal-opcodes:
+
+Internal Opcodes
+^^^^^^^^^^^^^^^^
+
+Where the DWARF equivalent is not suitable, or no DWARF equivalent exists, LLVM
+defines internal-only opcodes which have no direct analog in DWARF.
+
+.. note::
+
+ Some opcodes do not influence the final DWARF expression directly, instead
+ encoding information logically belonging to the debug records which use
+ them.
+
+- ``DW_OP_LLVM_fragment, <offset>, <size>`` may appear at most once in an
+ expression, and must be the last opcode. It specifies the bit offset and bit
+ size of the variable fragment being described by the record or intrinsic
+ using the expression. Note that, unlike ``DW_OP_bit_piece``, the offset
+ describes the location within the described source variable. At DWARF
+ generation time all fragments for the same variable are collected together
+ and DWARF DW_OP_piece and DW_OP_bit_piece opcodes are used to describe a
+ composite with pieces corresponding to the fragments. (This does not affect
+ the semantics of the expression containing it.)
+- ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
+ (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
+ expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
+ that references a base type constructed from the supplied values.
+- ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
+ optionally applied to the pointer. The memory tag is derived from the given
+ tag offset in an implementation-defined manner. (This does not affect the
+ semantics of the expression containing it.)
+- ``DW_OP_LLVM_entry_value, N`` evaluates a sub-expression as-if it were
+ evaluated upon entry to the current call frame.
+
+ The sub-expression replaces the operations which comprise it, i.e. all such
+ operations are evaluated only in the frame entry context.
+
+ The sub-expression begins with the operation which immediately precedes
+ ``DW_OP_LLVM_entry_value, N`` in the ``DIExpression``. If no such operation
+ exists (i.e. the expression begins with ``DW_OP_LLVM_entry_value, N``), the
+ implicit operation which pushes the first debug argument of the containing
+ marker/pseudo is used instead. The value ``N`` must always be at least ``1``,
+ as this first operation cannot be omitted and is counted in ``N``.
+
+ The rest of the sub-expression comprises the ``(N - 1)`` operations following
+ ``DW_OP_LLVM_entry_value, N`` in the ``DIExpression``.
+
+ Due to framework limitations:
+
+ - ``N`` must not be greater than ``1``. In other words, ``N`` must equal
+ ``1``, and the sub-expression comprises only the operation immediately
+ preceding ``DW_OP_LLVM_entry_value, N``.
+ - ``DW_OP_LLVM_entry_value, N`` must be either the first operation of a
+ ``DIExpression`` or the second operation if the expression begins with
+ ``DW_OP_LLVM_arg, 0``.
+ - The first operation must refer to a register value.
+
+ Taken together, these limitations mean that ``DW_OP_LLVM_entry_value`` can
+ only currently be used to push the value a single register had on entry to
+ the current stack frame.
+
+ For example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_entry_value, 1,
+ DW_OP_LLVM_arg, 1, DW_OP_plus, DW_OP_stack_value)`` specifies an expression
+ where the entry value of the first argument to the ``DIExpression`` is added
+ to the non-entry value of the second argument, and the result is used as the
+ value for an implicit value location.
+
+ When targeting DWARF, a ``DBG_VALUE(reg, ...,
+ DIExpression(DW_OP_LLVM_entry_value, 1, ...))`` is lowered to
+ ``DW_OP_entry_value [reg], ...``, which pushes the value ``reg`` had upon
+ frame entry onto the DWARF expression stack.
+
+ Because ``DW_OP_LLVM_entry_value`` is currently limited registers, it is
----------------
t-tye wrote:
limited registers -> limited to registers
https://github.com/llvm/llvm-project/pull/153072
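
For readers following the quoted hunk, the left-to-right stack evaluation it describes can be sketched as a toy model in Python. This is only an illustration, not LLVM's implementation: the function name `evaluate`, the `memory` dictionary standing in for target memory, and the flat list encoding of opcodes are all assumptions made for this sketch. `DW_OP_constu` is a standard DWARF opcode that does not appear in the quoted list; it is included here only so the examples can push a second stack entry.

```python
def evaluate(operand, ops, memory=None):
    """Toy model: evaluate a flat opcode list left to right on a stack.

    Returns (value, is_implicit). `memory` is a hypothetical dict that maps
    addresses to values and stands in for target memory.
    """
    memory = memory or {}
    stack = [operand]  # the record's value/address operand is pushed first
    is_implicit = False
    i = 0
    while i < len(ops):
        op = ops[i]
        if op == "DW_OP_constu":        # not in the quoted list; pushes a constant
            i += 1
            stack.append(ops[i])
        elif op == "DW_OP_plus_uconst":  # add the inline constant to the top entry
            i += 1
            stack[-1] += ops[i]
        elif op == "DW_OP_deref":        # replace the top entry with what it points to
            stack.append(memory[stack.pop()])
        elif op == "DW_OP_plus":
            last, second_last = stack.pop(), stack.pop()
            stack.append(second_last + last)
        elif op == "DW_OP_minus":        # second last entry minus last entry
            last, second_last = stack.pop(), stack.pop()
            stack.append(second_last - last)
        elif op == "DW_OP_swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "DW_OP_over":         # copy the second entry onto the top
            stack.append(stack[-2])
        elif op == "DW_OP_stack_value":  # result is a value, not a location
            is_implicit = True
        else:
            raise ValueError(f"opcode not modelled in this sketch: {op}")
        i += 1
    return stack[-1], is_implicit


# Usage: an indirect first operand needs DW_OP_deref to yield a direct value.
value, is_implicit = evaluate(
    0x1000, ["DW_OP_deref", "DW_OP_stack_value"], memory={0x1000: 42}
)
```

The sketch deliberately omits `DW_OP_LLVM_fragment`, `DW_OP_LLVM_entry_value`, and register-based opcodes, since modelling them would need a notion of variables, frames, and registers that goes beyond a single stack.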