[llvm] 1eac2c5 - [AMDGPU] Move DWARF proposal to separate file
Michael Kruse via llvm-commits
llvm-commits at lists.llvm.org
Tue May 12 07:53:57 PDT 2020
I think this broke the docs build.
Also see
http://lab.llvm.org:8011/builders/llvm-sphinx-docs/builds/44974/steps/docs-llvm-html/logs/stdio
[1/1] Generating html Sphinx documentation for llvm into
"/home/meinersbur/build/llvm-project/release/docs/html"
FAILED: docs/CMakeFiles/docs-llvm-html
Warning, treated as error:
/home/meinersbur/src/llvm-project/llvm/docs/AMDGPUDwarfProposalForHeterogeneousDebugging.rst:394:duplicate
label amdgpu-dwarf-expressions, other instance in
/home/meinersbur/src/llvm-project/llvm/docs/AMDGPUUsage.rst
ninja: build stopped: subcommand failed.
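In case it is useful for catching this kind of clash before the bot does, here is a minimal sketch of a duplicate-label check. It only looks for explicit targets of the form ".. _name:" and is not how Sphinx itself detects them; the function names (collect_labels, find_duplicate_labels, read_rst_tree) are made up for illustration, and comparing names case-insensitively is an assumption of the sketch.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches an explicit reStructuredText label such as ".. _my-label:".
# Targets followed by a URL on the same line are deliberately not matched.
LABEL_RE = re.compile(r"^\s*\.\. _([^:`]+):\s*$")


def collect_labels(sources):
    """Map each label to the (file name, line number) pairs that define it.

    `sources` maps a file name to that file's text.
    """
    seen = defaultdict(list)
    for name, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = LABEL_RE.match(line)
            if match:
                # Assumption of this sketch: compare names case-insensitively.
                seen[match.group(1).strip().lower()].append((name, lineno))
    return seen


def find_duplicate_labels(sources):
    """Return only the labels that are defined more than once."""
    return {
        label: places
        for label, places in collect_labels(sources).items()
        if len(places) > 1
    }


def read_rst_tree(docs_dir):
    """Read every .rst file under docs_dir into a {path: text} mapping."""
    return {
        str(path): path.read_text(encoding="utf-8")
        for path in sorted(Path(docs_dir).rglob("*.rst"))
    }
```

Something like find_duplicate_labels(read_rst_tree("llvm/docs")) should then list each label defined more than once, together with the files and line numbers involved.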
On Wed, Apr 15, 2020 at 16:18, via llvm-commits
<llvm-commits at lists.llvm.org> wrote:
>
>
> Author: Tony
> Date: 2020-04-15T17:19:39-04:00
> New Revision: 1eac2c55d861dfc6d88308ad97c242cbd60e5da1
>
> URL: https://github.com/llvm/llvm-project/commit/1eac2c55d861dfc6d88308ad97c242cbd60e5da1
> DIFF: https://github.com/llvm/llvm-project/commit/1eac2c55d861dfc6d88308ad97c242cbd60e5da1.diff
>
> LOG: [AMDGPU] Move DWARF proposal to separate file
>
> - Move DWARF proposal for heterogeneous debugging to a separate file.
> - Add references.
>
> Differential Revision: https://reviews.llvm.org/D70523
>
> Added:
> llvm/docs/AMDGPUDwarfProposalForHeterogeneousDebugging.rst
>
> Modified:
> llvm/docs/AMDGPUUsage.rst
> llvm/docs/UserGuides.rst
>
> Removed:
>
>
>
> ################################################################################
> diff --git a/llvm/docs/AMDGPUDwarfProposalForHeterogeneousDebugging.rst b/llvm/docs/AMDGPUDwarfProposalForHeterogeneousDebugging.rst
> new file mode 100644
> index 000000000000..537359fec55c
> --- /dev/null
> +++ b/llvm/docs/AMDGPUDwarfProposalForHeterogeneousDebugging.rst
> @@ -0,0 +1,3783 @@
> +.. _amdgpu-dwarf-6-proposal-for-heterogeneous-debugging:
> +
> +====================================================
> +DWARF Version 6 Proposal For Heterogeneous Debugging
> +====================================================
> +
> +.. contents::
> + :local:
> +
> +.. warning::
> +
> + This section describes a **provisional proposal** for DWARF Version 6
> + [:ref:`DWARF <amdgpu-dwarf-DWARF>`] to support heterogeneous debugging. It is
> + not currently fully implemented and is subject to change.
> +
> +Introduction
> +------------
> +
> +This document proposes a set of backwards compatible extensions to DWARF Version
> +5 [:ref:`DWARF <amdgpu-dwarf-DWARF>`] for consideration of inclusion into a
> +future DWARF Version 6 standard to support heterogeneous debugging.
> +
> +The remainder of this section provides motivation for each proposed feature in
> +terms of heterogeneous debugging on commercially available AMD GPU hardware
> +(AMDGPU). The goal is to add support to the AMD [:ref:`AMD <amdgpu-dwarf-AMD>`]
> +open source Radeon Open Compute Platform (ROCm) [:ref:`AMD-ROCm
> +<amdgpu-dwarf-AMD-ROCm>`] which is an implementation of the industry standard
> +for heterogeneous computing devices defined by the Heterogeneous System
> +Architecture (HSA) Foundation [:ref:`HSA <amdgpu-dwarf-HSA>`]. ROCm includes the
> +LLVM compiler [:ref:`LLVM <amdgpu-dwarf-LLVM>`] with upstreamed support for
> +AMDGPU [:ref:`AMDGPU-LLVM <amdgpu-dwarf-AMDGPU-LLVM>`]. The goal is to also add
> +the GDB debugger [:ref:`GDB <amdgpu-dwarf-GDB>`] with upstreamed support for
> +AMDGPU [:ref:`AMD-ROCgdb <amdgpu-dwarf-AMD-ROCgdb>`]. In addition, the goal is
> +to work with third parties to enable support for AMDGPU debugging in the GCC
> +compiler [:ref:`GCC <amdgpu-dwarf-GCC>`] and the Perforce TotalView HPC debugger
> +[:ref:`Perforce-TotalView <amdgpu-dwarf-Perforce-TotalView>`].
> +
> +However, the proposal is intended to be vendor and architecture neutral. It is
> +believed to apply to other heterogeneous hardware devices including GPUs, DSPs,
> +FPGAs, and other specialized hardware. These collectively include similar
> +characteristics and requirements as AMDGPU devices. Parts of the proposal can
> +also apply to traditional CPU hardware that supports large vector registers.
> +Compilers can map source languages and extensions that describe large scale
> +parallel execution onto the lanes of the vector registers. This is common in
> +programming languages used in ML and HPC. The proposal also includes improved
> +support for optimized code on any architecture. Some of the generalizations may
> +also benefit other issues that have been raised.
> +
> +The proposal has evolved through collaboration with many individuals and active
> +prototyping within the GDB debugger and LLVM compiler. Input has also been very
> +much appreciated from the developers working on the Perforce TotalView HPC
> +Debugger and GCC compiler.
> +
> +The AMDGPU has several features that require additional DWARF functionality in
> +order to support optimized code.
> +
> +AMDGPU optimized code may spill vector registers to non-global address space
> +memory, and this spilling may be done only for lanes that are active on entry
> +to the subprogram. To support this, a location description that can be created
> +as a masked select is required. See ``DW_OP_LLVM_select_bit_piece``.
> +
> +Since the active lane mask may be held in a register, a way to get the value
> +of a register on entry to a subprogram is required. To support this an
> +operation that returns the caller value of a register as specified by the Call
> +Frame Information (CFI) is required. See ``DW_OP_LLVM_call_frame_entry_reg``
> +and :ref:`amdgpu-dwarf-call-frame-information`.
> +
> +Current DWARF uses an empty expression to indicate an undefined location
> +description. Since the masked select composite location description operation
> +takes more than one location description, it is necessary to have an explicit
> +way to specify an undefined location description. Otherwise it is not possible
> +to specify that a particular one of the input location descriptions is
> +undefined. See ``DW_OP_LLVM_undefined``.
> +
> +CFI describes restoring callee saved registers that are spilled. Currently CFI
> +only allows a location description that is a register, memory address, or
> +implicit location description. AMDGPU optimized code may spill scalar
> +registers into portions of vector registers. This requires extending CFI to
> +allow any location description. See
> +:ref:`amdgpu-dwarf-call-frame-information`.
> +
> +The vector registers of the AMDGPU are represented as their full wavefront
> +size, meaning the wavefront size times the dword size. This reflects the
> +actual hardware and allows the compiler to generate DWARF for languages that
> +map a thread to the complete wavefront. It also allows more efficient DWARF to
> +be generated to describe the CFI as only a single expression is required for
> +the whole vector register, rather than a separate expression for each lane's
> +dword of the vector register. It also allows the compiler to produce DWARF
> +that indexes the vector register if it spills scalar registers into portions
> +of a vector register.
> +
> +Since DWARF stack value entries have a base type and AMDGPU registers are a
> +vector of dwords, the ability to specify that a base type is a vector is
> +required. See ``DW_AT_LLVM_vector_size``.
> +
> +If the source language is mapped onto the AMDGPU wavefronts in a SIMT manner,
> +then the variable DWARF location expressions must compute the location for a
> +single lane of the wavefront. Therefore, a DWARF operation is required to
> +denote the current lane, much like ``DW_OP_push_object_address`` denotes the
> +current object. The ``DW_OP_*piece`` operations only allow literal indices.
> +Therefore, a way to use a computed offset of an arbitrary location description
> +(such as a vector register) is required. See ``DW_OP_LLVM_push_lane``,
> +``DW_OP_LLVM_offset``, ``DW_OP_LLVM_offset_constu``, and
> +``DW_OP_LLVM_bit_offset``.
> +
> +If the source language is mapped onto the AMDGPU wavefronts in a SIMT manner
> +the compiler can use the AMDGPU execution mask register to control which lanes
> +are active. To describe the conceptual location of non-active lanes a DWARF
> +expression is needed that can compute a per lane PC. For efficiency, this is
> +done for the wavefront as a whole. This expression benefits by having a masked
> +select composite location description operation. This requires an attribute
> +for source location of each lane. The AMDGPU may update the execution mask for
> +whole wavefront operations and so needs an attribute that computes the current
> +active lane mask. See ``DW_OP_LLVM_select_bit_piece``, ``DW_OP_LLVM_extend``,
> +``DW_AT_LLVM_lane_pc``, and ``DW_AT_LLVM_active_lane``.
> +
> +AMDGPU needs to be able to describe addresses that are in different kinds of
> +memory. Optimized code may need to describe a variable that resides in pieces
> +that are in different kinds of storage which may include parts of registers,
> +memory that is in a mixture of memory kinds, implicit values, or be undefined.
> +DWARF has the concept of segment addresses. However, the segment cannot be
> +specified within a DWARF expression, which is only able to specify the offset
> +portion of a segment address. The segment index is only provided by the entity
> +that specifies the DWARF expression. Therefore, the segment index is a
> +property that can only be put on complete objects, such as a variable. That
> +makes it only suitable for describing an entity (such as variable or
> +subprogram code) that is in a single kind of memory. Therefore, AMDGPU uses
> +the DWARF concept of address spaces. For example, a variable may be allocated
> +in a register that is partially spilled to the call stack which is in the
> +private address space, and partially spilled to the local address space.
> +
> +DWARF uses the concept of an address in many expression operations but does not
> +define how it relates to address spaces. For example,
> +``DW_OP_push_object_address`` pushes the address of an object. Other contexts
> +implicitly push an address on the stack before evaluating an expression. For
> +example, the ``DW_AT_use_location`` attribute of the
> +``DW_TAG_ptr_to_member_type``. The expression that uses the address needs to
> +do so in a general way and not need to be dependent on the address space of
> +the address. For example, a pointer to member value may want to be applied to
> +an object that may reside in any address space.
> +
> +The number of registers and the cost of memory operations is much higher for
> +AMDGPU than a typical CPU. The compiler attempts to optimize whole variables
> +and arrays into registers. Currently DWARF only allows
> +``DW_OP_push_object_address`` and related operations to work with a global
> +memory location. To support AMDGPU optimized code it is required to generalize
> +DWARF to allow any location description to be used. This allows registers, or
> +composite location descriptions that may be a mixture of memory, registers, or
> +even implicit values.
> +
> +DWARF Version 5 does not allow location descriptions to be entries on the
> +DWARF stack. They can only be the final result of the evaluation of a DWARF
> +expression. However, by allowing a location description to be a first-class
> +entry on the DWARF stack it becomes possible to compose expressions containing
> +both values and location descriptions naturally. It allows objects to be
> +located in any kind of memory address space, in registers, be implicit values,
> +be undefined, or a composite of any of these. By extending DWARF carefully,
> +all existing DWARF expressions can retain their current semantic meaning.
> +DWARF has implicit conversions that convert from a value that represents an
> +address in the default address space to a memory location description. This
> +can be extended to allow a default address space memory location description
> +to be implicitly converted back to its address value. This allows all DWARF
> +Version 5 expressions to retain their same meaning, while adding the ability
> +to explicitly create memory location descriptions in non-default address
> +spaces and generalizing the power of composite location descriptions to any
> +kind of location description. See :ref:`amdgpu-dwarf-operation-expressions`.
> +
> +To allow composition of composite location descriptions, an explicit operation
> +that indicates the end of the definition of a composite location description
> +is required. This can be implied if the end of a DWARF expression is reached,
> +allowing current DWARF expressions to remain legal. See
> +``DW_OP_LLVM_piece_end``.
> +
> +The ``DW_OP_plus`` and ``DW_OP_minus`` can be defined to operate on a memory
> +location description in the default target architecture specific address space
> +and a generic type value to produce an updated memory location description.
> +This allows them to continue to be used to offset an address. To generalize
> +offsetting to any location description, including location descriptions that
> +describe when bytes are in registers, are implicit, or a composite of these,
> +the ``DW_OP_LLVM_offset``, ``DW_OP_LLVM_offset_constu`` and
> +``DW_OP_LLVM_bit_offset`` operations are added. These do not perform wrapping
> +which would be hard to define for location descriptions of non-memory kinds.
> +This allows ``DW_OP_push_object_address`` to push a location description that
> +may be in a register, or be an implicit value, and the DWARF expression of
> +``DW_TAG_ptr_to_member_type`` can contain ``DW_OP_LLVM_offset`` to offset
> +within it. ``DW_OP_LLVM_bit_offset`` generalizes DWARF to work with bit fields
> +which is not possible in DWARF Version 5.
> +
> +The DWARF ``DW_OP_xderef*`` operations allow a value to be converted into an
> +address of a specified address space which is then read. But it provides no
> +way to create a memory location description for an address in the non-default
> +address space. For example, AMDGPU variables can be allocated in the local
> +address space at a fixed address. It is required to have an operation to
> +create an address in a specific address space that can be used to define the
> +location description of the variable. Defining this operation to produce a
> +location description allows the size of addresses in an address space to be
> +larger than the generic type. See ``DW_OP_LLVM_form_aspace_address``.
> +
> +If the ``DW_OP_LLVM_form_aspace_address`` operation had to produce a value
> +that can be implicitly converted to a memory location description, then it
> +would be limited to the size of the generic type which matches the size of the
> +default address space. Its value would be unspecified and likely not match any
> +value in the actual program. By making the result a location description, it
> +allows a consumer great freedom in how it implements it. The implicit
> +conversion back to a value can be limited only to the default address space to
> +maintain compatibility with DWARF Version 5. For other address spaces the
> +producer can use the new operations that explicitly specify the address space.
> +
> +``DW_OP_breg*`` treats the register as containing an address in the default
> +address space. It is required to be able to specify the address space of the
> +register value. See ``DW_OP_LLVM_aspace_bregx``.
> +
> +Similarly, ``DW_OP_implicit_pointer`` treats its implicit pointer value as
> +being in the default address space. It is required to be able to specify the
> +address space of the pointer value. See
> +``DW_OP_LLVM_aspace_implicit_pointer``.
> +
> +Almost all uses of addresses in DWARF are limited to defining location
> +descriptions, or to be dereferenced to read memory. The exception is
> +``DW_CFA_val_offset`` which uses the address to set the value of a register.
> +By defining the CFA DWARF expression as being a memory location description,
> +it can maintain what address space it is, and that can be used to convert the
> +offset address back to an address in that address space. See
> +:ref:`amdgpu-dwarf-call-frame-information`.
> +
> +This approach allows all existing DWARF to have the identical semantics. It
> +allows the compiler to explicitly specify the address space it is using. For
> +example, a compiler could choose to access private memory in a swizzled manner
> +when mapping a source language to a wavefront in a SIMT manner, or to access
> +it in an unswizzled manner if mapping the same language with the wavefront
> +being the thread. It also allows the compiler to mix the address space it uses
> +to access private memory. For example, for SIMT it can still spill entire
> +vector registers in an unswizzled manner, while using a swizzled private
> +memory for SIMT variable access. This approach allows memory location
> +descriptions for different address spaces to be combined using the regular
> +``DW_OP_*piece`` operations.
> +
> +Location descriptions are an abstraction of storage; they give freedom to the
> +consumer on how to implement them. They allow the address space to encode lane
> +information so they can be used to read memory with only the memory
> +description and no extra arguments. The same set of operations can operate on
> +locations independent of their kind of storage. The ``DW_OP_deref*`` therefore
> +can be used on any storage kind. ``DW_OP_xderef*`` is unnecessary except to
> +become a more compact way to convert a non-default address space address
> +followed by dereferencing it.
> +
> +In DWARF Version 5 a location description is defined as a single location
> +description or a location list. A location list is defined as either
> +effectively an undefined location description or as one or more single
> +location descriptions to describe an object with multiple places. The
> +``DW_OP_push_object_address`` and ``DW_OP_call*`` operations can put a
> +location description on the stack. Furthermore, debugger information entry
> +attributes such as ``DW_AT_data_member_location``, ``DW_AT_use_location``, and
> +``DW_AT_vtable_elem_location`` are defined as pushing a location description
> +on the expression stack before evaluating the expression. However, DWARF
> +Version 5 only allows the stack to contain values and so only a single memory
> +address can be on the stack which makes these incapable of handling location
> +descriptions with multiple places, or places other than memory. Since this
> +proposal allows the stack to contain location descriptions, the operations are
> +generalized to support location descriptions that can have multiple places.
> +This is backwards compatible with DWARF Version 5 and allows objects with
> +multiple places to be supported. For example, the expression that describes
> +how to access the field of an object can be evaluated with a location
> +description that has multiple places and will result in a location description
> +with multiple places as expected. With this change, the separate DWARF Version
> +5 sections that described DWARF expressions and location lists have been
> +unified into a single section that describes DWARF expressions in general.
> +This unification seems to be a natural consequence and a necessity of allowing
> +location descriptions to be part of the evaluation stack.
> +
> +For those familiar with the definition of location descriptions in DWARF
> +Version 5, the definition in this proposal is presented differently, but does
> +in fact define the same concept with the same fundamental semantics. However,
> +it does so in a way that allows the concept to extend to support address
> +spaces, bit addressing, the ability for composite location descriptions to be
> +composed of any kind of location description, and the ability to support
> +objects located at multiple places. Collectively these changes expand the set
> +of processors that can be supported and improve support for optimized code.
> +
> +Several approaches were considered, and the one proposed appears to be the
> +cleanest and offers the greatest improvement of DWARF's ability to support
> +optimized code. Examining the GDB debugger and LLVM compiler, it appears only
> +to require modest changes as they both already have to support general use of
> +location descriptions. It is anticipated that will also be the case for other
> +debuggers and compilers.
> +
> +As an experiment, GDB was modified to evaluate DWARF Version 5 expressions
> +with location descriptions as stack entries and implicit conversions. All GDB
> +tests have passed, except one that turned out to be an invalid test by DWARF
> +Version 5 rules. The code in GDB actually became simpler as all evaluation was
> +on the stack and there was no longer a need to maintain a separate structure
> +for the location description result. This gives confidence of the backwards
> +compatibility.
> +
> +Since the AMDGPU supports languages such as OpenCL [:ref:`OpenCL
> +<amdgpu-dwarf-OpenCL>`], there is a need to define source language address
> +classes so they can be used in a consistent way by consumers. It would also be
> +desirable to add support for using them in defining language types rather than
> +the current target architecture specific address spaces. See
> +:ref:`amdgpu-dwarf-segment_addresses`.
> +
> +A ``DW_AT_LLVM_augmentation`` attribute is added to a compilation unit
> +debugger information entry to indicate that there is additional target
> +architecture specific information in the debugging information entries of that
> +compilation unit. This allows a consumer to know what extensions are present
> +in the debugger information entries as is possible with the augmentation
> +string of other sections. The format that should be used for the augmentation
> +string in the lookup by name table and CFI Common Information Entry is also
> +recommended to allow a consumer to parse the string when it contains
> +information from multiple vendors.
> +
> +The AMDGPU supports programming languages that include online compilation
> +where the source text may be created at runtime. Therefore, a way to embed the
> +source text in the debug information is required. For example, the OpenCL
> +language runtime supports online compilation. See
> +:ref:`amdgpu-dwarf-line-number-information`.
> +
> +Support to allow MD5 checksums to be optionally present in the line table is
> +added. This allows linking together compilation units where some have MD5
> +checksums and some do not. In DWARF Version 5 the file timestamp and file size
> +can be optional, but if the MD5 checksum is present it must be valid for all
> +files. See :ref:`amdgpu-dwarf-line-number-information`.
> +
> +Support is added for the HIP programming language [:ref:`HIP
> +<amdgpu-dwarf-HIP>`] which is supported by the AMDGPU. See
> +:ref:`amdgpu-dwarf-language-names`.
> +
> +The following sections provide the definitions for the additional operations,
> +as well as clarifying how existing expression operations, CFI operations, and
> +attributes behave with respect to generalized location descriptions that
> +support address spaces and location descriptions that support multiple places.
> +It has been defined such that it is backwards compatible with DWARF Version 5.
> +The definitions are intended to fully define well-formed DWARF in a consistent
> +style based on the DWARF Version 5 specification. Non-normative text is shown
> +in *italics*.
> +
> +The names for the new operations, attributes, and constants include "\
> +``LLVM``\ " and are encoded with vendor specific codes so this proposal can be
> +implemented as an LLVM vendor extension to DWARF Version 5. If accepted these
> +names would not include the "\ ``LLVM``\ " and would not use encodings in the
> +vendor range.
> +
> +The proposal is organized to follow the section ordering of DWARF Version 5.
> +It includes notes to indicate the corresponding DWARF Version 5 sections to
> +which they pertain. Other notes describe additional changes that may be worth
> +considering, and to raise questions.
> +
> +General Description
> +-------------------
> +
> +Attribute Types
> +~~~~~~~~~~~~~~~
> +
> +.. note::
> +
> + This augments DWARF Version 5 section 2.2 and Table 2.2.
> +
> +The following table provides the additional attributes. See
> +:ref:`amdgpu-dwarf-debugging-information-entry-attributes`.
> +
> +.. table:: Attribute names
> + :name: amdgpu-dwarf-attribute-names-table
> +
> + =========================== ====================================
> + Attribute Usage
> + =========================== ====================================
> + ``DW_AT_LLVM_active_lane`` SIMD or SIMT active lanes
> + ``DW_AT_LLVM_augmentation`` Compilation unit augmentation string
> + ``DW_AT_LLVM_lane_pc`` SIMD or SIMT lane program location
> + ``DW_AT_LLVM_lanes`` SIMD or SIMT thread lane count
> + ``DW_AT_LLVM_vector_size`` Base type vector size
> + =========================== ====================================
> +
> +.. _amdgpu-dwarf-expressions:
> +
> +DWARF Expressions
> +~~~~~~~~~~~~~~~~~
> +
> +.. note::
> +
> + This section, and its nested sections, replaces DWARF Version 5 section 2.5 and
> + section 2.6. The new proposed DWARF expression operations are defined as well
> + as clarifying the extensions to already existing DWARF Version 5 operations. It is
> + based on the text of the existing DWARF Version 5 standard.
> +
> +DWARF expressions describe how to compute a value or specify a location.
> +
> +*The evaluation of a DWARF expression can provide the location of an object, the
> +value of an array bound, the length of a dynamic string, the desired value
> +itself, and so on.*
> +
> +The evaluation of a DWARF expression can either result in a value or a location
> +description:
> +
> +*value*
> +
> + A value has a type and a literal value. It can represent a literal value of
> + any supported base type of the target architecture. The base type specifies
> + the size and encoding of the literal value.
> +
> + .. note::
> +
> + It may be desirable to add an implicit pointer base type encoding. It would
> + be used for the type of the value that is produced when the ``DW_OP_deref*``
> + operation retrieves the full contents of an implicit pointer location
> + storage created by the ``DW_OP_implicit_pointer`` or
> + ``DW_OP_LLVM_aspace_implicit_pointer`` operations. The literal value would
> + record the debugging information entry and byte displacement specified by the
> + associated ``DW_OP_implicit_pointer`` or
> + ``DW_OP_LLVM_aspace_implicit_pointer`` operations.
> +
> + Instead of a base type, a value can have a distinguished generic type, which
> + is an integral type that has the size of an address in the target architecture
> + default address space and unspecified signedness.
> +
> + *The generic type is the same as the unspecified type used for stack
> + operations defined in DWARF Version 4 and before.*
> +
> + An integral type is a base type that has an encoding of ``DW_ATE_signed``,
> + ``DW_ATE_signed_char``, ``DW_ATE_unsigned``, ``DW_ATE_unsigned_char``,
> + ``DW_ATE_boolean``, or any target architecture defined integral encoding in
> + the inclusive range ``DW_ATE_lo_user`` to ``DW_ATE_hi_user``.
> +
> + .. note::
> +
> + It is unclear if ``DW_ATE_address`` is an integral type. GDB does not seem
> + to consider it as integral.
> +
> +*location description*
> +
> + *Debugging information must provide consumers a way to find the location of
> + program variables, determine the bounds of dynamic arrays and strings, and
> + possibly to find the base address of a subprogram’s stack frame or the return
> + address of a subprogram. Furthermore, to meet the needs of recent computer
> + architectures and optimization techniques, debugging information must be able
> + to describe the location of an object whose location changes over the object’s
> + lifetime, and may reside at multiple locations simultaneously during parts of
> + an object's lifetime.*
> +
> + Information about the location of program objects is provided by location
> + descriptions.
> +
> + Location descriptions can consist of one or more single location descriptions.
> +
> + A single location description specifies the location storage that holds a
> + program object and a position within the location storage where the program
> + object starts. The position within the location storage is expressed as a bit
> + offset relative to the start of the location storage.
> +
> + A location storage is a linear stream of bits that can hold values. Each
> + location storage has a size in bits and can be accessed using a zero-based bit
> + offset. The ordering of bits within a location storage uses the bit numbering
> + and direction conventions that are appropriate to the current language on the
> + target architecture.
> +
> + There are five kinds of location storage:
> +
> + *memory location storage*
> + Corresponds to the target architecture memory address spaces.
> +
> + *register location storage*
> + Corresponds to the target architecture registers.
> +
> + *implicit location storage*
> + Corresponds to fixed values that can only be read.
> +
> + *undefined location storage*
> + Indicates no value is available and therefore cannot be read or written.
> +
> + *composite location storage*
> + Allows a mixture of these where some bits come from one location storage and
> + some from another location storage, or from disjoint parts of the same
> + location storage.
> +
> + .. note::
> +
> + It may be better to add an implicit pointer location storage kind used by
> + the ``DW_OP_implicit_pointer`` and ``DW_OP_LLVM_aspace_implicit_pointer``
> + operations. It would specify the debugger information entry and byte offset
> + provided by the operations.
> +
> + *Location descriptions are a language independent representation of addressing
> + rules. They are created using DWARF operation expressions of arbitrary
> + complexity. They can be the result of evaluating a debugger information entry
> + attribute that specifies an operation expression. In this usage they can
> + describe the location of an object as long as its lifetime is either static or
> + the same as the lexical block (see DWARF Version 5 section 3.5) that owns it,
> + and it does not move during its lifetime. They can be the result of evaluating
> + a debugger information entry attribute that specifies a location list
> + expression. In this usage they can describe the location of an object that has
> + a limited lifetime, changes its location during its lifetime, or has multiple
> + locations over part or all of its lifetime.*
> +
> + If a location description has more than one single location description, the
> + DWARF expression is ill-formed if the object value held in each single
> + location description's position within the associated location storage is not
> + the same value, except for the parts of the value that are uninitialized.
> +
> + *A location description that has more than one single location description can
> + only be created by a location list expression that has overlapping program
> + location ranges, or certain expression operations that act on a location
> + description that has more than one single location description. There are no
> + operation expression operations that can directly create a location
> + description with more than one single location description.*
> +
> + *A location description with more than one single location description can be
> + used to describe objects that reside in more than one piece of storage at the
> + same time. An object may have more than one location as a result of
> + optimization. For example, a value that is only read may be promoted from
> + memory to a register for some region of code, but later code may revert to
> + reading the value from memory as the register may be used for other purposes.
> + For the code region where the value is in a register, any change to the object
> + value must be made in both the register and the memory so both regions of code
> + will read the updated value.*
> +
> + *A consumer of a location description with more than one single location
> + description can read the object's value from any of the single location
> + descriptions (since they all refer to location storage that has the same
> + value), but must write any changed value to all the single location
> + descriptions.*
> +
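A minimal sketch of the read-any / write-all rule quoted above, in Python. It assumes a hypothetical model in which each single location description is a dict with a `value` key; none of these names come from the proposal, and the exception for uninitialized parts of the value is ignored.

```python
def read_object(single_locations):
    """Read the object from any one single location description.

    The proposal makes the expression ill-formed if the locations disagree,
    so this sketch checks that before returning the first one.
    """
    values = {loc["value"] for loc in single_locations}
    if len(values) > 1:
        raise ValueError("ill-formed: single location descriptions disagree")
    return single_locations[0]["value"]


def write_object(single_locations, new_value):
    """Write a changed value to every single location description."""
    for loc in single_locations:
        loc["value"] = new_value


# Hypothetical example: a value promoted to a register but also kept in memory.
register = {"name": "register", "value": 41}
memory = {"name": "memory", "value": 41}
locations = [register, memory]

write_object(locations, 42)
current = read_object(locations)
```

Writing to only one of the two entries would leave the locations inconsistent, which is the situation the quoted rule is meant to prevent.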
> +A DWARF expression can either be encoded as an operation expression (see
> +:ref:`amdgpu-dwarf-operation-expressions`), or as a location list expression
> +(see :ref:`amdgpu-dwarf-location-list-expressions`).
> +
> +A DWARF expression is evaluated in the context of:
> +
> +*A current subprogram*
> + This may be used in the evaluation of register access operations to support
> + virtual unwinding of the call stack (see
> + :ref:`amdgpu-dwarf-call-frame-information`).
> +
> +*A current program location*
> + This may be used in the evaluation of location list expressions to select
> + amongst multiple program location ranges. It should be the program location
> + corresponding to the current subprogram. If the current subprogram was reached
> + by virtual call stack unwinding, then the program location will correspond to
> + the associated call site.
> +
> +*An initial stack*
> + This is a list of values or location descriptions that will be pushed on the
> + operation expression evaluation stack in the order provided before evaluation
> + of an operation expression starts.
> +
> + Some debugger information entries have attributes that evaluate their DWARF
> + expression value with initial stack entries. In all other cases the initial
> + stack is empty.
> +
> +When a DWARF expression is evaluated, it may be specified whether a value or
> +location description is required as the result kind.
> +
> +If a result kind is specified, and the result of the evaluation does not match
> +the specified result kind, then the implicit conversions described in
> +:ref:`amdgpu-dwarf-memory-location-description-operations` are performed if
> +valid. Otherwise, the DWARF expression is ill-formed.
> +
> +.. _amdgpu-dwarf-operation-expressions:
> +
> +DWARF Operation Expressions
> ++++++++++++++++++++++++++++
> +
> +An operation expression is comprised of a stream of operations, each consisting
> +of an opcode followed by zero or more operands. The number of operands is
> +implied by the opcode.
> +
> +Operations represent a postfix operation on a simple stack machine. Each stack
> +entry can hold either a value or a location description. Operations can act on
> +entries on the stack, including adding entries and removing entries. If the kind
> +of a stack entry does not match the kind required by the operation and is not
> +implicitly convertible to the required kind (see
> +:ref:`amdgpu-dwarf-memory-location-description-operations`), then the DWARF
> +operation expression is ill-formed.
> +
> +Evaluation of an operation expression starts with an empty stack on which the
> +entries from the initial stack provided by the context are pushed in the order
> +provided. Then the operations are evaluated, starting with the first operation
> +of the stream, until one past the last operation of the stream is reached. The
> +result of the evaluation is:
> +
> +* If evaluation of the DWARF expression requires a location description, then:
> +
> + * If the stack is empty, the result is a location description with one
> + undefined location description.
> +
> + *This rule is for backwards compatibility with DWARF Version 5 which has no
> + explicit operation to create an undefined location description, and uses an
> + empty operation expression for this purpose.*
> +
> + * If the top stack entry is a location description, or can be converted
> + to one, then the result is that, possibly converted, location description.
> + Any other entries on the stack are discarded.
> +
> + * Otherwise the DWARF expression is ill-formed.
> +
> + .. note::
> +
> + Could define this case as returning an implicit location description as
> + if the ``DW_OP_implicit`` operation is performed.
> +
> +* If evaluation of the DWARF expression requires a value, then:
> +
> + * If the top stack entry is a value, or can be converted to one, then the
> + result is that, possibly converted, value. Any other entries on the stack
> + are discarded.
> +
> + * Otherwise the DWARF expression is ill-formed.
> +
> +* If evaluation of the DWARF expression does not specify if a value or location
> + description is required, then:
> +
> + * If the stack is empty, the result is a location description with one
> + undefined location description.
> +
> + *This rule is for backwards compatibility with DWARF Version 5 which has no
> + explicit operation to create an undefined location description, and uses an
> + empty operation expression for this purpose.*
> +
> + .. note::
> +
> + This rule is consistent with the rule above for when a location
> + description is requested. However, GDB appears to report this as an error
> + and no GDB tests appear to cause an empty stack for this case.
> +
> + * Otherwise, the top stack entry is returned. Any other entries on the stack
> + are discarded.
> +
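The result rules quoted above can be sketched roughly as follows. This is an illustration only: `Value`, `Location`, and `select_result` are hypothetical names, the stack is a Python list with the top at the end, and the implicit conversions between values and location descriptions are deliberately left out, so a kind mismatch is treated as ill-formed here.

```python
from dataclasses import dataclass


@dataclass
class Value:
    data: int


@dataclass
class Location:
    kind: str  # hypothetical tag such as "memory", "register", "undefined"


def select_result(stack, required=None):
    """Pick the evaluation result; required is None, "value", or "location"."""
    if not stack:
        if required == "value":
            raise ValueError("ill-formed: a value is required but the stack is empty")
        # Backwards compatibility with DWARF Version 5: an empty stack yields
        # a location description with one undefined location description.
        return Location("undefined")
    top = stack[-1]  # any other entries on the stack are discarded
    if required == "location" and not isinstance(top, Location):
        raise ValueError("ill-formed: a location description is required")
    if required == "value" and not isinstance(top, Value):
        raise ValueError("ill-formed: a value is required")
    return top
```

A fuller model would attempt the implicit conversion before raising in the two mismatch branches.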
> +An operation expression is encoded as a byte block with some form of prefix that
> +specifies the byte count. It can be used:
> +
> +* as the value of a debugging information entry attribute that is encoded using
> + class ``exprloc`` (see DWARF Version 5 section 7.5.5),
> +
> +* as the operand to certain operation expression operations,
> +
> +* as the operand to certain call frame information operations (see
> + :ref:`amdgpu-dwarf-call-frame-information`),
> +
> +* and in location list entries (see
> + :ref:`amdgpu-dwarf-location-list-expressions`).
> +
> +.. _amdgpu-dwarf-stack-operations:
> +
> +Stack Operations
> +################
> +
> +The following operations manipulate the DWARF stack. Operations that index the
> +stack assume that the top of the stack (most recently added entry) has index 0.
> +They allow the stack entries to be either a value or location description.
> +
> +If any stack entry accessed by a stack operation is an incomplete composite
> +location description, then the DWARF expression is ill-formed.
> +
> +.. note::
> +
> + These operations now support stack entries that are values and location
> + descriptions.
> +
> +.. note::
> +
> + If it is desired to also make them work with incomplete composite location
> + descriptions, then would need to define that the composite location storage
> + specified by the incomplete composite location description is also replicated
> + when a copy is pushed. This ensures that each copy of the incomplete composite
> + location description can update the composite location storage they specify
> + independently.
> +
> +1. ``DW_OP_dup``
> +
> + ``DW_OP_dup`` duplicates the stack entry at the top of the stack.
> +
> +2. ``DW_OP_drop``
> +
> + ``DW_OP_drop`` pops the stack entry at the top of the stack and discards it.
> +
> +3. ``DW_OP_pick``
> +
> + ``DW_OP_pick`` has a single unsigned 1-byte operand that represents an index
> + I. A copy of the stack entry with index I is pushed onto the stack.
> +
> +4. ``DW_OP_over``
> +
> + ``DW_OP_over`` pushes a copy of the entry with index 1.
> +
> + *This is equivalent to a* ``DW_OP_pick 1`` *operation.*
> +
> +5. ``DW_OP_swap``
> +
> + ``DW_OP_swap`` swaps the top two stack entries. The entry at the top of the
> + stack becomes the second stack entry, and the second stack entry becomes the
> + top of the stack.
> +
> +6. ``DW_OP_rot``
> +
> + ``DW_OP_rot`` rotates the first three stack entries. The entry at the top of
> + the stack becomes the third stack entry, the second entry becomes the top of
> + the stack, and the third entry becomes the second entry.
> +
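For reference, the six stack operations above behave like this small Python model. It is a sketch under assumptions: the stack is a plain list whose last element is the top (index 0 in the proposal's numbering), entries are opaque, and the function names are made up for illustration. The rule about incomplete composite location descriptions is not modelled.

```python
def dw_op_dup(stack):
    stack.append(stack[-1])


def dw_op_drop(stack):
    stack.pop()


def dw_op_pick(stack, index):
    # Index 0 is the top of the stack, i.e. the last list element.
    stack.append(stack[-1 - index])


def dw_op_over(stack):
    dw_op_pick(stack, 1)


def dw_op_swap(stack):
    stack[-1], stack[-2] = stack[-2], stack[-1]


def dw_op_rot(stack):
    top, second, third = stack[-1], stack[-2], stack[-3]
    # Top becomes third, second becomes top, third becomes second.
    stack[-1], stack[-2], stack[-3] = second, third, top


rotated = ["a", "b", "c"]  # "c" is the top of the stack
dw_op_rot(rotated)

picked = [10, 20, 30]
dw_op_over(picked)
```

After these calls `rotated` is `["c", "a", "b"]` and `picked` is `[10, 20, 30, 20]`.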
> +.. _amdgpu-dwarf-control-flow-operations:
> +
> +Control Flow Operations
> +#######################
> +
> +The following operations provide simple control of the flow of a DWARF operation
> +expression.
> +
> +1. ``DW_OP_nop``
> +
> + ``DW_OP_nop`` is a place holder. It has no effect on the DWARF stack
> + entries.
> +
> +2. ``DW_OP_le``, ``DW_OP_ge``, ``DW_OP_eq``, ``DW_OP_lt``, ``DW_OP_gt``,
> + ``DW_OP_ne``
> +
> + .. note::
> +
> + The same as in DWARF Version 5 section 2.5.1.5.
> +
> +3. ``DW_OP_skip``
> +
> + ``DW_OP_skip`` is an unconditional branch. Its single operand is a 2-byte
> + signed integer constant. The 2-byte constant is the number of bytes of the
> + DWARF expression to skip forward or backward from the current operation,
> + beginning after the 2-byte constant.
> +
> + If the updated position is at one past the end of the last operation, then
> + the operation expression evaluation is complete.
> +
> + Otherwise, the DWARF expression is ill-formed if the updated operation
> + position is not in the range of the first to last operation inclusive, or
> + not at the start of an operation.
> +
> +4. ``DW_OP_bra``
> +
> + ``DW_OP_bra`` is a conditional branch. Its single operand is a 2-byte signed
> + integer constant. This operation pops the top of stack. If the value popped
> + is not the constant 0, the 2-byte constant operand is the number of bytes of
> + the DWARF operation expression to skip forward or backward from the current
> + operation, beginning after the 2-byte constant.
> +
> + If the updated position is at one past the end of the last operation, then
> + the operation expression evaluation is complete.
> +
> + Otherwise, the DWARF expression is ill-formed if the updated operation
> + position is not in the range of the first to last operation inclusive, or
> + not at the start of an operation.
> +
> +5. ``DW_OP_call2, DW_OP_call4, DW_OP_call_ref``
> +
> + ``DW_OP_call2``, ``DW_OP_call4``, and ``DW_OP_call_ref`` perform DWARF
> + procedure calls during evaluation of a DWARF expression.
> +
> + ``DW_OP_call2`` and ``DW_OP_call4`` have one operand that is a 2- or 4-byte
> + unsigned offset, respectively, of a debugging information entry D in the
> + current compilation unit.
> +
> + ``DW_OP_call_ref`` has one operand that is a 4-byte unsigned value in
> + the 32-bit DWARF format, or an 8-byte unsigned value in the 64-bit DWARF
> + format, that represents an offset of a debugging information entry D in a
> + ``.debug_info`` section, which may be contained in an executable or shared
> + object file other than that containing the operation. For references from one
> + executable or shared object file to another, the relocation must be
> + performed by the consumer.
> +
> + *Operand interpretation of* ``DW_OP_call2``\ *,* ``DW_OP_call4``\ *, and*
> + ``DW_OP_call_ref`` *is exactly like that for* ``DW_FORM_ref2``\ *,*
> + ``DW_FORM_ref4``\ *, and* ``DW_FORM_ref_addr``\ *, respectively.*
> +
> + The call operation is evaluated by:
> +
> + * If D has a ``DW_AT_location`` attribute that is encoded as an ``exprloc``
> + that specifies an operation expression E, then execution of the current
> + operation expression continues from the first operation of E. Execution
> + continues until one past the last operation of E is reached, at which
> + point execution continues with the operation following the call operation.
> + Since E is evaluated on the same stack as the call, E can use, add, and/or
> + remove entries already on the stack.
> +
> + *Values on the stack at the time of the call may be used as parameters by
> + the called expression and values left on the stack by the called expression
> + may be used as return values by prior agreement between the calling and
> + called expressions.*
> +
> + * If D has a ``DW_AT_location`` attribute that is encoded as a ``loclist`` or
> + ``loclistsptr``, then the specified location list expression E is
> + evaluated, and the resulting location description is pushed on the stack.
> + The evaluation of E uses a context that has the same current frame and
> + current program location as the current operation expression, but an empty
> + initial stack.
> +
> + .. note::
> +
> + This rule avoids having to define how to execute a matched location list
> + entry operation expression on the same stack as the call when there are
> + multiple matches. But it allows the call to obtain the location
> + description for a variable or formal parameter which may use a location
> + list expression.
> +
> + An alternative is to treat the case when D has a ``DW_AT_location``
> + attribute that is encoded as a ``loclist`` or ``loclistsptr``, and the
> + specified location list expression E' matches a single location list
> + entry with operation expression E, the same as the ``exprloc`` case and
> + evaluate on the same stack.
> +
> + But this is not attractive as if the attribute is for a variable that
> + happens to end with a non-singleton stack, it will not simply put a
> + location description on the stack. Presumably the intent of using
> + ``DW_OP_call*`` on a variable or formal parameter debugger information
> + entry is to push just one location description on the stack. That
> + location description may have more than one single location description.
> +
> + The previous rule for ``exprloc`` also has the same problem as normally
> + a variable or formal parameter location expression may leave multiple
> + entries on the stack and only return the top entry.
> +
> + GDB implements ``DW_OP_call*`` by always executing E on the same stack.
> + If the location list has multiple matching entries, it simply picks the
> + first one and ignores the rest. This seems fundamentally at odds with
> + the desire to support multiple places for variables.
> +
> + So, it feels like ``DW_OP_call*`` should both support pushing a location
> + description on the stack for a variable or formal parameter, and also
> + support being able to execute an operation expression on the same stack.
> + Being able to specify a different operation expression for different
> + program locations seems a desirable feature to retain.
> +
> + A solution to that is to have a distinct ``DW_AT_LLVM_proc`` attribute
> + for the ``DW_TAG_dwarf_procedure`` debugging information entry. Then the
> + ``DW_AT_location`` attribute expression is always executed separately
> + and pushes a location description (that may have multiple single
> + location descriptions), and the ``DW_AT_LLVM_proc`` attribute expression
> + is always executed on the same stack and can leave anything on the
> + stack.
> +
> + The ``DW_AT_LLVM_proc`` attribute could have the new classes
> + ``exprproc``, ``loclistproc``, and ``loclistsptrproc`` to indicate that
> + the expression is executed on the same stack. ``exprproc`` is the same
> + encoding as ``exprloc``. ``loclistproc`` and ``loclistsptrproc`` are the
> + same encoding as their non-\ ``proc`` counterparts except the DWARF is
> + ill-formed if the location list does not match exactly one location list
> + entry and a default entry is required. These forms indicate explicitly
> + that the matched single operation expression must be executed on the
> + same stack. This is better than ad hoc special rules for ``loclistproc``
> + and ``loclistsptrproc`` which are currently clearly defined to always
> + return a location description. The producer then explicitly indicates
> + the intent through the attribute classes.
> +
> + Such a change would be a breaking change for how GDB implements
> + ``DW_OP_call*``. However, are the breaking cases actually occurring in
> + practice? GDB could implement the current approach for DWARF Version 5,
> + and the new semantics for DWARF Version 6 which has been done for some
> + other features.
> +
> + Another option is to limit the execution to be on the same stack only to
> + the evaluation of an expression E that is the value of a
> + ``DW_AT_location`` attribute of a ``DW_TAG_dwarf_procedure`` debugging
> + information entry. The DWARF would be ill-formed if E is a location list
> + expression that does not match exactly one location list entry. In all
> + other cases the evaluation of an expression E that is the value of a
> + ``DW_AT_location`` attribute would evaluate E with a context that has
> + the same current frame and current program location as the current
> + operation expression, but an empty initial stack, and push the resulting
> + location description on the stack.
> +
> + * If D has a ``DW_AT_const_value`` attribute with a value V, then it is as
> + if a ``DW_OP_implicit_value V`` operation was executed.
> +
> + *This allows a call operation to be used to compute the location
> + description for any variable or formal parameter regardless of whether the
> + producer has optimized it to a constant. This is consistent with the
> + ``DW_OP_implicit_pointer`` operation.*
> +
> + .. note::
> +
> + Alternatively, could deprecate using ``DW_AT_const_value`` for
> + ``DW_TAG_variable`` and ``DW_TAG_formal_parameter`` debugger information
> + entries that are constants and instead use ``DW_AT_location`` with an
> + operation expression that results in a location description with one
> + implicit location description. Then this rule would not be required.
> +
> + * Otherwise, there is no effect and no changes are made to the stack.
> +
> + .. note::
> +
> + In DWARF Version 5, if D does not have a ``DW_AT_location`` then
> + ``DW_OP_call*`` is defined to have no effect. It is unclear that this is
> + the right definition as a producer should be able to rely on using
> + ``DW_OP_call*`` to get a location description for any non-\
> + ``DW_TAG_dwarf_procedure`` debugging information entries. Also, the
> + producer should not be creating DWARF with ``DW_OP_call*`` to a
> + ``DW_TAG_dwarf_procedure`` that does not have a ``DW_AT_location``
> + attribute. So, should this case be defined as an ill-formed DWARF
> + expression?
> +
> + *The* ``DW_TAG_dwarf_procedure`` *debugging information entry can be used to
> + define DWARF procedures that can be called.*
> +
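The branch rules for ``DW_OP_skip`` and ``DW_OP_bra`` in the list above can be sketched as below. The assumptions are mine: the expression is a `bytes` object, the 2-byte operand is little-endian (DWARF uses the target byte order), and `op_starts` is a hypothetical precomputed set of byte offsets at which operations begin. Returning `None` stands for "evaluation is complete".

```python
import struct


def _checked(expr, pos, op_starts):
    if pos == len(expr):
        return None  # one past the last operation: evaluation is complete
    if pos not in op_starts:
        raise ValueError("ill-formed: branch target is not the start of an operation")
    return pos


def branch_target(expr, operand_pos, op_starts):
    """Resolve a branch whose 2-byte signed operand starts at operand_pos."""
    (delta,) = struct.unpack_from("<h", expr, operand_pos)
    # The offset is counted from the byte after the 2-byte constant.
    return _checked(expr, operand_pos + 2 + delta, op_starts)


def dw_op_skip(expr, operand_pos, op_starts):
    return branch_target(expr, operand_pos, op_starts)


def dw_op_bra(stack, expr, operand_pos, op_starts):
    condition = stack.pop()
    if condition != 0:
        return branch_target(expr, operand_pos, op_starts)
    return _checked(expr, operand_pos + 2, op_starts)  # fall through


# DW_OP_bra (0x28) with offset +1, followed by two DW_OP_nop (0x96).
bra_expr = bytes([0x28, 0x01, 0x00, 0x96, 0x96])
bra_starts = {0, 3, 4}
taken = dw_op_bra([1], bra_expr, 1, bra_starts)
not_taken = dw_op_bra([0], bra_expr, 1, bra_starts)
```

Membership in `op_starts` covers both conditions from the text at once: the target must lie within the first to last operation and must be the start of an operation.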
> +.. _amdgpu-dwarf-value-operations:
> +
> +Value Operations
> +################
> +
> +This section describes the operations that push values on the stack.
> +
> +Each value stack entry has a type and a literal value and can represent a
> +literal value of any supported base type of the target architecture. The base
> +type specifies the size and encoding of the literal value.
> +
> +Instead of a base type, value stack entries can have a distinguished generic
> +type, which is an integral type that has the size of an address in the target
> +architecture default address space and unspecified signedness.
> +
> +*The generic type is the same as the unspecified type used for stack operations
> +defined in DWARF Version 4 and before.*
> +
> +An integral type is a base type that has an encoding of ``DW_ATE_signed``,
> +``DW_ATE_signed_char``, ``DW_ATE_unsigned``, ``DW_ATE_unsigned_char``,
> +``DW_ATE_boolean``, or any target architecture defined integral encoding in the
> +inclusive range ``DW_ATE_lo_user`` to ``DW_ATE_hi_user``.
> +
> +.. note::
> +
> + Unclear if ``DW_ATE_address`` is an integral type. GDB does not seem to
> + consider it as integral.
> +
> +.. _amdgpu-dwarf-literal-operations:
> +
> +Literal Operations
> +^^^^^^^^^^^^^^^^^^
> +
> +The following operations all push a literal value onto the DWARF stack.
> +
> +Operations other than ``DW_OP_const_type`` push a value V with the generic type.
> +If V is larger than the generic type, then V is truncated to the generic type
> +size and the low-order bits used.
> +
> +1. ``DW_OP_lit0``, ``DW_OP_lit1``, ..., ``DW_OP_lit31``
> +
> + ``DW_OP_lit<N>`` operations encode an unsigned literal value N from 0
> + through 31, inclusive. They push the value N with the generic type.
> +
> +2. ``DW_OP_const1u``, ``DW_OP_const2u``, ``DW_OP_const4u``, ``DW_OP_const8u``
> +
> + ``DW_OP_const<N>u`` operations have a single operand that is a 1, 2, 4, or
> + 8-byte unsigned integer constant U, respectively. They push the value U with
> + the generic type.
> +
> +3. ``DW_OP_const1s``, ``DW_OP_const2s``, ``DW_OP_const4s``, ``DW_OP_const8s``
> +
> + ``DW_OP_const<N>s`` operations have a single operand that is a 1, 2, 4, or
> + 8-byte signed integer constant S, respectively. They push the value S with
> + the generic type.
> +
> +4. ``DW_OP_constu``
> +
> + ``DW_OP_constu`` has a single unsigned LEB128 integer operand N. It pushes
> + the value N with the generic type.
> +
> +5. ``DW_OP_consts``
> +
> + ``DW_OP_consts`` has a single signed LEB128 integer operand N. It pushes the
> + value N with the generic type.
> +
> +6. ``DW_OP_constx``
> +
> + ``DW_OP_constx`` has a single unsigned LEB128 integer operand that
> + represents a zero-based index into the ``.debug_addr`` section relative to
> + the value of the ``DW_AT_addr_base`` attribute of the associated compilation
> + unit. The value N in the ``.debug_addr`` section has the size of the generic
> + type. It pushes the value N with the generic type.
> +
> + *The* ``DW_OP_constx`` *operation is provided for constants that require
> + link-time relocation but should not be interpreted by the consumer as a
> + relocatable address (for example, offsets to thread-local storage).*
> +
> +7. ``DW_OP_const_type``
> +
> + ``DW_OP_const_type`` has three operands. The first is an unsigned LEB128
> + integer that represents the offset of a debugging information entry D in the
> + current compilation unit, that provides the type of the constant value. The
> + second is a 1-byte unsigned integral constant S. The third is a block of
> + bytes B, with a length equal to S.
> +
> + T is the bit size of the type D. The least significant T bits of B are
> + interpreted as a value V of the type D. It pushes the value V with the type
> + D.
> +
> + The DWARF is ill-formed if D is not a ``DW_TAG_base_type`` debugging
> + information entry, or if T divided by 8 (the byte size) and rounded up to a
> + whole number is not equal to S.
> +
> + *While the size of the byte block B can be inferred from the type D
> + definition, it is encoded explicitly into the operation so that the
> + operation can be parsed easily without reference to the* ``.debug_info``
> + *section.*
> +
> +8. ``DW_OP_LLVM_push_lane`` *New*
> +
> + ``DW_OP_LLVM_push_lane`` pushes a value with the generic type that is the
> + target architecture specific lane identifier of the thread of execution for
> + which a user presented expression is currently being evaluated.
> +
> + *For languages that are implemented using a SIMD or SIMT execution model,
> + this is the lane number that corresponds to the source language thread of
> + execution upon which the user is focused.*
> +
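Since ``DW_OP_constu`` and ``DW_OP_consts`` take LEB128 operands and push them with the generic type, a short decoder may help readers follow the text. This is a sketch, not taken from the proposal: the function names are hypothetical, the generic type is assumed to be 64 bits wide, and a pushed value is represented as an unsigned bit pattern because the generic type has unspecified signedness.

```python
def decode_uleb128(data, pos=0):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):  # high bit clear marks the last byte
            return result, pos


def decode_sleb128(data, pos=0):
    """Decode a signed LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):
            if byte & 0x40:  # sign bit of the final byte
                result -= 1 << shift
            return result, pos


def to_generic(value, generic_bits=64):
    """Truncate to the generic type size, keeping the low-order bits."""
    return value & ((1 << generic_bits) - 1)


unsigned_value, _ = decode_uleb128(bytes([0xE5, 0x8E, 0x26]))
signed_value, _ = decode_sleb128(bytes([0xC0, 0xBB, 0x78]))
```

Here `unsigned_value` is 624485 and `signed_value` is -123456; `to_generic(signed_value)` gives the bit pattern that would be pushed under the 64-bit assumption.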
> +.. _amdgpu-dwarf-arithmetic-logical-operations:
> +
> +Arithmetic and Logical Operations
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +.. note::
> +
> + This section is the same as DWARF Version 5 section 2.5.1.4.
> +
> +.. _amdgpu-dwarf-type-conversions-operations:
> +
> +Type Conversion Operations
> +^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +.. note::
> +
> + This section is the same as DWARF Version 5 section 2.5.1.6.
> +
> +.. _amdgpu-dwarf-general-operations:
> +
> +Special Value Operations
> +^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +The following special value operations are currently defined:
> +
> +1. ``DW_OP_regval_type``
> +
> + ``DW_OP_regval_type`` has two operands. The first is an unsigned LEB128
> + integer that represents a register number R. The second is an unsigned
> + LEB128 integer that represents the offset of a debugging information entry D
> + in the current compilation unit, that provides the type of the register
> + value.
> +
> + The contents of register R are interpreted as a value V of the type D. The
> + value V is pushed on the stack with the type D.
> +
> + The DWARF is ill-formed if D is not a ``DW_TAG_base_type`` debugging
> + information entry, or if the size of type D is not the same as the size of
> + register R.
> +
> + .. note::
> +
> + Should DWARF allow the type D to be a different size to the size of the
> + register R? Requiring them to be the same bit size avoids any issue of
> + conversion as the bit contents of the register are simply interpreted as a
> + value of the specified type. If a conversion is wanted it can be done
> + explicitly using a ``DW_OP_convert`` operation.
> +
> + GDB has a per register hook that allows a target specific conversion on a
> + register by register basis. It defaults to truncation of bigger registers,
> + and to actually reading bytes from the next register (or reads out of
> + bounds for the last register) for smaller registers. There are no GDB
> + tests that read a register out of bounds (except an illegal hand written
> + assembly test).
> +
> +2. ``DW_OP_deref``
> +
> + The ``DW_OP_deref`` operation pops one stack entry that must be a location
> + description L.
> +
> + A value of the bit size of the generic type is retrieved from the location
> + storage specified by L. The value V retrieved is pushed on the stack with
> + the generic type.
> +
> + If any bit of the value is retrieved from the undefined location storage, or
> + the offset of any bit exceeds the size of the location storage specified by
> + L, then the DWARF expression is ill-formed.
> +
> + See :ref:`amdgpu-dwarf-implicit-location-descriptions` for special rules
> + concerning implicit location descriptions created by the
> + ``DW_OP_implicit_pointer`` and ``DW_OP_LLVM_implicit_aspace_pointer``
> + operations.
> +
> + *If L, or the location description of any composite location description
> + part that is a subcomponent of L, has more than one single location
> + description, then any one of them can be selected as they are required to
> + all have the same value. For any single location description SL, bits are
> + retrieved from the associated storage location starting at the bit offset
> + specified by SL. For a composite location description, the retrieved bits
> + are the concatenation of the N bits from each composite location part PL,
> + where N is limited to the size of PL.*
> +
> +3. ``DW_OP_deref_size``
> +
> + ``DW_OP_deref_size`` has a single 1-byte unsigned integral constant that
> + represents a byte result size S.
> +
> + It pops one stack entry that must be a location description L.
> +
> + T is the smaller of the generic type size and S scaled by 8 (the byte size).
> + A value V of T bits is retrieved from the location storage specified by L.
> + If V is smaller than the size of the generic type, V is zero-extended to the
> + generic type size. V is pushed onto the stack with the generic type.
> +
> + The DWARF expression is ill-formed if any bit of the value is retrieved from
> + the undefined location storage, or if the offset of any bit exceeds the size
> + of the location storage specified by L.
> +
> + .. note::
> +
> + Truncating the value when S is larger than the generic type matches what
> + GDB does. This allows the generic type size to not be an integral byte
> + size. It does allow S to be arbitrarily large. Should S be restricted to
> + the size of the generic type rounded up to a multiple of 8?
> +
> + See :ref:`amdgpu-dwarf-implicit-location-descriptions` for special rules
> + concerning implicit location descriptions created by the
> + ``DW_OP_implicit_pointer`` and ``DW_OP_LLVM_implicit_aspace_pointer``
> + operations.
> +
> +4. ``DW_OP_deref_type``
> +
> + ``DW_OP_deref_type`` has two operands. The first is a 1-byte unsigned
> + integral constant S. The second is an unsigned LEB128 integer that
> + represents the offset of a debugging information entry D in the current
> + compilation unit, that provides the type of the result value.
> +
> + It pops one stack entry that must be a location description L. T is the bit
> + size of the type D. A value V of T bits is retrieved from the location
> + storage specified by L. V is pushed on the stack with the type D.
> +
> + The DWARF is ill-formed if D is not a ``DW_TAG_base_type`` debugging
> + information entry, if T divided by 8 (the byte size) and rounded up to a
> + whole number is not equal to S, if any bit of the value is retrieved from the
> + undefined location storage, or if the offset of any bit exceeds the size of
> + the location storage specified by L.
> +
> + See :ref:`amdgpu-dwarf-implicit-location-descriptions` for special rules
> + concerning implicit location descriptions created by the
> + ``DW_OP_implicit_pointer`` and ``DW_OP_LLVM_implicit_aspace_pointer``
> + operations.
> +
> + *While the size of the pushed value V can be inferred from the type D
> + definition, it is encoded explicitly into the operation so that the
> + operation can be parsed easily without reference to the* ``.debug_info``
> + *section.*
> +
> + .. note::
> +
> + It is unclear why the operand S is needed. Unlike ``DW_OP_const_type``,
> + the size is not needed for parsing. Any evaluation needs to get the base
> + type to record with the value to know its encoding and bit size.
> +
> + This definition allows the base type to be a bit size since there seems no
> + reason to restrict it.
> +
> +5. ``DW_OP_xderef`` *Deprecated*
> +
> + ``DW_OP_xderef`` pops two stack entries. The first must be an integral type
> + value that represents an address A. The second must be an integral type
> + value that represents a target architecture specific address space
> + identifier AS.
> +
> + The operation is equivalent to performing ``DW_OP_swap;
> + DW_OP_LLVM_form_aspace_address; DW_OP_deref``. The value V retrieved is left
> + on the stack with the generic type.
> +
> + *This operation is deprecated as the* ``DW_OP_LLVM_form_aspace_address``
> + *operation can be used and provides greater expressiveness.*
> +
> +6. ``DW_OP_xderef_size`` *Deprecated*
> +
> + ``DW_OP_xderef_size`` has a single 1-byte unsigned integral constant that
> + represents a byte result size S.
> +
> + It pops two stack entries. The first must be an integral type value that
> + represents an address A. The second must be an integral type value that
> + represents a target architecture specific address space identifier AS.
> +
> + The operation is equivalent to performing ``DW_OP_swap;
> + DW_OP_LLVM_form_aspace_address; DW_OP_deref_size S``. The zero-extended
> + value V retrieved is left on the stack with the generic type.
> +
> + *This operation is deprecated as the* ``DW_OP_LLVM_form_aspace_address``
> + *operation can be used and provides greater expressiveness.*
> +
> +7. ``DW_OP_xderef_type`` *Deprecated*
> +
> + ``DW_OP_xderef_type`` has two operands. The first is a 1-byte unsigned
> + integral constant S. The second operand is an unsigned LEB128
> + integer R that represents the offset of a debugging information entry D in
> + the current compilation unit, that provides the type of the result value.
> +
> + It pops two stack entries. The first must be an integral type value that
> + represents an address A. The second must be an integral type value that
> + represents a target architecture specific address space identifier AS.
> +
> + The operation is equivalent to performing ``DW_OP_swap;
> + DW_OP_LLVM_form_aspace_address; DW_OP_deref_type S R``. The value V
> + retrieved is left on the stack with the type D.
> +
> + *This operation is deprecated as the* ``DW_OP_LLVM_form_aspace_address``
> + *operation can be used and provides greater expressiveness.*
> +
> +8. ``DW_OP_entry_value`` *Deprecated*
> +
> + ``DW_OP_entry_value`` pushes the value that the described location held upon
> + entering the current subprogram.
> +
> + It has two operands. The first is an unsigned LEB128 integer S. The second
> + is a block of bytes, with a length equal to S, interpreted as a DWARF
> + operation expression E.
> +
> + E is evaluated as if it had been evaluated upon entering the current
> + subprogram with an empty initial stack.
> +
> + .. note::
> +
> + It is unclear what this means. What is the current program location and
> + current frame that must be used? Does this require reverse execution so
> + the register and memory state are as they were on entry to the current
> + subprogram?
> +
> + The DWARF expression is ill-formed if the evaluation of E executes a
> + ``DW_OP_push_object_address`` operation.
> +
> + If the result of E is a location description with one register location
> + description (see :ref:`amdgpu-dwarf-register-location-descriptions`),
> + ``DW_OP_entry_value`` pushes the value that register had upon entering the
> + current subprogram. The value entry type is the target architecture register
> + base type. If the register value is undefined or the register location
> + description bit offset is not 0, then the DWARF expression is ill-formed.
> +
> + *The register location description provides a more compact form for the case
> + where the value was in a register on entry to the subprogram.*
> +
> + If the result of E is a value V, ``DW_OP_entry_value`` pushes V on the
> + stack.
> +
> + Otherwise, the DWARF expression is ill-formed.
> +
> + *The values needed to evaluate* ``DW_OP_entry_value`` *could be obtained in
> + several ways. The consumer could suspend execution on entry to the
> + subprogram, record values needed by* ``DW_OP_entry_value`` *expressions
> + within the subprogram, and then continue. When evaluating*
> + ``DW_OP_entry_value``\ *, the consumer would use these recorded values
> + rather than the current values. Or, when evaluating* ``DW_OP_entry_value``\
> + *, the consumer could virtually unwind using the Call Frame Information
> + (see* :ref:`amdgpu-dwarf-call-frame-information`\ *) to recover register
> + values that might have been clobbered since the subprogram entry point.*
> +
> + *The* ``DW_OP_entry_value`` *operation is deprecated as its main usage is
> + provided by other means. DWARF Version 5 added the*
> + ``DW_TAG_call_site_parameter`` *debugger information entry for call sites
> + that has* ``DW_AT_call_value``\ *,* ``DW_AT_call_data_location``\ *, and*
> + ``DW_AT_call_data_value`` *attributes that provide DWARF expressions to
> + compute actual parameter values at the time of the call, and requires the
> + producer to ensure the expressions are valid to evaluate even when virtually
> + unwound. The* ``DW_OP_LLVM_call_frame_entry_reg`` *operation provides access
> + to registers in the virtually unwound calling frame.*
> +
> + .. note::
> +
> + It is unclear why this operation is defined this way. How would a consumer
> + know what values have to be saved on entry to the subprogram? Does it have
> + to parse every expression of every ``DW_OP_entry_value`` operation to
> + capture all the possible results needed? Or does it have to implement
> + reverse execution so it can evaluate the expression in the context of the
> + entry of the subprogram so it can obtain the entry point register and
> + memory values? Or does the compiler somehow instruct the consumer how to
> + create the saved copies of the variables on entry?
> +
> + If the expression is simply using existing variables, then it is just a
> + regular expression and no special operation is needed. If the main purpose
> + is only to read the entry value of a register using CFI then it would be
> + better to have an operation that explicitly does just that such as the
> + proposed ``DW_OP_LLVM_call_frame_entry_reg`` operation.
> +
> + GDB only seems to implement ``DW_OP_entry_value`` when E is exactly
> + ``DW_OP_reg*`` or ``DW_OP_breg*; DW_OP_deref*``. It evaluates E in the
> + context of the calling subprogram and the calling call site program
> + location. But the wording suggests that is not the intention.
> +
> + Given these issues it is suggested ``DW_OP_entry_value`` is deprecated in
> + favor of using the new facilities that have well defined semantics and
> + implementations.
> +
> +.. _amdgpu-dwarf-location-description-operations:
> +
> +Location Description Operations
> +###############################
> +
> +This section describes the operations that push location descriptions on the
> +stack.
> +
> +General Location Description Operations
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +1. ``DW_OP_LLVM_offset`` *New*
> +
> + ``DW_OP_LLVM_offset`` pops two stack entries. The first must be an integral
> + type value that represents a byte displacement B. The second must be a
> + location description L.
> +
> + It adds the value of B scaled by 8 (the byte size) to the bit offset of each
> + single location description SL of L, and pushes the updated L.
> +
> + If the updated bit offset of any SL is less than 0 or greater than or equal
> + to the size of the location storage specified by SL, then the DWARF
> + expression is ill-formed.
> +
> +2. ``DW_OP_LLVM_offset_constu`` *New*
> +
> + ``DW_OP_LLVM_offset_constu`` has a single unsigned LEB128 integer operand
> + that represents a byte displacement B.
> +
> + The operation is equivalent to performing ``DW_OP_constu B;
> + DW_OP_LLVM_offset``.
> +
> + *This operation is supplied specifically to be able to encode more field
> + displacements in two bytes than can be done with* ``DW_OP_lit*;
> + DW_OP_LLVM_offset``\ *.*
> +
> +3. ``DW_OP_LLVM_bit_offset`` *New*
> +
> + ``DW_OP_LLVM_bit_offset`` pops two stack entries. The first must be an
> + integral type value that represents a bit displacement B. The second must be
> + a location description L.
> +
> + It adds the value of B to the bit offset of each single location description
> + SL of L, and pushes the updated L.
> +
> + If the updated bit offset of any SL is less than 0 or greater than or equal
> + to the size of the location storage specified by SL, then the DWARF
> + expression is ill-formed.
> +
> +4. ``DW_OP_push_object_address``
> +
> + ``DW_OP_push_object_address`` pushes the location description L of the
> + object currently being evaluated as part of evaluation of a user presented
> + expression.
> +
> + This object may correspond to an independent variable described by its own
> + debugging information entry or it may be a component of an array, structure,
> + or class whose address has been dynamically determined by an earlier step
> + during user expression evaluation.
> +
> + *This operation provides explicit functionality (especially for arrays
> + involving descriptions) that is analogous to the implicit push of the base
> + location description of a structure prior to evaluation of a*
> + ``DW_AT_data_member_location`` *to access a data member of a structure.*
> +
> +5. ``DW_OP_LLVM_call_frame_entry_reg`` *New*
> +
> + ``DW_OP_LLVM_call_frame_entry_reg`` has a single unsigned LEB128 integer
> + operand that represents a target architecture register number R.
> +
> + It pushes a location description L that holds the value of register R on
> + entry to the current subprogram as defined by the Call Frame Information
> + (see :ref:`amdgpu-dwarf-call-frame-information`).
> +
> + *If there is no Call Frame Information defined, then the default rules for
> + the target architecture are used. If the register rule is* undefined\ *, then
> + the undefined location description is pushed. If the register rule is* same
> + value\ *, then a register location description for R is pushed.*
> +
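The range rule shared by DW_OP_LLVM_offset and DW_OP_LLVM_bit_offset may be easier to review as code. Below is a minimal Python sketch, not taken from any implementation. `SingleLocation`, `IllFormed`, and the field names are hypothetical, and a location description is modelled simply as a list of single location descriptions.

```python
from dataclasses import dataclass, replace


class IllFormed(Exception):
    """Raised when the DWARF expression would be ill-formed."""


@dataclass(frozen=True)
class SingleLocation:
    # Hypothetical model of a single location description SL.
    storage_bits: int  # size of the location storage, in bits
    bit_offset: int    # bit offset into that storage


def bit_offset_op(location, b_bits):
    """DW_OP_LLVM_bit_offset: add B bits to the bit offset of every SL of L."""
    updated = []
    for sl in location:
        new_offset = sl.bit_offset + b_bits
        # Ill-formed if the offset is < 0 or >= the storage size.
        if new_offset < 0 or new_offset >= sl.storage_bits:
            raise IllFormed(f"bit offset {new_offset} is outside the storage")
        updated.append(replace(sl, bit_offset=new_offset))
    return updated


def offset_op(location, b_bytes):
    """DW_OP_LLVM_offset: same as above, with B scaled by 8 (the byte size)."""
    return bit_offset_op(location, b_bytes * 8)
```

With a single 64-bit storage, a byte displacement of 2 moves the bit offset to 16, while a displacement of 8 lands exactly on the storage size and is rejected.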
> +Undefined Location Description Operations
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +*The undefined location storage represents a piece or all of an object that is
> +present in the source but not in the object code (perhaps due to optimization).
> +Neither reading nor writing to the undefined location storage is meaningful.*
> +
> +An undefined location description specifies the undefined location storage.
> +There is no concept of the size of the undefined location storage, nor of a bit
> +offset for an undefined location description. The ``DW_OP_LLVM_*offset``
> +operations leave an undefined location description unchanged. The
> +``DW_OP_*piece`` operations can explicitly or implicitly specify an undefined
> +location description, allowing any size and offset to be specified, and result
> +in a part with all undefined bits.
> +
> +1. ``DW_OP_LLVM_undefined`` *New*
> +
> + ``DW_OP_LLVM_undefined`` pushes a location description L that comprises one
> + undefined location description SL.
> +
> +.. _amdgpu-dwarf-memory-location-description-operations:
> +
> +Memory Location Description Operations
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Each of the target architecture specific address spaces has a corresponding
> +memory location storage that denotes the linear addressable memory of that
> +address space. The size of each memory location storage corresponds to the range
> +of the addresses in the corresponding address space.
> +
> +*It is target architecture defined how address space location storage maps to
> +target architecture physical memory. For example, they may be independent
> +memory, or more than one location storage may alias the same physical memory
> +possibly at different offsets and with different interleaving. The mapping may
> +also be dictated by the source language address classes.*
> +
> +A memory location description specifies a memory location storage. The bit
> +offset corresponds to a bit position within a byte of the memory. Bits accessed
> +using a memory location description access the corresponding target
> +architecture memory starting at the bit position within the byte specified by
> +the bit offset.
> +
> +A memory location description that has a bit offset that is a multiple of 8 (the
> +byte size) is defined to be a byte address memory location description. It has a
> +memory byte address A that is equal to the bit offset divided by 8.
> +
> +A memory location description that does not have a bit offset that is a multiple
> +of 8 (the byte size) is defined to be a bit field memory location description.
> +It has a bit position B equal to the bit offset modulo 8, and a memory byte
> +address A equal to the bit offset minus B that is then divided by 8.
> +
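To check my reading of the two definitions above, here is the byte address and bit position arithmetic as a small Python sketch. The function names are mine, not from the proposal.

```python
def is_byte_address(bit_offset):
    """True for a byte address memory location description."""
    return bit_offset % 8 == 0


def split_bit_offset(bit_offset):
    """Return (A, B): memory byte address A and bit position B within that byte."""
    b = bit_offset % 8            # bit position B is the bit offset modulo 8
    a = (bit_offset - b) // 8     # byte address A is (bit offset - B) divided by 8
    return a, b
```

For a byte address memory location description B is always 0, so the same function covers both cases.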
> +The address space AS of a memory location description is defined to be the
> +address space that corresponds to the memory location storage associated with
> +the memory location description.
> +
> +A location description that is comprised of one byte address memory location
> +description SL is defined to be a memory byte address location description. It
> +has a byte address equal to A and an address space equal to AS of the
> +corresponding SL.
> +
> +``DW_ASPACE_none`` is defined as the target architecture default address space.
> +
> +If a stack entry is required to be a location description, but it is a value V
> +with the generic type, then it is implicitly converted to a location description
> +L with one memory location description SL. SL specifies the memory location
> +storage that corresponds to the target architecture default address space with a
> +bit offset equal to V scaled by 8 (the byte size).
> +
> +.. note::
> +
> + If it is wanted to allow any integral type value to be implicitly converted to
> + a memory location description in the target architecture default address
> + space:
> +
> + If a stack entry is required to be a location description, but is a value V
> + with an integral type, then it is implicitly converted to a location
> + description L with one memory location description SL. If the type size of
> + V is less than the generic type size, then the value V is zero extended to
> + the size of the generic type. The least significant generic type size bits
> + are treated as a twos-complement unsigned value to be used as an address A.
> + SL specifies memory location storage corresponding to the target
> + architecture default address space with a bit offset equal to A scaled by 8
> + (the byte size).
> +
> + The implicit conversion could also be defined as target architecture specific.
> + For example, GDB checks if V is an integral type. If it is not it gives an
> + error. Otherwise, GDB zero-extends V to 64 bits. If the GDB target defines a
> + hook function, then it is called. The target specific hook function can modify
> + the 64-bit value, possibly sign extending based on the original value type.
> + Finally, GDB treats the 64-bit value V as a memory location address.
> +
> +If a stack entry is required to be a location description, but it is an implicit
> +pointer value IPV with the target architecture default address space, then it is
> +implicitly converted to a location description with one single location
> +description specified by IPV. See
> +:ref:`amdgpu-dwarf-implicit-location-descriptions`.
> +
> +.. note::
> +
> + Is this rule required for DWARF Version 5 backwards compatibility? If not, it
> + can be eliminated, and the producer can use
> + ``DW_OP_LLVM_form_aspace_address``.
> +
> +If a stack entry is required to be a value, but it is a location description L
> +with one memory location description SL in the target architecture default
> +address space with a bit offset B that is a multiple of 8, then it is implicitly
> +converted to a value equal to B divided by 8 (the byte size) with the generic
> +type.
> +
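The two implicit conversions between a generic-type value and a default address space memory location are inverses of each other, which a short Python sketch makes explicit. `DEFAULT_ASPACE` is a hypothetical numeric stand-in for the target architecture default address space, a location is modelled as an `(address space, bit offset)` pair, and raising `ValueError` when no conversion applies is my own choice rather than something the text specifies.

```python
DEFAULT_ASPACE = 0  # hypothetical stand-in for the default address space


def value_to_location(v):
    """Generic-type value V -> memory location with bit offset V scaled by 8."""
    return (DEFAULT_ASPACE, v * 8)


def location_to_value(aspace, bit_offset):
    """Default address space byte address location -> generic-type value B / 8."""
    if aspace != DEFAULT_ASPACE or bit_offset % 8 != 0:
        # Assumption: the implicit conversion simply does not apply here.
        raise ValueError("no implicit conversion to a value")
    return bit_offset // 8
```

A round trip such as `location_to_value(*value_to_location(0x2000))` gives back the original value.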
> +1. ``DW_OP_addr``
> +
> + ``DW_OP_addr`` has a single byte constant value operand, which has the size
> + of the generic type, that represents an address A.
> +
> + It pushes a location description L with one memory location description SL
> + on the stack. SL specifies the memory location storage corresponding to the
> + target architecture default address space with a bit offset equal to A
> + scaled by 8 (the byte size).
> +
> + *If the DWARF is part of a code object, then A may need to be relocated. For
> + example, in the ELF code object format, A must be adjusted by the difference
> + between the ELF segment virtual address and the virtual address at which the
> + segment is loaded.*
> +
> +2. ``DW_OP_addrx``
> +
> + ``DW_OP_addrx`` has a single unsigned LEB128 integer operand that represents
> + a zero-based index into the ``.debug_addr`` section relative to the value of
> + the ``DW_AT_addr_base`` attribute of the associated compilation unit. The
> + address value A in the ``.debug_addr`` section has the size of the generic
> + type.
> +
> + It pushes a location description L with one memory location description SL
> + on the stack. SL specifies the memory location storage corresponding to the
> + target architecture default address space with a bit offset equal to A
> + scaled by 8 (the byte size).
> +
> + *If the DWARF is part of a code object, then A may need to be relocated. For
> + example, in the ELF code object format, A must be adjusted by the difference
> + between the ELF segment virtual address and the virtual address at which the
> + segment is loaded.*
> +
> +3. ``DW_OP_LLVM_form_aspace_address`` *New*
> +
> + ``DW_OP_LLVM_form_aspace_address`` pops the top two stack entries. The first
> + must be an integral type value that represents a target architecture
> + specific address space identifier AS. The second must be an integral type
> + value that represents an address A.
> +
> + The address size S is defined as the address bit size of the target
> + architecture specific address space that corresponds to AS.
> +
> + A is adjusted to S bits by zero extending if necessary, and then treating the
> + least significant S bits as a twos-complement unsigned value A'.
> +
> + It pushes a location description L with one memory location description SL
> + on the stack. SL specifies the memory location storage that corresponds to
> + AS with a bit offset equal to A' scaled by 8 (the byte size).
> +
> + The DWARF expression is ill-formed if AS is not one of the values defined by
> + the target architecture specific ``DW_ASPACE_*`` values.
> +
> + See :ref:`amdgpu-dwarf-implicit-location-descriptions` for special rules
> + concerning implicit pointer values produced by dereferencing implicit
> + location descriptions created by the ``DW_OP_implicit_pointer`` and
> + ``DW_OP_LLVM_aspace_implicit_pointer`` operations.
> +
> +4. ``DW_OP_form_tls_address``
> +
> + ``DW_OP_form_tls_address`` pops one stack entry that must be an integral
> + type value and treats it as a thread-local storage address T.
> +
> + It pushes a location description L with one memory location description SL
> + on the stack. SL is the target architecture specific memory location
> + description that corresponds to the thread-local storage address T.
> +
> + The meaning of the thread-local storage address T is defined by the run-time
> + environment. If the run-time environment supports multiple thread-local
> + storage blocks for a single thread, then the block corresponding to the
> + executable or shared library containing this DWARF expression is used.
> +
> + *Some implementations of C, C++, Fortran, and other languages support a
> + thread-local storage class. Variables with this storage class have distinct
> + values and addresses in distinct threads, much as automatic variables have
> + distinct values and addresses in each subprogram invocation. Typically,
> + there is a single block of storage containing all thread-local variables
> + declared in the main executable, and a separate block for the variables
> + declared in each shared library. Each thread-local variable can then be
> + accessed in its block using an identifier. This identifier is typically a
> + byte offset into the block and pushed onto the DWARF stack by one of the*
> + ``DW_OP_const*`` *operations prior to the* ``DW_OP_form_tls_address``
> + *operation. Computing the address of the appropriate block can be complex
> + (in some cases, the compiler emits a function call to do it), and difficult
> + to describe using ordinary DWARF location descriptions. Instead of forcing
> + complex thread-local storage calculations into the DWARF expressions, the*
> + ``DW_OP_form_tls_address`` *allows the consumer to perform the computation
> + based on the target architecture specific run-time environment.*
> +
> +5. ``DW_OP_call_frame_cfa``
> +
> + ``DW_OP_call_frame_cfa`` pushes the location description L of the Canonical
> + Frame Address (CFA) of the current subprogram, obtained from the Call Frame
> + Information on the stack. See :ref:`amdgpu-dwarf-call-frame-information`.
> +
> + *Although the value of the* ``DW_AT_frame_base`` *attribute of the debugger
> + information entry corresponding to the current subprogram can be computed
> + using a location list expression, in some cases this would require an
> + extensive location list because the values of the registers used in
> + computing the CFA change during a subprogram execution. If the Call Frame
> + Information is present, then it already encodes such changes, and it is
> + space efficient to reference that using the* ``DW_OP_call_frame_cfa``
> + *operation.*
> +
> +6. ``DW_OP_fbreg``
> +
> + ``DW_OP_fbreg`` has a single signed LEB128 integer operand that represents a
> + byte displacement B.
> +
> + The location description L for the *frame base* of the current subprogram is
> + obtained from the ``DW_AT_frame_base`` attribute of the debugger information
> + entry corresponding to the current subprogram as described in
> + :ref:`amdgpu-dwarf-debugging-information-entry-attributes`.
> +
> + The location description L is updated as if the ``DW_OP_LLVM_offset_constu
> + B`` operation was applied. The updated L is pushed on the stack.
> +
> +7. ``DW_OP_breg0``, ``DW_OP_breg1``, ..., ``DW_OP_breg31``
> +
> + The ``DW_OP_breg<N>`` operations encode the numbers of up to 32 registers,
> + numbered from 0 through 31, inclusive. The register number R corresponds to
> + the N in the operation name.
> +
> + They have a single signed LEB128 integer operand that represents a byte
> + displacement B.
> +
> + The address space identifier AS is defined as the one corresponding to the
> + target architecture specific default address space.
> +
> + The address size S is defined as the address bit size of the target
> + architecture specific address space corresponding to AS.
> +
> + The contents of the register specified by R are retrieved as a
> + twos-complement unsigned value and zero extended to S bits. B is added and
> + the least significant S bits are treated as a twos-complement unsigned value
> + to be used as an address A.
> +
> + They push a location description L comprising one memory location
> + description SL on the stack. SL specifies the memory location storage that
> + corresponds to AS with a bit offset equal to A scaled by 8 (the byte size).
> +
> +8. ``DW_OP_bregx``
> +
> + ``DW_OP_bregx`` has two operands. The first is an unsigned LEB128 integer
> + that represents a register number R. The second is a signed LEB128
> + integer that represents a byte displacement B.
> +
> + The action is the same as for ``DW_OP_breg<N>`` except that R is used as the
> + register number and B is used as the byte displacement.
> +
> +9. ``DW_OP_LLVM_aspace_bregx`` *New*
> +
> + ``DW_OP_LLVM_aspace_bregx`` has two operands. The first is an unsigned
> + LEB128 integer that represents a register number R. The second is a signed
> + LEB128 integer that represents a byte displacement B. It pops one stack
> + entry that is required to be an integral type value that represents a target
> + architecture specific address space identifier AS.
> +
> + The action is the same as for ``DW_OP_breg<N>`` except that R is used as the
> + register number, B is used as the byte displacement, and AS is used as the
> + address space identifier.
> +
> + The DWARF expression is ill-formed if AS is not one of the values defined by
> + the target architecture specific ``DW_ASPACE_*`` values.
> +
> + .. note::
> +
> + Could also consider adding ``DW_OP_aspace_breg0, DW_OP_aspace_breg1, ...,
> + DW_OP_aspace_breg31`` which would save encoding size.
> +
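The address arithmetic for DW_OP_LLVM_form_aspace_address and the DW_OP_breg* family is almost identical, so here is how I read it, as a hedged Python sketch. The `ASPACE_ADDRESS_BITS` table, its identifiers, and its sizes are invented for illustration and do not describe any real target. Register contents and addresses are modelled as non-negative integers, and locations as `(address space, bit offset)` pairs.

```python
# Hypothetical address spaces: identifier -> address size S in bits.
ASPACE_ADDRESS_BITS = {0: 64, 3: 32}


def truncate_to_address(value, s_bits):
    """Keep the least significant S bits as an unsigned value."""
    return value & ((1 << s_bits) - 1)


def form_aspace_address(aspace, address):
    """DW_OP_LLVM_form_aspace_address: location in AS at bit offset A' * 8."""
    if aspace not in ASPACE_ADDRESS_BITS:
        raise ValueError("ill-formed: unknown address space")
    a_prime = truncate_to_address(address, ASPACE_ADDRESS_BITS[aspace])
    return (aspace, a_prime * 8)


def breg(reg_value, displacement, aspace=0):
    """DW_OP_breg<N> / DW_OP_bregx / DW_OP_LLVM_aspace_bregx address formation.

    reg_value is the unsigned register contents; displacement is the signed
    byte displacement B. Only the aspace_bregx form passes a non-default AS.
    """
    if aspace not in ASPACE_ADDRESS_BITS:
        raise ValueError("ill-formed: unknown address space")
    a = truncate_to_address(reg_value + displacement, ASPACE_ADDRESS_BITS[aspace])
    return (aspace, a * 8)
```

A negative displacement that underflows wraps modulo 2^S because of the mask; that is my interpretation of "the least significant S bits are treated as a twos-complement unsigned value".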
> +.. _amdgpu-dwarf-register-location-descriptions:
> +
> +Register Location Description Operations
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +There is a register location storage that corresponds to each of the target
> +architecture registers. The size of each register location storage corresponds
> +to the size of the corresponding target architecture register.
> +
> +A register location description specifies a register location storage. The bit
> +offset corresponds to a bit position within the register. Bits accessed using a
> +register location description access the corresponding target architecture
> +register starting at the specified bit offset.
> +
> +1. ``DW_OP_reg0``, ``DW_OP_reg1``, ..., ``DW_OP_reg31``
> +
> + ``DW_OP_reg<N>`` operations encode the numbers of up to 32 registers,
> + numbered from 0 through 31, inclusive. The target architecture register
> + number R corresponds to the N in the operation name.
> +
> + They push a location description L that specifies one register location
> + description SL on the stack. SL specifies the register location storage that
> + corresponds to R with a bit offset of 0.
> +
> +2. ``DW_OP_regx``
> +
> + ``DW_OP_regx`` has a single unsigned LEB128 integer operand that represents
> + a target architecture register number R.
> +
> + It pushes a location description L that specifies one register location
> + description SL on the stack. SL specifies the register location storage that
> + corresponds to R with a bit offset of 0.
> +
> +*These operations obtain a register location. To fetch the contents of a
> +register, it is necessary to use* ``DW_OP_regval_type``\ *, use one of the*
> +``DW_OP_breg*`` *register-based addressing operations, or use* ``DW_OP_deref*``
> +*on a register location description.*
> +
> +.. _amdgpu-dwarf-implicit-location-descriptions:
> +
> +Implicit Location Description Operations
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Implicit location storage represents a piece or all of an object which has no
> +actual location in the program but whose contents are nonetheless known, either
> +as a constant or can be computed from other locations and values in the program.
> +
> +An implicit location description specifies an implicit location storage. The bit
> +offset corresponds to a bit position within the implicit location storage. Bits
> +accessed using an implicit location description access the corresponding
> +implicit storage value starting at the bit offset.
> +
> +1. ``DW_OP_implicit_value``
> +
> + ``DW_OP_implicit_value`` has two operands. The first is an unsigned LEB128
> + integer that represents a byte size S. The second is a block of bytes with a
> + length equal to S treated as a literal value V.
> +
> + An implicit location storage LS is created with the literal value V and a
> + size of S.
> +
> + It pushes a location description L with one implicit location description SL
> + on the stack. SL specifies LS with a bit offset of 0.
> +
> +2. ``DW_OP_stack_value``
> +
> + ``DW_OP_stack_value`` pops one stack entry that must be a value V.
> +
> + An implicit location storage LS is created with the literal value V and a
> + size equal to V's base type size.
> +
> + It pushes a location description L with one implicit location description SL
> + on the stack. SL specifies LS with a bit offset of 0.
> +
> + *The* ``DW_OP_stack_value`` *operation specifies that the object does not
> + exist in memory, but its value is nonetheless known. In this form, the
> + location description specifies the actual value of the object, rather than
> + specifying the memory or register storage that holds the value.*
> +
> + See :ref:`amdgpu-dwarf-implicit-location-descriptions` for special rules
> + concerning implicit pointer values produced by dereferencing implicit
> + location descriptions created by the ``DW_OP_implicit_pointer`` and
> + ``DW_OP_LLVM_aspace_implicit_pointer`` operations.
> +
> + .. note::
> +
> + Since location descriptions are allowed on the stack, the
> + ``DW_OP_stack_value`` operation no longer terminates the DWARF operation
> + expression execution as in DWARF Version 5.
> +
> +3. ``DW_OP_implicit_pointer``
> +
> + *An optimizing compiler may eliminate a pointer, while still retaining the
> + value that the pointer addressed.* ``DW_OP_implicit_pointer`` *allows a
> + producer to describe this value.*
> +
> + ``DW_OP_implicit_pointer`` *specifies an object is a pointer to the target
> + architecture default address space that cannot be represented as a real
> + pointer, even though the value it would point to can be described. In this
> + form, the location description specifies a debugging information entry that
> + represents the actual location description of the object to which the
> + pointer would point. Thus, a consumer of the debug information would be able
> + to access the dereferenced pointer, even when it cannot access the pointer
> + itself.*
> +
> + ``DW_OP_implicit_pointer`` has two operands. The first is a 4-byte unsigned
> + value in the 32-bit DWARF format, or an 8-byte unsigned value in the 64-bit
> + DWARF format, that represents a debugging information entry reference R. The
> + second is a signed LEB128 integer that represents a byte displacement B.
> +
> + R is used as the offset of a debugging information entry D in a
> + ``.debug_info`` section, which may be contained in an executable or shared
> + object file other than that containing the operation. For references from one
> + executable or shared object file to another, the relocation must be
> + performed by the consumer.
> +
> + *The first operand interpretation is exactly like that for*
> + ``DW_FORM_ref_addr``\ *.*
> +
> + The address space identifier AS is defined as the one corresponding to the
> + target architecture specific default address space.
> +
> + The address size S is defined as the address bit size of the target
> + architecture specific address space corresponding to AS.
> +
> + An implicit location storage LS is created with the debugging information
> + entry D, address space AS, and size of S.
> +
> + It pushes a location description L that comprises one implicit location
> + description SL on the stack. SL specifies LS with a bit offset of 0.
> +
> + If a ``DW_OP_deref*`` operation pops a location description L', and
> + retrieves S bits where both:
> +
> + 1. All retrieved bits come from an implicit location description that
> + refers to an implicit location storage that is the same as LS.
> +
> + *Note that all bits do not have to come from the same implicit location
> + description, as L' may involve composite location descriptors.*
> +
> + 2. The bits come from consecutive ascending offsets within their respective
> + implicit location storage.
> +
> + *These rules are equivalent to retrieving the complete contents of LS.*
> +
> + Then the value V pushed by the ``DW_OP_deref*`` operation is an implicit
> + pointer value IPV with a target architecture specific address space of AS, a
> + debugging information entry of D, and a base type of T. If AS is the target
> + architecture default address space, then T is the generic type. Otherwise, T
> + is a target architecture specific integral type with a bit size equal to S.
> +
> + Otherwise, if a ``DW_OP_deref*`` operation is applied to a location
> + description such that some retrieved bits come from an implicit location
> + storage that is the same as LS, then the DWARF expression is ill-formed.
> +
> + If IPV is either implicitly converted to a location description (only done
> + if AS is the target architecture default address space) or used by
> + ``DW_OP_LLVM_form_aspace_address`` (only done if the address space specified
> + is AS), then the resulting location description RL is:
> +
> + * If D has a ``DW_AT_location`` attribute, the DWARF expression E from the
> + ``DW_AT_location`` attribute is evaluated as a location description. The
> + current subprogram and current program location of the evaluation context
> + that is accessing IPV is used for the evaluation context of E, together
> + with an empty initial stack. RL is the expression result.
> +
> + * If D has a ``DW_AT_const_value`` attribute, then an implicit location
> + storage RLS is created from the ``DW_AT_const_value`` attribute's value
> + with a size matching the size of the ``DW_AT_const_value`` attribute's
> + value. RL comprises one implicit location description SRL. SRL specifies
> + RLS with a bit offset of 0.
> +
> + .. note::
> +
> + If using ``DW_AT_const_value`` for variables and formal parameters is
> + deprecated and instead ``DW_AT_location`` is used with an implicit
> + location description, then this rule would not be required.
> +
> + * Otherwise the DWARF expression is ill-formed.
> +
> + The bit offset of RL is updated as if the ``DW_OP_LLVM_offset_constu B``
> + operation was applied.
> +
> + If a ``DW_OP_stack_value`` operation pops a value that is the same as IPV,
> + then it pushes a location description that is the same as L.
> +
> + The DWARF expression is ill-formed if it accesses LS or IPV in any other
> + manner.
> +
> + *The restrictions on how an implicit pointer location description created
> + by* ``DW_OP_implicit_pointer`` *and* ``DW_OP_LLVM_aspace_implicit_pointer``
> + *can be used are to simplify the DWARF consumer. Similarly, for an implicit
> + pointer value created by* ``DW_OP_deref*`` *and* ``DW_OP_stack_value``\ *.*
> +
> +4. ``DW_OP_LLVM_aspace_implicit_pointer`` *New*
> +
> + ``DW_OP_LLVM_aspace_implicit_pointer`` has two operands that are the same as
> + for ``DW_OP_implicit_pointer``.
> +
> + It pops one stack entry that must be an integral type value that represents
> + a target architecture specific address space identifier AS.
> +
> + The location description L that is pushed on the stack is the same as for
> + ``DW_OP_implicit_pointer`` except that the address space identifier used is
> + AS.
> +
> + The DWARF expression is ill-formed if AS is not one of the values defined by
> + the target architecture specific ``DW_ASPACE_*`` values.
> +
> +*Typically a* ``DW_OP_implicit_pointer`` *or*
> +``DW_OP_LLVM_aspace_implicit_pointer`` *operation is used in a DWARF expression
> +E*\ :sub:`1` *of a* ``DW_TAG_variable`` *or* ``DW_TAG_formal_parameter``
> +*debugging information entry D*\ :sub:`1`\ *'s* ``DW_AT_location`` *attribute.
> +The debugging information entry referenced by the* ``DW_OP_implicit_pointer``
> +*or* ``DW_OP_LLVM_aspace_implicit_pointer`` *operations is typically itself a*
> +``DW_TAG_variable`` *or* ``DW_TAG_formal_parameter`` *debugging information
> +entry D*\ :sub:`2` *whose* ``DW_AT_location`` *attribute gives a second DWARF
> +expression E*\ :sub:`2`\ *.*
> +
> +*D*\ :sub:`1` *and E*\ :sub:`1` *are describing the location of a pointer type
> +object. D*\ :sub:`2` *and E*\ :sub:`2` *are describing the location of the
> +object pointed to by that pointer object.*
> +
> +*However, D*\ :sub:`2` *may be any debugging information entry that contains a*
> +``DW_AT_location`` *or* ``DW_AT_const_value`` *attribute (for example,*
> +``DW_TAG_dwarf_procedure``\ *). By using E*\ :sub:`2`\ *, a consumer can
> +reconstruct the value of the object when asked to dereference the pointer
> +described by E*\ :sub:`1` *which contains the* ``DW_OP_implicit_pointer`` *or*
> +``DW_OP_LLVM_aspace_implicit_pointer`` *operation.*
> +
> +Composite Location Description Operations
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +A composite location storage represents an object or value which may be
> +contained in part of another location storage or contained in parts of more
> +than one location storage.
> +
> +Each part has a part location description L and a part bit size S. L can have
> +one or more single location descriptions SL. If there is more than one SL, then
> +the part is located in more than one place. The bits of each
> +place of the part comprise S contiguous bits from the location storage LS
> +specified by SL starting at the bit offset specified by SL. All the bits must
> +be within the size of LS or the DWARF expression is ill-formed.
> +
> +A composite location storage can have zero or more parts. The parts are
> +contiguous such that the zero-based location storage bit index will range over
> +each part with no gaps between them. Therefore, the size of a composite location
> +storage is the sum of the size of its parts. The DWARF expression is ill-formed
> +if the size of the composite location storage is larger than the size of the
> +memory location storage corresponding to the largest target architecture
> +specific address space.
> +
> +A composite location description specifies a composite location storage. The bit
> +offset corresponds to a bit position within the composite location storage.
> +
> +There are operations that create a composite location storage.
> +
> +There are other operations that allow a composite location storage to be
> +incrementally created. Each part is created by a separate operation. There may
> +be one or more operations to create the final composite location storage. A
> +series of such operations describes the parts of the composite location storage
> +in the order in which the associated part operations are executed.
> +
> +To support incremental creation, a composite location storage can be in an
> +incomplete state. When an incremental operation operates on an incomplete
> +composite location storage, it adds a new part, otherwise it creates a new
> +composite location storage. The ``DW_OP_LLVM_piece_end`` operation explicitly
> +makes an incomplete composite location storage complete.
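
To check my reading of the incomplete/complete state described above, here is a minimal Python sketch of how an incremental operation such as ``DW_OP_piece`` and the new ``DW_OP_LLVM_piece_end`` might behave. The `Composite` class, the function names, and the use of plain strings as part location descriptions are my own assumptions for illustration. None of this is taken from LLVM or from a DWARF consumer.

```python
from dataclasses import dataclass, field


@dataclass
class Composite:
    """Hypothetical model of a composite location storage."""
    parts: list = field(default_factory=list)  # (part_location, bit_size) pairs
    complete: bool = False

    @property
    def bit_size(self):
        # The size of a composite is the sum of the sizes of its parts.
        return sum(size for _, size in self.parts)


def _is_incomplete(entry):
    return isinstance(entry, Composite) and not entry.complete


def op_piece(stack, byte_size):
    """Sketch of DW_OP_piece; the byte size is scaled by 8 to a bit size."""
    bits = byte_size * 8
    if not stack:
        # Empty stack: start a new composite with an undefined part.
        stack.append(Composite([("undefined", bits)]))
    elif _is_incomplete(stack[-1]):
        # Incomplete composite on top: append an undefined part.
        stack[-1].parts.append(("undefined", bits))
    else:
        # Otherwise the top entry is the part location description.
        part = stack.pop()
        if stack and _is_incomplete(stack[-1]):
            stack[-1].parts.append((part, bits))
        else:
            stack.append(Composite([(part, bits)]))


def op_piece_end(stack):
    """Sketch of DW_OP_LLVM_piece_end."""
    if not stack or not _is_incomplete(stack[-1]):
        raise ValueError("ill-formed DWARF expression")
    stack[-1].complete = True


# A 64-bit value split across two 32-bit registers (names are placeholders).
stack = ["reg0"]
op_piece(stack, 4)
stack.append("reg1")
op_piece(stack, 4)
op_piece_end(stack)
```

In this model a later ``DW_OP_piece`` that finds a complete composite on top treats it as an ordinary part location description and starts a new composite, which is how I understand the "otherwise it creates a new composite location storage" wording.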
> +
> +A composite location description that specifies a composite location storage
> +that is incomplete is termed an incomplete composite location description. A
> +composite location description that specifies a composite location storage that
> +is complete is termed a complete composite location description.
> +
> +If the top stack entry is a location description that has one incomplete
> +composite location description SL after the execution of an operation expression
> +has completed, SL is converted to a complete composite location description.
> +
> +*Note that this conversion does not happen after the completion of an operation
> +expression that is evaluated on the same stack by the* ``DW_OP_call*``
> +*operations. Such executions are not a separate evaluation of an operation
> +expression, but rather the continued evaluation of the same operation expression
> +that contains the* ``DW_OP_call*`` *operation.*
> +
> +If a stack entry is required to be a location description L, but L has an
> +incomplete composite location description, then the DWARF expression is
> +ill-formed. The exception is for the operations involved in incrementally
> +creating a composite location description as described below.
> +
> +*Note that a DWARF operation expression may arbitrarily compose composite
> +location descriptions from any other location description, including those that
> +have multiple single location descriptions, and those that have composite
> +location descriptions.*
> +
> +*The incremental composite location description operations are defined to be
> +compatible with the definitions in DWARF Version 5.*
> +
> +1. ``DW_OP_piece``
> +
> + ``DW_OP_piece`` has a single unsigned LEB128 integer that represents a byte
> + size S.
> +
> + The action is based on the context:
> +
> + * If the stack is empty, then a location description L comprised of one
> + incomplete composite location description SL is pushed on the stack.
> +
> + An incomplete composite location storage LS is created with a single part
> + P. P specifies a location description PL and has a bit size of S scaled by
> + 8 (the byte size). PL is comprised of one undefined location description
> + PSL.
> +
> + SL specifies LS with a bit offset of 0.
> +
> + * Otherwise, if the top stack entry is a location description L comprised of
> + one incomplete composite location description SL, then the incomplete
> + composite location storage LS that SL specifies is updated to append a new
> + part P. P specifies a location description PL and has a bit size of S
> + scaled by 8 (the byte size). PL is comprised of one undefined location
> + description PSL. L is left on the stack.
> +
> + * Otherwise, if the top stack entry is a location description or can be
> + converted to one, then it is popped and treated as a part location
> + description PL. Then:
> +
> + * If the top stack entry (after popping PL) is a location description L
> + comprised of one incomplete composite location description SL, then the
> + incomplete composite location storage LS that SL specifies is updated to
> + append a new part P. P specifies the location description PL and has a
> + bit size of S scaled by 8 (the byte size). L is left on the stack.
> +
> + * Otherwise, a location description L comprised of one incomplete
> + composite location description SL is pushed on the stack.
> +
> + An incomplete composite location storage LS is created with a single
> + part P. P specifies the location description PL and has a bit size of S
> + scaled by 8 (the byte size).
> +
> + SL specifies LS with a bit offset of 0.
> +
> + * Otherwise, the DWARF expression is ill-formed.
> +
> + *Many compilers store a single variable in sets of registers or store a
> + variable partially in memory and partially in registers.* ``DW_OP_piece``
> + *provides a way of describing where a part of a variable is located.*
> +
> + *If a non-0 byte displacement is required, the* ``DW_OP_LLVM_offset``
> + *operation can be used to update the location description before using it as
> + the part location description of a* ``DW_OP_piece`` *operation.*
> +
> + *The evaluation rules for the* ``DW_OP_piece`` *operation allow it to be
> + compatible with the DWARF Version 5 definition.*
> +
> + .. note::
> +
> + Since this proposal allows location descriptions to be entries on the
> + stack, a simpler operation to create composite location descriptions could
> + be defined. For
> + example, just one operation that specifies how many parts, and pops pairs
> + of stack entries for the part size and location description. Not only
> + would this be a simpler operation and avoid the complexities of incomplete
> + composite location descriptions, but it may also have a smaller encoding
> + in practice. However, the desire for compatibility with DWARF Version 5 is
> + likely a stronger consideration.
> +
> +2. ``DW_OP_bit_piece``
> +
> + ``DW_OP_bit_piece`` has two operands. The first is an unsigned LEB128
> + integer that represents the part bit size S. The second is an unsigned
> + LEB128 integer that represents a bit displacement B.
> +
> + The action is the same as for ``DW_OP_piece`` except that any part created
> + has the bit size S, and the location description PL of any created part is
> + updated as if the ``DW_OP_constu B; DW_OP_LLVM_bit_offset`` operations were
> + applied.
> +
> + ``DW_OP_bit_piece`` *is used instead of* ``DW_OP_piece`` *when the piece to
> + be assembled is not byte-sized or is not at the start of the part location
> + description.*
> +
> + *If a computed bit displacement is required, the* ``DW_OP_LLVM_bit_offset``
> + *operation can be used to update the location description before using it as
> + the part location description of a* ``DW_OP_bit_piece`` *operation.*
> +
> + .. note::
> +
> + The bit offset operand is not needed as ``DW_OP_LLVM_bit_offset`` can be
> + used on the part's location description.
> +
> +3. ``DW_OP_LLVM_piece_end`` *New*
> +
> + If the top stack entry is not a location description L comprised of one
> + incomplete composite location description SL, then the DWARF expression is
> + ill-formed.
> +
> + Otherwise, the incomplete composite location storage LS specified by SL is
> + updated to be a complete composite location storage with the same parts.
> +
> +4. ``DW_OP_LLVM_extend`` *New*
> +
> + ``DW_OP_LLVM_extend`` has two operands. The first is an unsigned LEB128
> + integer that represents the element bit size S. The second is an unsigned
> + LEB128 integer that represents a count C.
> +
> + It pops one stack entry that must be a location description and is treated
> + as the part location description PL.
> +
> + A location description L comprised of one complete composite location
> + description SL is pushed on the stack.
> +
> + A complete composite location storage LS is created with C identical parts
> + P. Each P specifies PL and has a bit size of S.
> +
> + SL specifies LS with a bit offset of 0.
> +
> + The DWARF expression is ill-formed if the element bit size or count are 0.
> +
> +5. ``DW_OP_LLVM_select_bit_piece`` *New*
> +
> + ``DW_OP_LLVM_select_bit_piece`` has two operands. The first is an unsigned
> + LEB128 integer that represents the element bit size S. The second is an
> + unsigned LEB128 integer that represents a count C.
> +
> + It pops three stack entries. The first must be an integral type value that
> + represents a bit mask value M. The second must be a location description
> + that represents the one-location description L1. The third must be a
> + location description that represents the zero-location description L0.
> +
> + A complete composite location storage LS is created with C parts P\ :sub:`N`
> + ordered in ascending N from 0 to C-1 inclusive. Each P\ :sub:`N` specifies
> + location description PL\ :sub:`N` and has a bit size of S.
> +
> + PL\ :sub:`N` is as if the ``DW_OP_LLVM_bit_offset N*S`` operation was
> + applied to PLX\ :sub:`N`\ .
> +
> + PLX\ :sub:`N` is the same as L0 if the N\ :sup:`th` least significant bit of
> + M is a zero, otherwise it is the same as L1.
> +
> + A location description L comprised of one complete composite location
> + description SL is pushed on the stack. SL specifies LS with a bit offset of
> + 0.
> +
> + The DWARF expression is ill-formed if S or C are 0, or if the bit size of M
> + is less than C.
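
A small Python sketch of how I read the ``DW_OP_LLVM_select_bit_piece`` rules above. It assumes that bit N of the mask, counted from 0 at the least significant end, selects part N. Modeling a location description as a `(storage_name, bit_offset)` tuple, the `mask_bits` parameter, and the function name are my own assumptions, not something the proposal defines.

```python
def select_bit_piece(mask, mask_bits, one_loc, zero_loc, bit_size, count):
    """Return the parts [(location, bit_size), ...] of the composite.

    Locations are hypothetical (storage_name, bit_offset) tuples.
    """
    # Ill-formed if S or C is 0, or if the mask has fewer than C bits.
    if bit_size == 0 or count == 0 or mask_bits < count:
        raise ValueError("ill-formed DWARF expression")
    parts = []
    for n in range(count):
        # Bit N of the mask picks the one-location (1) or zero-location (0).
        name, offset = one_loc if (mask >> n) & 1 else zero_loc
        # Equivalent of applying DW_OP_LLVM_bit_offset N*S to that location.
        parts.append(((name, offset + n * bit_size), bit_size))
    return parts


# Four 32-bit lanes: lanes 0 and 2 come from "vgpr", lanes 1 and 3 from "saved".
lanes = select_bit_piece(0b0101, 64, ("vgpr", 0), ("saved", 0), 32, 4)
```

Every part keeps the bit offset of its own lane, so part N always describes bits N*S onward of whichever source location the mask selected.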
> +
> +.. _amdgpu-dwarf-location-list-expressions:
> +
> +DWARF Location List Expressions
> ++++++++++++++++++++++++++++++++
> +
> +*To meet the needs of recent computer architectures and optimization techniques,
> +debugging information must be able to describe the location of an object whose
> +location changes over the object’s lifetime, and may reside at multiple
> +locations during parts of an object's lifetime. Location list expressions are
> +used in place of operation expressions whenever the object whose location is
> +being described has these requirements.*
> +
> +A location list expression consists of a series of location list entries. Each
> +location list entry is one of the following kinds:
> +
> +*Bounded location description*
> +
> + This kind of location list entry provides an operation expression that
> + evaluates to the location description of an object that is valid over a
> + lifetime bounded by a starting and ending address. The starting address is the
> + lowest address of the address range over which the location is valid. The
> + ending address is the address of the first location past the highest address
> + of the address range.
> +
> + The location list entry matches when the current program location is within
> + the given range.
> +
> + There are several kinds of bounded location description entries which differ
> + in the way that they specify the starting and ending addresses.
> +
> +*Default location description*
> +
> + This kind of location list entry provides an operation expression that
> + evaluates to the location description of an object that is valid when no
> + bounded location description entry applies.
> +
> + The location list entry matches when the current program location is not
> + within the range of any bounded location description entry.
> +
> +*Base address*
> +
> + This kind of location list entry provides an address to be used as the base
> + address for beginning and ending address offsets given in certain kinds of
> + bounded location description entries. The applicable base address of a bounded
> + location description entry is the address specified by the closest preceding
> + base address entry in the same location list. If there is no preceding base
> + address entry, then the applicable base address defaults to the base address
> + of the compilation unit (see DWARF Version 5 section 3.1.1).
> +
> + In the case of a compilation unit where all of the machine code is contained
> + in a single contiguous section, no base address entry is needed.
> +
> +*End-of-list*
> +
> + This kind of location list entry marks the end of the location list
> + expression.
> +
> +The address ranges defined by the bounded location description entries of a
> +location list expression may overlap. When they do, they describe a situation in
> +which an object exists simultaneously in more than one place.
> +
> +If all of the address ranges in a given location list expression do not
> +collectively cover the entire range over which the object in question is
> +defined, and there is no following default location description entry, it is
> +assumed that the object is not available for the portion of the range that is
> +not covered.
> +
> +The operation expression of each matching location list entry is evaluated as a
> +location description and its result is returned as the result of the location
> +list entry. The operation expression is evaluated with the same context as the
> +location list expression, including the same current frame, current program
> +location, and initial stack.
> +
> +The result of the evaluation of a DWARF location list expression is a location
> +description that is comprised of the union of the single location descriptions
> +of the location description result of each matching location list entry. If
> +there are no matching location list entries, then the result is a location
> +description that comprises one undefined location description.
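
The matching and union rules above can be sketched roughly as follows. This is only my reading of the text: the `Entry` class, its field names, and the `relative` flag are assumptions for illustration, and strings stand in for the location descriptions that evaluating each entry's operation expression would produce.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    kind: str                # "bounded", "default", "base" or "end"
    start: int = 0           # start address, or the new base for "base"
    end: int = 0             # first address past the range (exclusive)
    locs: tuple = ()         # stand-in for the evaluated operation expression
    relative: bool = False   # True if start/end are offsets from the base


def evaluate_loclist(entries, pc, cu_base=0):
    """Return the single location descriptions that apply at pc."""
    base = cu_base           # defaults to the compilation unit base address
    bounded, default = [], []
    in_any_range = False
    for e in entries:
        if e.kind == "end":
            break
        if e.kind == "base":
            base = e.start   # applies to the bounded entries that follow
        elif e.kind == "bounded":
            lo = e.start + (base if e.relative else 0)
            hi = e.end + (base if e.relative else 0)
            if lo <= pc < hi:
                in_any_range = True
                bounded.extend(e.locs)
        elif e.kind == "default":
            default.extend(e.locs)
    # A default entry only matches when no bounded entry covers pc.
    matched = bounded if in_any_range else default
    # Union of the results, keeping first-seen order and dropping duplicates.
    result = list(dict.fromkeys(matched))
    return result or ["undefined"]


entries = [
    Entry("bounded", 0x10, 0x20, ("reg0",)),
    Entry("bounded", 0x18, 0x30, ("mem",)),
    Entry("default", locs=("stack",)),
    Entry("end"),
]
both = evaluate_loclist(entries, 0x1C)   # inside both overlapping ranges
```

Overlapping ranges give a result with more than one single location description, and a program location that nothing covers yields the undefined location description when there is no default entry.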
> +
> +A location list expression can only be used as the value of a debugger
> +information entry attribute that is encoded using class ``loclist`` or
> +``loclistsptr`` (see DWARF Version 5 section 7.5.5). The value of the attribute
> +provides an index into a separate object file section called ``.debug_loclists``
> +or ``.debug_loclists.dwo`` (for split DWARF object files) that contains the
> +location list entries.
> +
> +The ``DW_OP_call*`` and ``DW_OP_implicit_pointer`` operations can be used to
> +specify a debugger information entry attribute that has a location list
> +expression. Several debugger information entry attributes allow DWARF
> +expressions that are evaluated with an initial stack that includes a location
> +description that may originate from the evaluation of a location list
> +expression.
> +
> +*This location list representation, the* ``loclist`` *and* ``loclistsptr``
> +*class, and the related* ``DW_AT_loclists_base`` *attribute are new in DWARF
> +Version 5. Together they eliminate most or all of the code object relocations
> +previously needed for location list expressions.*
> +
> +.. note::
> +
> + The rest of this section is the same as DWARF Version 5 section 2.6.2.
> +
> +.. _amdgpu-dwarf-segment_addresses:
> +
> +Segmented Addresses
> +~~~~~~~~~~~~~~~~~~~
> +
> +.. note::
> +
> + This augments DWARF Version 5 section 2.12.
> +
> +DWARF address classes are used for source languages that have the concept of
> +memory spaces. They are used in the ``DW_AT_address_class`` attribute for
> +pointer type, reference type, subprogram, and subprogram type debugger
> +information entries.
> +
> +Each DWARF address class is conceptually a separate source language memory space
> +with its own lifetime and aliasing rules. DWARF address classes are used to
> +specify the source language memory spaces to which pointer type and reference
> +type values refer, and to specify the source language memory space in which
> +variables are allocated.
> +
> +The set of currently defined source language DWARF address classes, together
> +with source language mappings, is given in
> +:ref:`amdgpu-dwarf-address-class-table`.
> +
> +Vendor defined source language address classes may be defined using codes in the
> +range ``DW_ADDR_LLVM_lo_user`` to ``DW_ADDR_LLVM_hi_user``.
> +
> +.. table:: Address class
> + :name: amdgpu-dwarf-address-class-table
> +
> + ========================= ============ ========= ========= =========
> + Address Class Name Meaning C/C++ OpenCL CUDA/HIP
> + ========================= ============ ========= ========= =========
> + ``DW_ADDR_none`` generic *default* generic *default*
> + ``DW_ADDR_LLVM_global`` global global
> + ``DW_ADDR_LLVM_constant`` constant constant constant
> + ``DW_ADDR_LLVM_group`` thread-group local shared
> + ``DW_ADDR_LLVM_private`` thread private
> + ``DW_ADDR_LLVM_lo_user``
> + ``DW_ADDR_LLVM_hi_user``
> + ========================= ============ ========= ========= =========
> +
> +DWARF address spaces correspond to target architecture specific linear
> +addressable memory areas. They are used in DWARF expression location
> +descriptions to describe in which target architecture specific memory area data
> +resides.
> +
> +*Target architecture specific DWARF address spaces may correspond to hardware
> +supported facilities such as memory utilizing base address registers, scratchpad
> +memory, and memory with special interleaving. The size of addresses in these
> +address spaces may vary. Their access and allocation may be hardware managed
> +with each thread or group of threads having access to independent storage. For
> +these reasons they may have properties that do not allow them to be viewed as
> +part of the unified global virtual address space accessible by all threads.*
> +
> +*It is target architecture specific whether multiple DWARF address spaces are
> +supported and how source language DWARF address classes map to target
> +architecture specific DWARF address spaces. A target architecture may map
> +multiple source language DWARF address classes to the same target architecture
> +specific DWARF address space. Optimization may determine that variable lifetime
> +and access pattern allow them to be allocated in faster scratchpad memory
> +represented by a different DWARF address space.*
> +
> +Although DWARF address space identifiers are target architecture specific,
> +``DW_ASPACE_none`` is a common address space supported by all target
> +architectures.
> +
> +DWARF address space identifiers are used by:
> +
> +* The DWARF expression operations: ``DW_OP_LLVM_aspace_bregx``,
> + ``DW_OP_LLVM_form_aspace_address``, ``DW_OP_LLVM_aspace_implicit_pointer``,
> + and ``DW_OP_xderef*``.
> +
> +* The CFI instructions: ``DW_CFA_def_aspace_cfa`` and
> + ``DW_CFA_def_aspace_cfa_sf``.
> +
> +.. note::
> +
> + With the definition of DWARF address classes and DWARF address spaces in this
> + proposal, DWARF Version 5 table 2.7 needs to be updated. It seems it is an
> + example of DWARF address spaces and not DWARF address classes.
> +
> +.. note::
> +
> + With the expanded support for DWARF address spaces in this proposal, it may be
> + worth examining if DWARF segments can be eliminated and DWARF address spaces
> + used instead.
> +
> + That may involve extending DWARF address spaces to also be used to specify
> + code locations. In target architectures that use different memory areas for
> + code and data this would seem a natural use for DWARF address spaces. This
> + would allow DWARF expression location descriptions to be used to describe the
> + location of subprograms and entry points that are used in expressions
> + involving subprogram pointer type values.
> +
> + Currently, DWARF expressions assume data and code reside in the same default
> + DWARF address space, and only the address ranges in DWARF location list
> + entries and in the ``.debug_aranges`` section for accelerated access for
> + addresses allow DWARF segments to be used to distinguish them.
> +
> +.. note::
> +
> + Currently, DWARF defines address class values as being target architecture
> + specific. It is unclear how language specific memory spaces are intended to be
> + represented in DWARF using these.
> +
> + For example, OpenCL defines memory spaces (called address spaces in OpenCL)
> + for ``global``, ``local``, ``constant``, and ``private``. These are part of
> + the type system and are modifiers to pointer types. In addition, OpenCL
> + defines ``generic`` pointers that can reference either the ``global``,
> + ``local``, or ``private`` memory spaces. To support the OpenCL language the
> + debugger would want to support casting pointers between the ``generic`` and
> + other memory spaces, querying what memory space a ``generic`` pointer value is
> + currently referencing, and possibly using pointer casting to form an address
> + for a specific memory space out of an integral value.
> +
> + The method to use to dereference a pointer type or reference type value is
> + defined in DWARF expressions using ``DW_OP_xderef*`` which uses a target
> + architecture specific address space.
> +
> + DWARF defines the ``DW_AT_address_class`` attribute on pointer type and
> + reference type debugger information entries. It specifies the method to use to
> + dereference them. Why is the value of this not the same as the address space
> + value used in ``DW_OP_xderef*``? In both cases it is target architecture
> + specific and the architecture presumably will use the same set of methods to
> + dereference pointers in both cases.
> +
> + Since ``DW_AT_address_class`` uses a target architecture specific value, it
> + cannot in general capture the source language memory space type modifier
> + concept. On some architectures all source language memory space modifiers may
> + actually use the same method for dereferencing pointers.
> +
> + One possibility is for DWARF to add a ``DW_TAG_LLVM_address_class_type``
> + debugger information entry type modifier that can be applied to a pointer type
> + and reference type. The ``DW_AT_address_class`` attribute could be re-defined
> + to not be target architecture specific and instead define generalized language
> + values (as is proposed above for DWARF address classes in the table
> + :ref:`amdgpu-dwarf-address-class-table`) that will support OpenCL and other
> + languages using memory spaces. The ``DW_AT_address_class`` attribute could be
> + defined to not be applied to pointer types or reference types, but instead
> + only to the new ``DW_TAG_LLVM_address_class_type`` type modifier debugger
> + information entry.
> +
> + If a pointer type or reference type is not modified by
> + ``DW_TAG_LLVM_address_class_type`` or if ``DW_TAG_LLVM_address_class_type``
> + has no ``DW_AT_address_class`` attribute, then the pointer type or reference
> + type would be defined to use the ``DW_ADDR_none`` address class as currently.
> + Since modifiers can be chained, it would need to be defined whether multiple
> + ``DW_TAG_LLVM_address_class_type`` modifiers are legal and, if so, whether the
> + outermost one is the one that takes precedence.
> +
> + A target architecture implementation that supports multiple address spaces
> + would need to map ``DW_ADDR_none`` appropriately to support CUDA-like
> + languages that have no address classes in the type system but do support
> + variable allocation in address classes. Such variable allocation would result
> + in the variable's location description needing an address space.
> +
> + The approach proposed in :ref:`amdgpu-dwarf-address-class-table` is to define
> + the default ``DW_ADDR_none`` to be the generic address class and not the
> + global address class. This matches how CLANG and LLVM have added support for
> + CUDA-like languages on top of existing C++ language support. This allows all
> + addresses to be generic by default which matches CUDA-like languages.
> +
> + An alternative approach is to define ``DW_ADDR_none`` as being the global
> + address class and then change ``DW_ADDR_LLVM_global`` to
> + ``DW_ADDR_LLVM_generic``. This would match the reality that languages that do
> + not support multiple memory spaces only have one default global memory space.
> + Generally, in these languages if they expose that the target architecture
> + supports multiple address spaces, the default one is still the global memory
> + space. Then a language that does support multiple memory spaces has to
> + explicitly indicate which pointers have the added ability to reference more
> + than the global memory space. However, compilers generating DWARF for
> + CUDA-like languages would then have to define every CUDA-like language pointer
> + type or reference type using ``DW_TAG_LLVM_address_class_type`` with a
> + ``DW_AT_address_class`` attribute of ``DW_ADDR_LLVM_generic`` to match the
> + language semantics.
> +
> + A new ``DW_AT_LLVM_address_space`` attribute could be defined that can be
> + applied to pointer type, reference type, subprogram, and subprogram type to
> + describe how objects having the given type are dereferenced or called (the
> + role that ``DW_AT_address_class`` currently provides). The values of
> + ``DW_AT_LLVM_address_space`` would be target architecture specific and the same as
> + used in ``DW_OP_xderef*``.
> +
> +.. _amdgpu-dwarf-debugging-information-entry-attributes:
> +
> +Debugging Information Entry Attributes
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +.. note::
> +
> + This section provides changes to existing debugger information entry
> + attributes and defines attributes added by the proposal. These would be
> + incorporated into the appropriate DWARF Version 5 chapter 2 sections.
> +
> +1. ``DW_AT_location``
> +
> + Any debugging information entry describing a data object (which includes
> + variables and parameters) or common blocks may have a ``DW_AT_location``
> + attribute, whose value is a DWARF expression E.
> +
> + The result of the attribute is obtained by evaluating E as a location
> + description in the context of the current subprogram, current program
> + location, and with an empty initial stack. See
> + :ref:`amdgpu-dwarf-expressions`.
> +
> + See :ref:`amdgpu-dwarf-control-flow-operations` for special evaluation rules
> + used by the ``DW_OP_call*`` operations.
> +
> + .. note::
> +
> + Delete the description of how the ``DW_OP_call*`` operations evaluate a
> + ``DW_AT_location`` attribute as that is now described in the operations.
> +
> + .. note::
> +
> + See the discussion about the ``DW_AT_location`` attribute in the
> + ``DW_OP_call*`` operation. Having each attribute only have a single
> + purpose and single execution semantics seems desirable. It makes it easier
> + for the consumer, which no longer has to track the context. It makes it
> + easier for the producer as it can rely on a single semantics for each
> + attribute.
> +
> + For that reason, limiting the ``DW_AT_location`` attribute to only
> + supporting evaluating the location description of an object, and using a
> + different attribute and encoding class for the evaluation of DWARF
> + expression *procedures* on the same operation expression stack seems
> + desirable.
> +
> +2. ``DW_AT_const_value``
> +
> + .. note::
> +
> + Could deprecate using the ``DW_AT_const_value`` attribute for
> + ``DW_TAG_variable`` or ``DW_TAG_formal_parameter`` debugger information
> + entries that have been optimized to a constant. Instead,
> + ``DW_AT_location`` could be used with a DWARF expression that produces an
> + implicit location description now that any location description can be
> + used within a DWARF expression. This allows the ``DW_OP_call*`` operations
> + to be used to push the location description of any variable regardless of
> + how it is optimized.
> +
> +3. ``DW_AT_frame_base``
> +
> + A ``DW_TAG_subprogram`` or ``DW_TAG_entry_point`` debugger information entry
> + may have a ``DW_AT_frame_base`` attribute, whose value is a DWARF expression
> + E.
> +
> + The result of the attribute is obtained by evaluating E as a location
> + description in the context of the current subprogram, current program
> + location, and with an empty initial stack.
> +
> + The DWARF is ill-formed if E contains a ``DW_OP_fbreg`` operation, or the
> + resulting location description L is not comprised of one single location
> + description SL.
> +
> + If SL is a register location description for register R, then L is replaced
> + with the result of evaluating a ``DW_OP_bregx R, 0`` operation. This
> + computes the frame base memory location description in the target
> + architecture default address space.
> +
> + *This allows the more compact* ``DW_OP_reg*`` *to be used instead of*
> + ``DW_OP_breg* 0``\ *.*
> +
> + .. note::
> +
> + This rule could be removed and require the producer to create the required
> + location description directly using ``DW_OP_call_frame_cfa``,
> + ``DW_OP_breg*``, or ``DW_OP_LLVM_aspace_bregx``. This would also then
> + allow a target to implement the call frames within a large register.
> +
> + Otherwise, the DWARF is ill-formed if SL is not a memory location
> + description in any of the target architecture specific address spaces.
> +
> + The resulting L is the *frame base* for the subprogram or entry point.
> +
> + *Typically, E will use the* ``DW_OP_call_frame_cfa`` *operation or be a
> + stack pointer register plus or minus some offset.*
> +
> +4. ``DW_AT_data_member_location``
> +
> + For a ``DW_AT_data_member_location`` attribute there are two cases:
> +
> + 1. If the attribute is an integer constant B, it provides the offset in
> + bytes from the beginning of the containing entity.
> +
> + The result of the attribute is obtained by evaluating a
> + ``DW_OP_LLVM_offset B`` operation with an initial stack comprising the
> + location description of the beginning of the containing entity. The
> + result of the evaluation is the location description of the base of the
> + member entry.
> +
> + *If the beginning of the containing entity is not byte aligned, then the
> + beginning of the member entry has the same bit displacement within a
> + byte.*
> +
> + 2. Otherwise, the attribute must be a DWARF expression E which is evaluated
> + with a context of the current frame, current program location, and an
> + initial stack comprising the location description of the beginning of
> + the containing entity. The result of the evaluation is the location
> + description of the base of the member entry.
> +
> + .. note::
> +
> + The beginning of the containing entity can now be any location
> + description, including those with more than one single location
> + description, and those with single location descriptions that are of any
> + kind and have any bit offset.
> +
> +5. ``DW_AT_use_location``
> +
> + The ``DW_TAG_ptr_to_member_type`` debugging information entry has a
> + ``DW_AT_use_location`` attribute whose value is a DWARF expression E. It is
> + used to compute the location description of the member of the class to which
> + the pointer to member entry points.
> +
> + *The method used to find the location description of a given member of a
> + class, structure, or union is common to any instance of that class,
> + structure, or union and to any instance of the pointer to member type. The
> + method is thus associated with the pointer to member type, rather than with
> + each object that has a pointer to member type.*
> +
> + The ``DW_AT_use_location`` DWARF expression is used in conjunction with the
> + location description for a particular object of the given pointer to member
> + type and for a particular structure or class instance.
> +
> + The result of the attribute is obtained by evaluating E as a location
> + description with the context of the current subprogram, current program
> + location, and an initial stack comprising two entries. The first entry is
> + the value of the pointer to member object itself. The second entry is the
> + location description of the base of the entire class, structure, or union
> + instance containing the member whose location is being calculated.
> +
> +6. ``DW_AT_data_location``
> +
> + The ``DW_AT_data_location`` attribute may be used with any type that
> + provides one or more levels of hidden indirection and/or run-time parameters
> + in its representation. Its value is a DWARF operation expression E which
> + computes the location description of the data for an object. When this
> + attribute is omitted, the location description of the data is the same as
> + the location description of the object.
> +
> + The result of the attribute is obtained by evaluating E as a location
> + description with the context of the current subprogram, current program
> + location, and an empty initial stack.
> +
> + *E will typically involve an operation expression that begins with a*
> + ``DW_OP_push_object_address`` *operation which loads the location
> + description of the object which can then serve as a description in
> + subsequent calculation.*
> +
> + .. note::
> +
> + Since ``DW_AT_data_member_location``, ``DW_AT_use_location``, and
> + ``DW_AT_vtable_elem_location`` allow both operation expressions and
> + location list expressions, why does ``DW_AT_data_location`` not allow
> + both? In all cases they apply to data objects so less likely that
> + optimization would cause different operation expressions for different
> + program location ranges. But if supporting for some then should be for
> + all.
> +
> + It seems odd this attribute is not the same as
> + ``DW_AT_data_member_location`` in having an initial stack with the
> + location description of the object since the expression has to need it.
> +
> +7. ``DW_AT_vtable_elem_location``
> +
> + An entry for a virtual function also has a ``DW_AT_vtable_elem_location``
> + attribute whose value is a DWARF expression E.
> +
> + The result of the attribute is obtained by evaluating E as a location
> + description with the context of the current subprogram, current program
> + location, and an initial stack comprising the location description of the
> + object of the enclosing type.
> +
> + The resulting location description is the slot for the function within the
> + virtual function table for the enclosing class.
> +
> +8. ``DW_AT_static_link``
> +
> + If a ``DW_TAG_subprogram`` or ``DW_TAG_entry_point`` debugger information
> + entry is lexically nested, it may have a ``DW_AT_static_link`` attribute,
> + whose value is a DWARF expression E.
> +
> + The result of the attribute is obtained by evaluating E as a location
> + description with the context of the current subprogram, current program
> + location, and an empty initial stack.
> +
> + The DWARF is ill-formed if the resulting location description L is not
> + comprised of one memory location description in any of the target
> + architecture specific address spaces.
> +
> + The resulting L is the *frame base* of the relevant instance of the
> + subprogram that immediately lexically encloses the subprogram or entry
> + point.
> +
> +9. ``DW_AT_return_addr``
> +
> + A ``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or
> + ``DW_TAG_entry_point`` debugger information entry may have a
> + ``DW_AT_return_addr`` attribute, whose value is a DWARF expression E.
> +
> + The result of the attribute is obtained by evaluating E as a location
> + description with the context of the current subprogram, current program
> + location, and an empty initial stack.
> +
> + The DWARF is ill-formed if the resulting location description L is not
> + comprised of one memory location description in any of the target architecture
> + specific address spaces.
> +
> + The resulting L is the place where the return address for the subprogram or
> + entry point is stored.
> +
> + .. note::
> +
> + It is unclear why ``DW_TAG_inlined_subroutine`` has a
> + ``DW_AT_return_addr`` attribute but not a ``DW_AT_frame_base`` or
> + ``DW_AT_static_link`` attribute. Seems it would either have all of them or
> + none. Since inlined subprograms do not have a frame it seems they would
> + have none of these attributes.
> +
> +10. ``DW_AT_call_value``, ``DW_AT_call_data_location``, and ``DW_AT_call_data_value``
> +
> + A ``DW_TAG_call_site_parameter`` debugger information entry may have a
> + ``DW_AT_call_value`` attribute, whose value is a DWARF operation expression
> + E\ :sub:`1`\ .
> +
> + The result of the ``DW_AT_call_value`` attribute is obtained by evaluating
> + E\ :sub:`1` as a value with the context of the call site subprogram, call
> + site program location, and an empty initial stack.
> +
> + The call site subprogram is the subprogram containing the
> + ``DW_TAG_call_site_parameter`` debugger information entry. The call site
> + program location is the location of the call site in the call site subprogram.
> +
> + *The consumer may have to virtually unwind to the call site in order to
> + evaluate the attribute. This will provide both the call site subprogram and
> + call site program location needed to evaluate the expression.*
> +
> + The resulting value V\ :sub:`1` is the value of the parameter at the time of
> + the call made by the call site.
> +
> + For parameters passed by reference, where the code passes a pointer to a
> + location which contains the parameter, or for reference type parameters, the
> + ``DW_TAG_call_site_parameter`` debugger information entry may also have a
> + ``DW_AT_call_data_location`` attribute whose value is a DWARF operation
> + expression E\ :sub:`2`\ , and a ``DW_AT_call_data_value`` attribute whose
> + value is a DWARF operation expression E\ :sub:`3`\ .
> +
> + The value of the ``DW_AT_call_data_location`` attribute is obtained by
> + evaluating E\ :sub:`2` as a location description with the context of the
> + call site subprogram, call site program location, and an empty initial
> + stack.
> +
> + The resulting location description L\ :sub:`2` is the location where the
> + referenced parameter lives during the call made by the call site. If E\
> + :sub:`2` would just be a ``DW_OP_push_object_address``, then the
> + ``DW_AT_call_data_location`` attribute may be omitted.
> +
> + The value of the ``DW_AT_call_data_value`` attribute is obtained by
> + evaluating E\ :sub:`3` as a value with the context of the call site
> + subprogram, call site program location, and an empty initial stack.
> +
> + The resulting value V\ :sub:`3` is the value in L\ :sub:`2` at the time of
> + the call made by the call site.
> +
> + If it is not possible to avoid the expressions of these attributes from
> + accessing registers or memory locations that might be clobbered by the
> + subprogram being called by the call site, then the associated attribute
> + should not be provided.
> +
> + *The reason for the restriction is that the parameter may need to be
> + accessed during the execution of the callee. The consumer may virtually
> + unwind from the called subprogram back to the caller and then evaluate the
> + attribute expressions. The call frame information (see*
> + :ref:`amdgpu-dwarf-call-frame-information`\ *) will not be able to restore
> + registers that have been clobbered, and clobbered memory will no longer have
> + the value at the time of the call.*
> +
> +11. ``DW_AT_LLVM_lanes`` *New*
> +
> + For languages that are implemented using a SIMD or SIMT execution model, a
> + ``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or
> + ``DW_TAG_entry_point`` debugger information entry may have a
> + ``DW_AT_LLVM_lanes`` attribute whose value is an integer constant that is
> + the number of lanes per thread. This is the static number of lanes per
> + thread. It is not the dynamic number of lanes with which the thread was
> + initiated, for example, due to smaller or partial work-groups.
> +
> + If not present, the default value of 1 is used.
> +
> + The DWARF is ill-formed if the value is 0.
> +
> +12. ``DW_AT_LLVM_lane_pc`` *New*
> +
> + For languages that are implemented using a SIMD or SIMT execution model, a
> + ``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or
> + ``DW_TAG_entry_point`` debugging information entry may have a
> + ``DW_AT_LLVM_lane_pc`` attribute whose value is a DWARF expression E.
> +
> + The result of the attribute is obtained by evaluating E as a location
> + description with the context of the current subprogram, current program
> + location, and an empty initial stack.
> +
> + The resulting location description L is for a thread lane count sized vector
> + of generic type elements. The thread lane count is the value of the
> + ``DW_AT_LLVM_lanes`` attribute. Each element holds the conceptual program
> + location of the corresponding lane, where the least significant element
> + corresponds to the first target architecture specific lane identifier and so
> + forth. If the lane was not active when the current subprogram was called,
> + its element is an undefined location description.
> +
> + ``DW_AT_LLVM_lane_pc`` *allows the compiler to indicate conceptually where
> + each lane of a SIMT thread is positioned even when it is in divergent
> + control flow that is not active.*
> +
> + *Typically, the result is a location description with one composite location
> + description with each part being a location description with either one
> + undefined location description or one memory location description.*
> +
> + If not present, the thread is not being used in a SIMT manner, and the
> + thread's current program location is used.
> +
> +13. ``DW_AT_LLVM_active_lane`` *New*
> +
> + For languages that are implemented using a SIMD or SIMT execution model, a
> + ``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or
> + ``DW_TAG_entry_point`` debugger information entry may have a
> + ``DW_AT_LLVM_active_lane`` attribute whose value is a DWARF expression E.
> +
> + The result of the attribute is obtained by evaluating E as a value with the
> + context of the current subprogram, current program location, and an empty
> + initial stack.
> +
> + The DWARF is ill-formed if the resulting value V is not an integral value.
> +
> + The resulting V is a bit mask of active lanes for the current program
> + location. The N\ :sup:`th` least significant bit of the mask corresponds to
> + the N\ :sup:`th` lane. If the bit is 1 the lane is active, otherwise it is
> + inactive.
> +
> + *Some targets may update the target architecture execution mask for regions
> + of code that must execute with different sets of lanes than the current
> + active lanes. For example, some code must execute with all lanes made
> + temporarily active.* ``DW_AT_LLVM_active_lane`` *allows the compiler to
> + provide the means to determine the source language active lanes.*
> +
> + If not present and ``DW_AT_LLVM_lanes`` is greater than 1, then the target
> + architecture execution mask is used.
> +
> +14. ``DW_AT_LLVM_vector_size`` *New*
> +
> + A ``DW_TAG_base_type`` debugger information entry for a base type T may have
> + a ``DW_AT_LLVM_vector_size`` attribute whose value is an integer constant
> + that is the vector type size N.
> +
> + The representation of a vector base type is as N contiguous elements, each
> + one having the representation of a base type T' that is the same as T
> + without the ``DW_AT_LLVM_vector_size`` attribute.
> +
> + If a ``DW_TAG_base_type`` debugger information entry does not have a
> + ``DW_AT_LLVM_vector_size`` attribute, then the base type is not a vector
> + type.
> +
> + The DWARF is ill-formed if N is not greater than 0.
> +
> + .. note::
> +
> + LLVM has mention of a non-upstreamed debugger information entry that is
> + intended to support vector types. However, that was not for a base type so
> + would not be suitable as the type of a stack value entry. But perhaps that
> + could be replaced by using this attribute.
> +
> +15. ``DW_AT_LLVM_augmentation`` *New*
> +
> + A ``DW_TAG_compile_unit`` debugger information entry for a compilation unit
> + may have a ``DW_AT_LLVM_augmentation`` attribute, whose value is an
> + augmentation string.
> +
> + *The augmentation string allows producers to indicate that there is
> + additional vendor or target specific information in the debugging
> + information entries. For example, this might be information about the
> + version of vendor specific extensions that are being used.*
> +
> + If not present, or if the string is empty, then the compilation unit has no
> + augmentation string.
> +
> + The format for the augmentation string is:
> +
> + | ``[``\ *vendor*\ ``:v``\ *X*\ ``.``\ *Y*\ [\ ``:``\ *options*\ ]\ ``]``\ *
> +
> + Where *vendor* is the producer, ``vX.Y`` specifies the major X and minor Y
> + version number of the extensions used, and *options* is an optional string
> + providing additional information about the extensions. The version number
> + must conform to semantic versioning [:ref:`SEMVER <amdgpu-dwarf-SEMVER>`].
> + The *options* string must not contain the "\ ``]``\ " character.
> +
> + For example:
> +
> + ::
> +
> + [abc:v0.0][def:v1.2:feature-a=on,feature-b=3]
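
To check my reading of the augmentation string format, here is a minimal Python sketch of a parser. It follows the example string quoted just above, so it assumes a ``:`` between the vendor name and the ``v``, and that the vendor name contains no ``:`` or bracket characters. The function name ``parse_augmentation`` is hypothetical, and the sketch does not enforce the full semantic versioning rules.

```python
import re

# One bracketed group: [vendor:vX.Y] or [vendor:vX.Y:options].
# Assumption: the vendor has no ':' or brackets; options may not contain ']'.
_GROUP = re.compile(r"\[([^:\[\]]+):v(\d+)\.(\d+)(?::([^\]]*))?\]")


def parse_augmentation(s):
    """Return a list of (vendor, major, minor, options) tuples.

    Raises ValueError if the string is not a sequence of well-formed groups.
    An empty string yields an empty list (no augmentation).
    """
    groups = []
    pos = 0
    while pos < len(s):
        m = _GROUP.match(s, pos)
        if m is None:
            raise ValueError(f"malformed augmentation string at offset {pos}")
        vendor, major, minor, options = m.groups()
        groups.append((vendor, int(major), int(minor), options))
        pos = m.end()
    return groups
```

Parsing group by group keeps multiple vendor extensions independent, which seems to be the point of the bracketed form.
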
> +
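
Regarding ``DW_AT_LLVM_active_lane`` (item 13 above): the bit mask rule can be sketched as follows. This assumes lanes are numbered from 0 with lane 0 in the least significant bit, which is my interpretation of "N-th least significant bit". The function name ``active_lanes`` is hypothetical.

```python
def active_lanes(mask, lane_count):
    """Return the indices of active lanes for a given active-lane bit mask.

    mask       -- integral value V produced by DW_AT_LLVM_active_lane
    lane_count -- value of DW_AT_LLVM_lanes (defaults to 1 when absent)
    Assumption: lane 0 corresponds to the least significant bit.
    """
    if lane_count < 1:
        raise ValueError("DW_AT_LLVM_lanes must not be 0")
    return [lane for lane in range(lane_count) if (mask >> lane) & 1]
```

For a 4-lane thread with mask ``0b1011``, lanes 0, 1, and 3 would be reported as active.
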
> +Program Scope Entities
> +----------------------
> +
> +.. _amdgpu-dwarf-language-names:
> +
> +Unit Entities
> +~~~~~~~~~~~~~
> +
> +.. note::
> +
> + This augments DWARF Version 5 section 3.1.1 and Table 3.1.
> +
> +Additional language codes defined for use with the ``DW_AT_language`` attribute
> +are defined in :ref:`amdgpu-dwarf-language-names-table`.
> +
> +.. table:: Language Names
> + :name: amdgpu-dwarf-language-names-table
> +
> + ==================== =============================
> + Language Name Meaning
> + ==================== =============================
> + ``DW_LANG_LLVM_HIP`` HIP Language.
> + ==================== =============================
> +
> +The HIP language [:ref:`HIP <amdgpu-dwarf-HIP>`] can be supported by extending
> +the C++ language.
> +
> +Other Debugger Information
> +--------------------------
> +
> +Accelerated Access
> +~~~~~~~~~~~~~~~~~~
> +
> +.. _amdgpu-dwarf-lookup-by-name:
> +
> +Lookup By Name
> +++++++++++++++
> +
> +Contents of the Name Index
> +##########################
> +
> +.. note::
> +
> + The following provides changes to DWARF Version 5 section 6.1.1.1.
> +
> + The rule for debugger information entries included in the name index in the
> + optional ``.debug_names`` section is extended to also include named
> + ``DW_TAG_variable`` debugging information entries with a ``DW_AT_location``
> + attribute that includes a ``DW_OP_LLVM_form_aspace_address`` operation.
> +
> +The name index must contain an entry for each debugging information entry that
> +defines a named subprogram, label, variable, type, or namespace, subject to the
> +following rules:
> +
> +* ``DW_TAG_variable`` debugging information entries with a ``DW_AT_location``
> + attribute that includes a ``DW_OP_addr``, ``DW_OP_LLVM_form_aspace_address``,
> + or ``DW_OP_form_tls_address`` operation are included; otherwise, they are
> + excluded.
> +
> +Data Representation of the Name Index
> +#####################################
> +
> +Section Header
> +^^^^^^^^^^^^^^
> +
> +.. note::
> +
> + The following provides an addition to DWARF Version 5 section 6.1.1.4.1 item
> + 14 ``augmentation_string``.
> +
> +A null-terminated UTF-8 vendor specific augmentation string, which provides
> +additional information about the contents of this index. If provided, the
> +recommended format for augmentation string is:
> +
> + | ``[``\ *vendor*\ ``:v``\ *X*\ ``.``\ *Y*\ [\ ``:``\ *options*\ ]\ ``]``\ *
> +
> +Where *vendor* is the producer, ``vX.Y`` specifies the major X and minor Y
> +version number of the extensions used in the DWARF of the compilation unit, and
> +*options* is an optional string providing additional information about the
> +extensions. The version number must conform to semantic versioning [:ref:`SEMVER
> +<amdgpu-dwarf-SEMVER>`]. The *options* string must not contain the "\ ``]``\ "
> +character.
> +
> +For example:
> +
> + ::
> +
> + [abc:v0.0][def:v1.2:feature-a=on,feature-b=3]
> +
> +.. note::
> +
> + This is different to the definition in DWARF Version 5 but is consistent with
> + the other augmentation strings and allows multiple vendor extensions to be
> + supported.
> +
> +.. _amdgpu-dwarf-line-number-information:
> +
> +Line Number Information
> +~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The Line Number Program Header
> +++++++++++++++++++++++++++++++
> +
> +Standard Content Descriptions
> +#############################
> +
> +.. note::
> +
> + This augments DWARF Version 5 section 6.2.4.1.
> +
> +.. _amdgpu-dwarf-line-number-information-dw-lnct-llvm-source:
> +
> +1. ``DW_LNCT_LLVM_source``
> +
> + The component is a null-terminated UTF-8 source text string with "\ ``\n``\
> + " line endings. This content code is paired with the same forms as
> + ``DW_LNCT_path``. It can be used for file name entries.
> +
> + The value is an empty null-terminated string if no source is available. If
> + the source is available but is an empty file then the value is a
> + null-terminated single "\ ``\n``\ ".
> +
> + *When the source field is present, consumers can use the embedded source
> + instead of attempting to discover the source on disk using the file path
> + provided by the* ``DW_LNCT_path`` *field. When the source field is absent,
> + consumers can access the file to get the source text.*
> +
> + *This is particularly useful for programming languages that support runtime
> + compilation and runtime generation of source text. In these cases, the
> + source text does not reside in any permanent file. For example, the OpenCL
> + language [:ref:`OpenCL <amdgpu-dwarf-OpenCL>`] supports online compilation.*
> +
> +2. ``DW_LNCT_LLVM_is_MD5``
> +
> + ``DW_LNCT_LLVM_is_MD5`` indicates if the ``DW_LNCT_MD5`` content kind, if
> + present, is valid: when 0 it is not valid and when 1 it is valid. If
> + ``DW_LNCT_LLVM_is_MD5`` content kind is not present, and ``DW_LNCT_MD5``
> + content kind is present, then the MD5 checksum is valid.
> +
> + ``DW_LNCT_LLVM_is_MD5`` is always paired with the ``DW_FORM_udata`` form.
> +
> + *This allows a compilation unit to have a mixture of files with and without
> + MD5 checksums. This can happen when multiple relocatable files are linked
> + together.*
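
The validity rule for ``DW_LNCT_LLVM_is_MD5`` reads to me like the following small decision function. The dictionary layout for a file name entry is hypothetical and only serves the illustration.

```python
def md5_is_valid(entry):
    """Decide whether the MD5 checksum of one file name entry is valid.

    entry -- dict mapping content kind names to decoded values
             (hypothetical representation of a line table file entry).
    """
    if "DW_LNCT_MD5" not in entry:
        return False  # no checksum present, so nothing can be valid
    # When DW_LNCT_LLVM_is_MD5 is absent, a present checksum is valid.
    return entry.get("DW_LNCT_LLVM_is_MD5", 1) == 1
```

This makes the backward-compatible default explicit: producers that never emit ``DW_LNCT_LLVM_is_MD5`` keep the DWARF Version 5 behavior.
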
> +
> +.. _amdgpu-dwarf-call-frame-information:
> +
> +Call Frame Information
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +.. note::
> +
> + This section provides changes to existing Call Frame Information and defines
> + instructions added by the proposal. Additional support is added for address
> + spaces. Register unwind DWARF expressions are generalized to allow any
> + location description, including those with composite and implicit location
> + descriptions.
> +
> + These changes would be incorporated into the DWARF Version 5 section 6.4.
> +
> +Structure of Call Frame Information
> ++++++++++++++++++++++++++++++++++++
> +
> +The register rules are:
> +
> +*undefined*
> + A register that has this rule has no recoverable value in the previous frame.
> + (By convention, it is not preserved by a callee.)
> +
> +*same value*
> + This register has not been modified from the previous frame. (By convention,
> + it is preserved by the callee, but the callee has not modified it.)
> +
> +*offset(N)*
> + N is a signed byte offset. The previous value of this register is saved at the
> + location description computed as if the DWARF operation expression
> + ``DW_OP_LLVM_offset N`` is evaluated as a location description with an initial
> + stack comprising the location description of the current CFA (see
> + :ref:`amdgpu-dwarf-operation-expressions`).
> +
> +*val_offset(N)*
> + N is a signed byte offset. The previous value of this register is the memory
> + byte address of the location description computed as if the DWARF operation
> + expression ``DW_OP_LLVM_offset N`` is evaluated as a location description with
> + an initial stack comprising the location description of the current CFA (see
> + :ref:`amdgpu-dwarf-operation-expressions`).
> +
> + The DWARF is ill-formed if the CFA location description is not a memory byte
> + address location description, or if the register size does not match the size
> + of an address in the address space of the current CFA location description.
> +
> + *Since the CFA location description is required to be a memory byte address
> + location description, the value of val_offset(N) will also be a memory byte
> + address location description since it is offsetting the CFA location
> + description by N bytes. Furthermore, the value of val_offset(N) will be a
> + memory byte address in the same address space as the CFA location
> + description.*
> +
> + .. note::
> +
> + Should DWARF allow the address size to be a different size to the size of
> + the register? Requiring them to be the same bit size avoids any issue of
> + conversion as the bit contents of the register is simply interpreted as a
> + value of the address.
> +
> + GDB has a per register hook that allows a target specific conversion on a
> + register by register basis. It defaults to truncation of bigger registers,
> + and to actually reading bytes from the next register (or reads out of bounds
> + for the last register) for smaller registers. There are no GDB tests that
> + read a register out of bounds (except an illegal hand written assembly
> + test).
> +
> +*register(R)*
> + The previous value of this register is stored in another register numbered R.
> +
> + The DWARF is ill-formed if the register sizes do not match.
> +
> +*expression(E)*
> + The previous value of this register is located at the location description
> + produced by evaluating the DWARF operation expression E (see
> + :ref:`amdgpu-dwarf-operation-expressions`).
> +
> + E is evaluated as a location description in the context of the current
> + subprogram, current program location, and with an initial stack comprising the
> + location description of the current CFA.
> +
> +*val_expression(E)*
> + The previous value of this register is the value produced by evaluating the
> + DWARF operation expression E (see :ref:`amdgpu-dwarf-operation-expressions`).
> +
> + E is evaluated as a value in the context of the current subprogram, current
> + program location, and with an initial stack comprising the location
> + description of the current CFA.
> +
> + The DWARF is ill-formed if the resulting value type size does not match the
> + register size.
> +
> + .. note::
> +
> + This has limited usefulness as the DWARF expression E can only produce
> + values up to the size of the generic type. This is due to not allowing any
> + operations that specify a type in a CFI operation expression. This makes it
> + unusable for registers that are larger than the generic type. However,
> + *expression(E)* can be used to create an implicit location description of
> + any size.
> +
> +*architectural*
> + The rule is defined externally to this specification by the augmenter.
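
To illustrate the difference between the simpler register rules above, in particular *offset(N)* versus *val_offset(N)*, here is a toy Python sketch. It is heavily simplified: the CFA is modelled as a plain integer byte address in a single address space, memory is a dictionary of saved values, and the expression rules are left out because they need a full DWARF expression evaluator. All names are hypothetical.

```python
def recover_register(rule, cfa, memory, regs, reg):
    """Recover the caller's value of register `reg` in a toy unwinder.

    rule   -- tuple such as ("offset", -8) or ("register", 3)
    cfa    -- integer byte address of the current CFA (one address space only)
    memory -- dict mapping byte address to a saved register value
    regs   -- dict mapping register number to its value in the current frame
    """
    kind = rule[0]
    if kind == "undefined":
        return None                   # no recoverable value
    if kind == "same_value":
        return regs[reg]              # not modified by the callee
    if kind == "offset":
        return memory[cfa + rule[1]]  # value is stored at CFA+N
    if kind == "val_offset":
        return cfa + rule[1]          # value is the address CFA+N itself
    if kind == "register":
        return regs[rule[1]]          # value lives in register R
    raise NotImplementedError(kind)   # expression rules are out of scope here
```

With a CFA of ``0x1000``, ``("offset", -8)`` reads the value saved at ``0xff8``, while ``("val_offset", -8)`` yields ``0xff8`` itself.
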
> +
> +A Common Information Entry holds information that is shared among many Frame
> +Description Entries. There is at least one CIE in every non-empty
> +``.debug_frame`` section. A CIE contains the following fields, in order:
> +
> +1. ``length`` (initial length)
> +
> + A constant that gives the number of bytes of the CIE structure, not
> + including the length field itself. The size of the length field plus the
> + value of length must be an integral multiple of the address size specified
> + in the ``address_size`` field.
> +
> +2. ``CIE_id`` (4 or 8 bytes, see
> + :ref:`amdgpu-dwarf-32-bit-and-64-bit-dwarf-formats`)
> +
> + A constant that is used to distinguish CIEs from FDEs.
> +
> + In the 32-bit DWARF format, the value of the CIE id in the CIE header is
> + 0xffffffff; in the 64-bit DWARF format, the value is 0xffffffffffffffff.
> +
> +3. ``version`` (ubyte)
> +
> + A version number. This number is specific to the call frame information and
> + is independent of the DWARF version number.
> +
> + The value of the CIE version number is 4.
> +
> + .. note::
> +
> + Would this be increased to 5 to reflect the changes in the proposal?
> +
> +4. ``augmentation`` (sequence of UTF-8 characters)
> +
> + A null-terminated UTF-8 string that identifies the augmentation to this CIE
> + or to the FDEs that use it. If a reader encounters an augmentation string
> + that is unexpected, then only the following fields can be read:
> +
> + * CIE: length, CIE_id, version, augmentation
> + * FDE: length, CIE_pointer, initial_location, address_range
> +
> + If there is no augmentation, this value is a zero byte.
> +
> + *The augmentation string allows users to indicate that there is additional
> + vendor and target architecture specific information in the CIE or FDE which
> + is needed to virtually unwind a stack frame. For example, this might be
> + information about dynamically allocated data which needs to be freed on exit
> + from the routine.*
> +
> + *Because the* ``.debug_frame`` *section is useful independently of any*
> + ``.debug_info`` *section, the augmentation string always uses UTF-8
> + encoding.*
> +
> + The recommended format for the augmentation string is:
> +
> + | ``[``\ *vendor*\ ``:v``\ *X*\ ``.``\ *Y*\ [\ ``:``\ *options*\ ]\ ``]``\ *
> +
> + Where *vendor* is the producer, ``vX.Y`` specifies the major X and minor Y
> + version number of the extensions used, and *options* is an optional string
> + providing additional information about the extensions. The version number
> + must conform to semantic versioning [:ref:`SEMVER <amdgpu-dwarf-SEMVER>`].
> + The *options* string must not contain the "\ ``]``\ " character.
> +
> + For example:
> +
> + ::
> +
> + [abc:v0.0][def:v1.2:feature-a=on,feature-b=3]
> +
> +5. ``address_size`` (ubyte)
> +
> + The size of a target address in this CIE and any FDEs that use it, in bytes.
> + If a compilation unit exists for this frame, its address size must match the
> + address size here.
> +
> +6. ``segment_selector_size`` (ubyte)
> +
> + The size of a segment selector in this CIE and any FDEs that use it, in
> + bytes.
> +
> +7. ``code_alignment_factor`` (unsigned LEB128)
> +
> + A constant that is factored out of all advance location instructions (see
> + :ref:`amdgpu-dwarf-row-creation-instructions`). The resulting value is
> + ``(operand * code_alignment_factor)``.
> +
> +8. ``data_alignment_factor`` (signed LEB128)
> +
> + A constant that is factored out of certain offset instructions (see
> + :ref:`amdgpu-dwarf-cfa-definition-instructions` and
> + :ref:`amdgpu-dwarf-register-rule-instructions`). The resulting value is
> + ``(operand * data_alignment_factor)``.
> +
> +9. ``return_address_register`` (unsigned LEB128)
> +
> + An unsigned LEB128 constant that indicates which column in the rule table
> + represents the return address of the subprogram. Note that this column might
> + not correspond to an actual machine register.
> +
> +10. ``initial_instructions`` (array of ubyte)
> +
> + A sequence of rules that are interpreted to create the initial setting of
> + each column in the table.
> +
> + The default rule for all columns before interpretation of the initial
> + instructions is the undefined rule. However, an ABI authoring body or a
> + compilation system authoring body may specify an alternate default value for
> + any or all columns.
> +
> +11. ``padding`` (array of ubyte)
> +
> + Enough ``DW_CFA_nop`` instructions to make the size of this entry match the
> + length value above.
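
As a sanity check on how ``data_alignment_factor`` is applied to a signed factored operand, here is a short Python sketch that decodes a signed LEB128 value and scales it. The helper names are hypothetical, and the decoder does no bounds or overlong-encoding checks.

```python
def decode_sleb128(data, pos=0):
    """Decode one signed LEB128 value from bytes; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):    # high bit clear marks the final byte
            break
    if byte & 0x40:              # sign bit of the final byte
        result -= 1 << shift     # sign-extend
    return result, pos


def unfactored_offset(operand_bytes, data_alignment_factor):
    """Return operand * data_alignment_factor for a signed factored operand."""
    factored, _ = decode_sleb128(operand_bytes)
    return factored * data_alignment_factor
```

With a ``data_alignment_factor`` of -8, the single operand byte ``0x7e`` (which decodes to -2) gives a byte displacement of 16.
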
> +
> +An FDE contains the following fields, in order:
> +
> +1. ``length`` (initial length)
> +
> + A constant that gives the number of bytes of the header and instruction
> + stream for this subprogram, not including the length field itself. The size
> + of the length field plus the value of length must be an integral multiple of
> + the address size.
> +
> +2. ``CIE_pointer`` (4 or 8 bytes, see
> + :ref:`amdgpu-dwarf-32-bit-and-64-bit-dwarf-formats`)
> +
> + A constant offset into the ``.debug_frame`` section that denotes the CIE
> + that is associated with this FDE.
> +
> +3. ``initial_location`` (segment selector and target address)
> +
> + The address of the first location associated with this table entry. If the
> + segment_selector_size field of this FDE’s CIE is non-zero, the initial
> + location is preceded by a segment selector of the given length.
> +
> +4. ``address_range`` (target address)
> +
> + The number of bytes of program instructions described by this entry.
> +
> +5. ``instructions`` (array of ubyte)
> +
> + A sequence of table defining instructions that are described in
> + :ref:`amdgpu-dwarf-call-frame-instructions`.
> +
> +6. ``padding`` (array of ubyte)
> +
> + Enough ``DW_CFA_nop`` instructions to make the size of this entry match the
> + length value above.
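
The alignment requirement on ``length`` applies to both CIEs and FDEs, and the padding it implies can be sketched as below. This assumes the initial length field occupies 4 bytes in the 32-bit DWARF format and 12 bytes in the 64-bit format, and that each ``DW_CFA_nop`` is one byte. The function name is hypothetical.

```python
def cfi_entry_padding(body_size, address_size, dwarf64=False):
    """Number of DW_CFA_nop bytes needed after body_size bytes of entry content.

    body_size    -- bytes of the entry after the length field, before padding
    address_size -- address size in bytes that the total must be a multiple of
    dwarf64      -- True for the 64-bit DWARF format
    """
    # Initial length: 4 bytes, or a 4-byte escape followed by 8 bytes.
    length_field = 12 if dwarf64 else 4
    total = length_field + body_size
    return (-total) % address_size
```

The value stored in ``length`` is then ``body_size`` plus the returned padding.
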
> +
> +.. _amdgpu-dwarf-call-frame-instructions:
> +
> +Call Frame Instructions
> ++++++++++++++++++++++++
> +
> +Some call frame instructions have operands that are encoded as DWARF operation
> +expressions E (see :ref:`amdgpu-dwarf-operation-expressions`). The DWARF
> +operations that can be used in E have the following restrictions:
> +
> +* ``DW_OP_addrx``, ``DW_OP_call2``, ``DW_OP_call4``, ``DW_OP_call_ref``,
> + ``DW_OP_const_type``, ``DW_OP_constx``, ``DW_OP_convert``,
> + ``DW_OP_deref_type``, ``DW_OP_fbreg``, ``DW_OP_implicit_pointer``,
> + ``DW_OP_regval_type``, ``DW_OP_reinterpret``, and ``DW_OP_xderef_type``
> + operations are not allowed because the call frame information must not depend
> + on other debug sections.
> +
> +* ``DW_OP_push_object_address`` is not allowed because there is no object
> + context to provide a value to push.
> +
> +* ``DW_OP_LLVM_push_lane`` is not allowed because the call frame instructions
> + describe the actions for the whole thread, not the lanes independently.
> +
> +* ``DW_OP_call_frame_cfa`` and ``DW_OP_entry_value`` are not allowed because
> + their use would be circular.
> +
> +* ``DW_OP_LLVM_call_frame_entry_reg`` is not allowed if evaluating E causes a
> + circular dependency between ``DW_OP_LLVM_call_frame_entry_reg`` operations.
> +
> + *For example, if a register R1 has a* ``DW_CFA_def_cfa_expression``
> + *instruction that evaluates a* ``DW_OP_LLVM_call_frame_entry_reg`` *operation
> + that specifies register R2, and register R2 has a*
> + ``DW_CFA_def_cfa_expression`` *instruction that evaluates a*
> + ``DW_OP_LLVM_call_frame_entry_reg`` *operation that specifies register R1.*
> +
> +*Call frame instructions to which these restrictions apply include*
> +``DW_CFA_def_cfa_expression``\ *,* ``DW_CFA_expression``\ *, and*
> +``DW_CFA_val_expression``\ *.*
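
The restrictions above can be sketched as a simple check. This is only an illustration, assuming an expression is given as a list of operation names; `disallowed_ops` and `has_entry_reg_cycle` are hypothetical helpers, not part of any LLVM or DWARF library.

```python
# Operation names are plain strings here; no real DWARF parser is involved.
NEEDS_OTHER_SECTIONS = {
    "DW_OP_addrx", "DW_OP_call2", "DW_OP_call4", "DW_OP_call_ref",
    "DW_OP_const_type", "DW_OP_constx", "DW_OP_convert",
    "DW_OP_deref_type", "DW_OP_fbreg", "DW_OP_implicit_pointer",
    "DW_OP_regval_type", "DW_OP_reinterpret", "DW_OP_xderef_type",
}
NO_CONTEXT = {"DW_OP_push_object_address", "DW_OP_LLVM_push_lane"}
CIRCULAR = {"DW_OP_call_frame_cfa", "DW_OP_entry_value"}
DISALLOWED = NEEDS_OTHER_SECTIONS | NO_CONTEXT | CIRCULAR


def disallowed_ops(expr_ops):
    """Return the operations in expr_ops that may not appear in E."""
    return [op for op in expr_ops if op in DISALLOWED]


def has_entry_reg_cycle(deps):
    """deps maps a register to the registers its rule reads through
    DW_OP_LLVM_call_frame_entry_reg. Return True if the graph has a cycle."""
    visiting, done = set(), set()

    def visit(reg):
        if reg in done:
            return False
        if reg in visiting:
            return True  # reached a register that is still being expanded
        visiting.add(reg)
        cyclic = any(visit(r) for r in deps.get(reg, ()))
        visiting.discard(reg)
        done.add(reg)
        return cyclic

    return any(visit(reg) for reg in deps)
```

With the R1/R2 example from the text, `has_entry_reg_cycle({1: [2], 2: [1]})` reports a cycle, while `{1: [2]}` alone does not.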
> +
> +.. _amdgpu-dwarf-row-creation-instructions:
> +
> +Row Creation Instructions
> +#########################
> +
> +.. note::
> +
> + These instructions are the same as in DWARF Version 5 section 6.4.2.1.
> +
> +.. _amdgpu-dwarf-cfa-definition-instructions:
> +
> +CFA Definition Instructions
> +###########################
> +
> +1. ``DW_CFA_def_cfa``
> +
> + The ``DW_CFA_def_cfa`` instruction takes two unsigned LEB128 operands
> + representing a register number R and a (non-factored) byte displacement B.
> + AS is set to the target architecture default address space identifier. The
> + required action is to define the current CFA rule to be the result of
> + evaluating the DWARF operation expression ``DW_OP_constu AS;
> + DW_OP_LLVM_aspace_bregx R, B`` as a location description.
> +
> +2. ``DW_CFA_def_cfa_sf``
> +
> + The ``DW_CFA_def_cfa_sf`` instruction takes two operands: an unsigned LEB128
> + value representing a register number R and a signed LEB128 factored byte
> + displacement B. AS is set to the target architecture default address space
> + identifier. The required action is to define the current CFA rule to be the
> + result of evaluating the DWARF operation expression ``DW_OP_constu AS;
> + DW_OP_LLVM_aspace_bregx R, B*data_alignment_factor`` as a location description.
> +
> + *The action is the same as* ``DW_CFA_def_cfa`` *except that the second
> + operand is signed and factored.*
> +
> +3. ``DW_CFA_def_aspace_cfa`` *New*
> +
> + The ``DW_CFA_def_aspace_cfa`` instruction takes three unsigned LEB128
> + operands representing a register number R, a (non-factored) byte
> + displacement B, and a target architecture specific address space identifier
> + AS. The required action is to define the current CFA rule to be the result
> + of evaluating the DWARF operation expression ``DW_OP_constu AS;
> + DW_OP_LLVM_aspace_bregx R, B`` as a location description.
> +
> + If AS is not one of the values defined by the target architecture specific
> + ``DW_ASPACE_*`` values then the DWARF expression is ill-formed.
> +
> +4. ``DW_CFA_def_aspace_cfa_sf`` *New*
> +
> + The ``DW_CFA_def_aspace_cfa_sf`` instruction takes three operands: an unsigned
> + LEB128 value representing a register number R, a signed LEB128 factored byte
> + displacement B, and an unsigned LEB128 value representing a target
> + architecture specific address space identifier AS. The required action is to
> + define the current CFA rule to be the result of evaluating the DWARF
> + operation expression ``DW_OP_constu AS; DW_OP_LLVM_aspace_bregx R,
> + B*data_alignment_factor`` as a location description.
> +
> + If AS is not one of the values defined by the target architecture specific
> + ``DW_ASPACE_*`` values, then the DWARF expression is ill-formed.
> +
> + *The action is the same as* ``DW_CFA_def_aspace_cfa`` *except that the
> + second operand is signed and factored.*
> +
> +5. ``DW_CFA_def_cfa_register``
> +
> + The ``DW_CFA_def_cfa_register`` instruction takes a single unsigned LEB128
> + operand representing a register number R. The required action is to define
> + the current CFA rule to be the result of evaluating the DWARF operation
> + expression ``DW_OP_constu AS; DW_OP_LLVM_aspace_bregx R, B`` as a location
> + description. B and AS are the old CFA byte displacement and address space
> + respectively.
> +
> + If the subprogram has no current CFA rule, or the rule was defined by a
> + ``DW_CFA_def_cfa_expression`` instruction, then the DWARF is ill-formed.
> +
> +6. ``DW_CFA_def_cfa_offset``
> +
> + The ``DW_CFA_def_cfa_offset`` instruction takes a single unsigned LEB128
> + operand representing a (non-factored) byte displacement B. The required
> + action is to define the current CFA rule to be the result of evaluating the
> + DWARF operation expression ``DW_OP_constu AS; DW_OP_LLVM_aspace_bregx R, B`` as a
> + location description. R and AS are the old CFA register number and address
> + space respectively.
> +
> + If the subprogram has no current CFA rule, or the rule was defined by a
> + ``DW_CFA_def_cfa_expression`` instruction, then the DWARF is ill-formed.
> +
> +7. ``DW_CFA_def_cfa_offset_sf``
> +
> + The ``DW_CFA_def_cfa_offset_sf`` instruction takes a signed LEB128 operand
> + representing a factored byte displacement B. The required action is to
> + define the current CFA rule to be the result of evaluating the DWARF
> + operation expression ``DW_OP_constu AS; DW_OP_LLVM_aspace_bregx R,
> + B*data_alignment_factor`` as a location description. R and AS are the old
> + CFA register number and address space respectively.
> +
> + If the subprogram has no current CFA rule, or the rule was defined by a
> + ``DW_CFA_def_cfa_expression`` instruction, then the DWARF is ill-formed.
> +
> + *The action is the same as* ``DW_CFA_def_cfa_offset`` *except that the
> + operand is signed and factored.*
> +
> +8. ``DW_CFA_def_cfa_expression``
> +
> + The ``DW_CFA_def_cfa_expression`` instruction takes a single operand encoded
> + as a ``DW_FORM_exprloc`` value representing a DWARF operation expression E.
> + The required action is to define the current CFA rule to be the result of
> + evaluating E as a location description in the context of the current
> + subprogram, current program location, and an empty initial stack.
> +
> + *See* :ref:`amdgpu-dwarf-call-frame-instructions` *regarding restrictions on
> + the DWARF expression operations that can be used in E.*
> +
> + The DWARF is ill-formed if the result of evaluating E is not a memory byte
> + address location description.
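
The register-based CFA definition instructions above all update a (register, displacement, address space) triple. A minimal sketch of that bookkeeping follows, assuming the default address space can be stood in for by the constant `DEFAULT_ASPACE` and that `None` represents both "no current rule" and "rule defined by ``DW_CFA_def_cfa_expression``". All names are hypothetical.

```python
from dataclasses import dataclass, replace

DEFAULT_ASPACE = 0  # assumption: placeholder for the target default address space


@dataclass(frozen=True)
class CfaRule:
    reg: int     # R
    offset: int  # B, already multiplied by data_alignment_factor if factored
    aspace: int  # AS


def def_cfa(reg, offset):
    return CfaRule(reg, offset, DEFAULT_ASPACE)


def def_cfa_sf(reg, factored, data_alignment_factor):
    return CfaRule(reg, factored * data_alignment_factor, DEFAULT_ASPACE)


def def_aspace_cfa(reg, offset, aspace):
    return CfaRule(reg, offset, aspace)


def def_aspace_cfa_sf(reg, factored, aspace, data_alignment_factor):
    return CfaRule(reg, factored * data_alignment_factor, aspace)


def _require(rule):
    # None models "no rule" and "expression-defined rule"; both are ill-formed here.
    if rule is None:
        raise ValueError("ill-formed: no register-based CFA rule to update")
    return rule


def def_cfa_register(rule, reg):
    return replace(_require(rule), reg=reg)  # keeps old B and AS


def def_cfa_offset(rule, offset):
    return replace(_require(rule), offset=offset)  # keeps old R and AS


def def_cfa_offset_sf(rule, factored, data_alignment_factor):
    return replace(_require(rule), offset=factored * data_alignment_factor)
```

Note that only the `aspace` variants change AS; the register and offset updates carry the previous address space forward, as the text describes.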
> +
> +.. _amdgpu-dwarf-register-rule-instructions:
> +
> +Register Rule Instructions
> +##########################
> +
> +1. ``DW_CFA_undefined``
> +
> + The ``DW_CFA_undefined`` instruction takes a single unsigned LEB128 operand
> + that represents a register number R. The required action is to set the rule
> + for the register specified by R to ``undefined``.
> +
> +2. ``DW_CFA_same_value``
> +
> + The ``DW_CFA_same_value`` instruction takes a single unsigned LEB128 operand
> + that represents a register number R. The required action is to set the rule
> + for the register specified by R to ``same value``.
> +
> +3. ``DW_CFA_offset``
> +
> + The ``DW_CFA_offset`` instruction takes two operands: a register number R
> + (encoded with the opcode) and an unsigned LEB128 constant representing a
> + factored displacement B. The required action is to change the rule for the
> + register specified by R to be an *offset(B\*data_alignment_factor)* rule.
> +
> + .. note::
> +
> + Seems this should be named ``DW_CFA_offset_uf`` since the offset is
> + unsigned factored.
> +
> +4. ``DW_CFA_offset_extended``
> +
> + The ``DW_CFA_offset_extended`` instruction takes two unsigned LEB128
> + operands representing a register number R and a factored displacement B.
> + This instruction is identical to ``DW_CFA_offset`` except for the encoding
> + and size of the register operand.
> +
> + .. note::
> +
> + Seems this should be named ``DW_CFA_offset_extended_uf`` since the
> + displacement is unsigned factored.
> +
> +5. ``DW_CFA_offset_extended_sf``
> +
> + The ``DW_CFA_offset_extended_sf`` instruction takes two operands: an
> + unsigned LEB128 value representing a register number R and a signed LEB128
> + factored displacement B. This instruction is identical to
> + ``DW_CFA_offset_extended`` except that B is signed.
> +
> +6. ``DW_CFA_val_offset``
> +
> + The ``DW_CFA_val_offset`` instruction takes two unsigned LEB128 operands
> + representing a register number R and a factored displacement B. The required
> + action is to change the rule for the register indicated by R to be a
> + *val_offset(B\*data_alignment_factor)* rule.
> +
> + .. note::
> +
> + Seems this should be named ``DW_CFA_val_offset_uf`` since the displacement
> + is unsigned factored.
> +
> + .. note::
> +
> + An alternative is to define ``DW_CFA_val_offset`` to implicitly use the
> + target architecture default address space, and add another operation that
> + specifies the address space.
> +
> +7. ``DW_CFA_val_offset_sf``
> +
> + The ``DW_CFA_val_offset_sf`` instruction takes two operands: an unsigned
> + LEB128 value representing a register number R and a signed LEB128 factored
> + displacement B. This instruction is identical to ``DW_CFA_val_offset``
> + except that B is signed.
> +
> +8. ``DW_CFA_register``
> +
> + The ``DW_CFA_register`` instruction takes two unsigned LEB128 operands
> + representing register numbers R1 and R2 respectively. The required action is
> + to set the rule for the register specified by R1 to be a *register(R2)* rule.
> +
> +9. ``DW_CFA_expression``
> +
> + The ``DW_CFA_expression`` instruction takes two operands: an unsigned LEB128
> + value representing a register number R, and a ``DW_FORM_block`` value
> + representing a DWARF operation expression E. The required action is to
> + change the rule for the register specified by R to be an *expression(E)*
> + rule.
> +
> + *That is, E computes the location description where the register value can
> + be retrieved.*
> +
> + *See* :ref:`amdgpu-dwarf-call-frame-instructions` *regarding restrictions on
> + the DWARF expression operations that can be used in E.*
> +
> +10. ``DW_CFA_val_expression``
> +
> + The ``DW_CFA_val_expression`` instruction takes two operands: an unsigned
> + LEB128 value representing a register number R, and a ``DW_FORM_block`` value
> + representing a DWARF operation expression E. The required action is to
> + change the rule for the register specified by R to be a *val_expression(E)*
> + rule.
> +
> + *That is, E computes the value of register R.*
> +
> + *See* :ref:`amdgpu-dwarf-call-frame-instructions` *regarding restrictions on
> + the DWARF expression operations that can be used in E.*
> +
> + If the result of evaluating E is not a value with a base type size that
> + matches the register size, then the DWARF is ill-formed.
> +
> +11. ``DW_CFA_restore``
> +
> + The ``DW_CFA_restore`` instruction takes a single operand (encoded with the
> + opcode) that represents a register number R. The required action is to
> + change the rule for the register specified by R to the rule assigned it by
> + the ``initial_instructions`` in the CIE.
> +
> +12. ``DW_CFA_restore_extended``
> +
> + The ``DW_CFA_restore_extended`` instruction takes a single unsigned LEB128
> + operand that represents a register number R. This instruction is identical
> + to ``DW_CFA_restore`` except for the encoding and size of the register
> + operand.
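
The register rule instructions can be pictured as updates to a per-row table mapping register numbers to rules. The sketch below is only an illustration: rules are modelled as tuples, `apply_rule` is a hypothetical helper, and dropping the entry when the CIE assigned no rule is an assumption about how a consumer might fall back to its default.

```python
FACTORED_OFFSET = {"DW_CFA_offset", "DW_CFA_offset_extended",
                   "DW_CFA_offset_extended_sf"}
FACTORED_VAL_OFFSET = {"DW_CFA_val_offset", "DW_CFA_val_offset_sf"}


def apply_rule(rules, cie_initial, insn, reg, operand=None,
               data_alignment_factor=1):
    """Return a new register-rule table with one instruction applied.

    rules and cie_initial map register numbers to rule tuples; cie_initial
    holds the rules set up by the CIE initial_instructions.
    """
    new = dict(rules)
    if insn == "DW_CFA_undefined":
        new[reg] = ("undefined",)
    elif insn == "DW_CFA_same_value":
        new[reg] = ("same value",)
    elif insn in FACTORED_OFFSET:
        new[reg] = ("offset", operand * data_alignment_factor)
    elif insn in FACTORED_VAL_OFFSET:
        new[reg] = ("val_offset", operand * data_alignment_factor)
    elif insn == "DW_CFA_register":
        new[reg] = ("register", operand)
    elif insn == "DW_CFA_expression":
        new[reg] = ("expression", operand)
    elif insn == "DW_CFA_val_expression":
        new[reg] = ("val_expression", operand)
    elif insn in ("DW_CFA_restore", "DW_CFA_restore_extended"):
        if reg in cie_initial:
            new[reg] = cie_initial[reg]
        else:
            new.pop(reg, None)  # assumption: no CIE rule means "consumer default"
    else:
        raise ValueError(f"not a register rule instruction: {insn}")
    return new
```

The `_sf` and non-`_sf` variants share a branch because they differ only in how the operand is encoded, not in the rule they produce.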
> +
> +Row State Instructions
> +######################
> +
> +.. note::
> +
> + These instructions are the same as in DWARF Version 5 section 6.4.2.4.
> +
> +Padding Instruction
> +###################
> +
> +.. note::
> +
> + These instructions are the same as in DWARF Version 5 section 6.4.2.5.
> +
> +Call Frame Instruction Usage
> +++++++++++++++++++++++++++++
> +
> +.. note::
> +
> + The same as in DWARF Version 5 section 6.4.3.
> +
> +.. _amdgpu-dwarf-call-frame-calling-address:
> +
> +Call Frame Calling Address
> +++++++++++++++++++++++++++
> +
> +.. note::
> +
> + The same as in DWARF Version 5 section 6.4.4.
> +
> +Data Representation
> +-------------------
> +
> +.. _amdgpu-dwarf-32-bit-and-64-bit-dwarf-formats:
> +
> +32-Bit and 64-Bit DWARF Formats
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +.. note::
> +
> + This augments DWARF Version 5 section 7.4.
> +
> +1. Within the body of the ``.debug_info`` section, certain forms of attribute
> + value depend on the choice of DWARF format as follows. For the 32-bit DWARF
> + format, the value is a 4-byte unsigned integer; for the 64-bit DWARF format,
> + the value is an 8-byte unsigned integer.
> +
> + .. table:: ``.debug_info`` section attribute form roles
> + :name: amdgpu-dwarf-debug-info-section-attribute-form-roles-table
> +
> + ================================== ===================================
> + Form Role
> + ================================== ===================================
> + DW_FORM_line_strp offset in ``.debug_line_str``
> + DW_FORM_ref_addr offset in ``.debug_info``
> + DW_FORM_sec_offset offset in a section other than
> + ``.debug_info`` or ``.debug_str``
> + DW_FORM_strp offset in ``.debug_str``
> + DW_FORM_strp_sup offset in ``.debug_str`` section of
> + supplementary object file
> + DW_OP_call_ref offset in ``.debug_info``
> + DW_OP_implicit_pointer offset in ``.debug_info``
> + DW_OP_LLVM_aspace_implicit_pointer offset in ``.debug_info``
> + ================================== ===================================
> +
> +Format of Debugging Information
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Attribute Encodings
> ++++++++++++++++++++
> +
> +.. note::
> +
> + This augments DWARF Version 5 section 7.5.4 and Table 7.5.
> +
> +The following table gives the encoding of the additional debugging information
> +entry attributes.
> +
> +.. table:: Attribute encodings
> + :name: amdgpu-dwarf-attribute-encodings-table
> +
> + ================================== ===== ====================================
> + Attribute Name Value Classes
> + ================================== ===== ====================================
> + DW_AT_LLVM_active_lane *TBD* exprloc, loclist
> + DW_AT_LLVM_augmentation *TBD* string
> + DW_AT_LLVM_lanes *TBD* constant
> + DW_AT_LLVM_lane_pc *TBD* exprloc, loclist
> + DW_AT_LLVM_vector_size *TBD* constant
> + ================================== ===== ====================================
> +
> +DWARF Expressions
> +~~~~~~~~~~~~~~~~~
> +
> +.. note::
> +
> + Rename DWARF Version 5 section 7.7 to reflect the unification of location
> + descriptions into DWARF expressions.
> +
> +Operation Expressions
> ++++++++++++++++++++++
> +
> +.. note::
> +
> + Rename DWARF Version 5 section 7.7.1 and delete section 7.7.2 to reflect the
> + unification of location descriptions into DWARF expressions.
> +
> + This augments DWARF Version 5 section 7.7.1 and Table 7.9.
> +
> +The following table gives the encoding of the additional DWARF expression
> +operations.
> +
> +.. table:: DWARF Operation Encodings
> + :name: amdgpu-dwarf-operation-encodings-table
> +
> + ================================== ===== ======== ===============================
> + Operation Code Number Notes
> + of
> + Operands
> + ================================== ===== ======== ===============================
> + DW_OP_LLVM_form_aspace_address 0xe1 0
> + DW_OP_LLVM_push_lane 0xe2 0
> + DW_OP_LLVM_offset 0xe3 0
> + DW_OP_LLVM_offset_constu 0xe4 1 ULEB128 byte displacement
> + DW_OP_LLVM_bit_offset 0xe5 0
> + DW_OP_LLVM_call_frame_entry_reg 0xe6 1 ULEB128 register number
> + DW_OP_LLVM_undefined 0xe7 0
> + DW_OP_LLVM_aspace_bregx 0xe8 2 ULEB128 register number,
> + ULEB128 byte displacement
> + DW_OP_LLVM_aspace_implicit_pointer 0xe9 2 4- or 8-byte offset of DIE,
> + SLEB128 byte displacement
> + DW_OP_LLVM_piece_end 0xea 0
> + DW_OP_LLVM_extend 0xeb 2 ULEB128 bit size,
> + ULEB128 count
> + DW_OP_LLVM_select_bit_piece 0xec 2 ULEB128 bit size,
> + ULEB128 count
> + ================================== ===== ======== ===============================
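
To make the operand layout in the table concrete, here is a small decoder sketch. It assumes the opcode values listed above (which may change in a later revision of the proposal) and covers only the operations whose operands are all ULEB128; ``DW_OP_LLVM_aspace_implicit_pointer`` is left out because its first operand depends on the 32-bit or 64-bit DWARF format. `read_uleb128` and `decode` are hypothetical helpers.

```python
# opcode -> (name, number of ULEB128 operands), taken from the table above
ULEB_ONLY_OPS = {
    0xe1: ("DW_OP_LLVM_form_aspace_address", 0),
    0xe2: ("DW_OP_LLVM_push_lane", 0),
    0xe3: ("DW_OP_LLVM_offset", 0),
    0xe4: ("DW_OP_LLVM_offset_constu", 1),
    0xe5: ("DW_OP_LLVM_bit_offset", 0),
    0xe6: ("DW_OP_LLVM_call_frame_entry_reg", 1),
    0xe7: ("DW_OP_LLVM_undefined", 0),
    0xe8: ("DW_OP_LLVM_aspace_bregx", 2),
    0xea: ("DW_OP_LLVM_piece_end", 0),
    0xeb: ("DW_OP_LLVM_extend", 2),
    0xec: ("DW_OP_LLVM_select_bit_piece", 2),
}


def read_uleb128(data, pos):
    """Decode one ULEB128 value starting at pos; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7f) << shift
        shift += 7
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos


def decode(data):
    """Decode a byte string made up only of the operations in ULEB_ONLY_OPS."""
    pos, out = 0, []
    while pos < len(data):
        name, count = ULEB_ONLY_OPS[data[pos]]  # KeyError on any other opcode
        pos += 1
        operands = []
        for _ in range(count):
            value, pos = read_uleb128(data, pos)
            operands.append(value)
        out.append((name, operands))
    return out
```

For example, the bytes `e8 07 c8 01` decode to ``DW_OP_LLVM_aspace_bregx`` with register 7 and displacement 200.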
> +
> +Location List Expressions
> ++++++++++++++++++++++++++
> +
> +.. note::
> +
> + Rename DWARF Version 5 section 7.7.3 to reflect that location lists are a kind
> + of DWARF expression.
> +
> +Source Languages
> +~~~~~~~~~~~~~~~~
> +
> +.. note::
> +
> + This augments DWARF Version 5 section 7.12 and Table 7.17.
> +
> +The following table gives the encoding of the additional DWARF languages.
> +
> +.. table:: Language encodings
> + :name: amdgpu-dwarf-language-encodings-table
> +
> + ==================== ====== ===================
> + Language Name Value Default Lower Bound
> + ==================== ====== ===================
> + ``DW_LANG_LLVM_HIP`` 0x8100 0
> + ==================== ====== ===================
> +
> +Address Class and Address Space Encodings
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +.. note::
> +
> + This replaces DWARF Version 5 section 7.13.
> +
> +The encodings of the constants used for the currently defined address classes
> +are given in :ref:`amdgpu-dwarf-address-class-encodings-table`.
> +
> +.. table:: Address class encodings
> + :name: amdgpu-dwarf-address-class-encodings-table
> +
> + ========================== ======
> + Address Class Name Value
> + ========================== ======
> + ``DW_ADDR_none`` 0x0000
> + ``DW_ADDR_LLVM_global`` 0x0001
> + ``DW_ADDR_LLVM_constant`` 0x0002
> + ``DW_ADDR_LLVM_group`` 0x0003
> + ``DW_ADDR_LLVM_private`` 0x0004
> + ``DW_ADDR_LLVM_lo_user`` 0x8000
> + ``DW_ADDR_LLVM_hi_user`` 0xffff
> + ========================== ======
> +
> +Line Number Information
> +~~~~~~~~~~~~~~~~~~~~~~~
> +
> +.. note::
> +
> + This augments DWARF Version 5 section 7.22 and Table 7.27.
> +
> +The following table gives the encoding of the additional line number header
> +entry formats.
> +
> +.. table:: Line number header entry format encodings
> + :name: amdgpu-dwarf-line-number-header-entry-format-encodings-table
> +
> + ==================================== ====================
> + Line number header entry format name Value
> + ==================================== ====================
> + ``DW_LNCT_LLVM_source`` 0x2001
> + ``DW_LNCT_LLVM_is_MD5`` 0x2002
> + ==================================== ====================
> +
> +Call Frame Information
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +.. note::
> +
> + This augments DWARF Version 5 section 7.24 and Table 7.29.
> +
> +The following table gives the encoding of the additional call frame information
> +instructions.
> +
> +.. table:: Call frame instruction encodings
> + :name: amdgpu-dwarf-call-frame-instruction-encodings-table
> +
> + ======================== ====== ====== ================ ================ ================
> + Instruction High 2 Low 6 Operand 1 Operand 2 Operand 3
> + Bits Bits
> + ======================== ====== ====== ================ ================ ================
> + DW_CFA_def_aspace_cfa 0 0x2f ULEB128 register ULEB128 offset ULEB128 address space
> + DW_CFA_def_aspace_cfa_sf 0 0x30 ULEB128 register SLEB128 offset ULEB128 address space
> + ======================== ====== ====== ================ ================ ================
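
As a worked example of the encoding table, the two new instructions could be emitted as follows. This is a sketch assuming the opcode values shown above; the function names are hypothetical. The opcode byte is formed from the high 2 bits and low 6 bits columns, and the operands follow in table order.

```python
def uleb128(value):
    """Encode a non-negative integer as ULEB128."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7f
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def sleb128(value):
    """Encode a signed integer as SLEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7f
        value >>= 7  # Python's >> is an arithmetic shift for negative ints
        sign_bit = byte & 0x40
        done = (value == 0 and not sign_bit) or (value == -1 and sign_bit)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


def opcode(high2, low6):
    return (high2 << 6) | low6


def encode_def_aspace_cfa(reg, offset, aspace):
    return (bytes([opcode(0, 0x2f)]) + uleb128(reg) + uleb128(offset)
            + uleb128(aspace))


def encode_def_aspace_cfa_sf(reg, factored_offset, aspace):
    return (bytes([opcode(0, 0x30)]) + uleb128(reg)
            + sleb128(factored_offset) + uleb128(aspace))
```

The only difference between the two encoders is the signed, factored second operand of the `_sf` form.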
> +
> +Attributes by Tag Value (Informative)
> +-------------------------------------
> +
> +.. note::
> +
> + This augments DWARF Version 5 Appendix A and Table A.1.
> +
> +The following table provides the additional attributes that are applicable to
> +debugger information entries.
> +
> +.. table:: Attributes by tag value
> + :name: amdgpu-dwarf-attributes-by-tag-value-table
> +
> + ============================= =============================
> + Tag Name Applicable Attributes
> + ============================= =============================
> + ``DW_TAG_base_type`` * ``DW_AT_LLVM_vector_size``
> + ``DW_TAG_compile_unit`` * ``DW_AT_LLVM_augmentation``
> + ``DW_TAG_entry_point`` * ``DW_AT_LLVM_active_lane``
> + * ``DW_AT_LLVM_lane_pc``
> + * ``DW_AT_LLVM_lanes``
> + ``DW_TAG_inlined_subroutine`` * ``DW_AT_LLVM_active_lane``
> + * ``DW_AT_LLVM_lane_pc``
> + * ``DW_AT_LLVM_lanes``
> + ``DW_TAG_subprogram`` * ``DW_AT_LLVM_active_lane``
> + * ``DW_AT_LLVM_lane_pc``
> + * ``DW_AT_LLVM_lanes``
> + ============================= =============================
> +
> +References
> +----------
> +
> + .. _amdgpu-dwarf-AMD:
> +
> +1. [AMD] `Advanced Micro Devices <https://www.amd.com/>`__
> +
> + .. _amdgpu-dwarf-AMD-ROCm:
> +
> +2. [AMD-ROCm] `AMD ROCm Platform <https://rocm-documentation.readthedocs.io>`__
> +
> + .. _amdgpu-dwarf-AMD-ROCgdb:
> +
> +3. [AMD-ROCgdb] `AMD ROCm Debugger (ROCgdb) <https://github.com/ROCm-Developer-Tools/ROCgdb>`__
> +
> + .. _amdgpu-dwarf-AMDGPU-LLVM:
> +
> +4. [AMDGPU-LLVM] `User Guide for AMDGPU LLVM Backend <https://llvm.org/docs/AMDGPUUsage.html>`__
> +
> + .. _amdgpu-dwarf-CUDA:
> +
> +5. [CUDA] `Nvidia CUDA Language <https://docs.nvidia.com/cuda/cuda-c-programming-guide/>`__
> +
> + .. _amdgpu-dwarf-DWARF:
> +
> +6. [DWARF] `DWARF Debugging Information Format <http://dwarfstd.org/>`__
> +
> + .. _amdgpu-dwarf-ELF:
> +
> +7. [ELF] `Executable and Linkable Format (ELF) <http://www.sco.com/developers/gabi/>`__
> +
> + .. _amdgpu-dwarf-GCC:
> +
> +8. [GCC] `GCC: The GNU Compiler Collection <https://www.gnu.org/software/gcc/>`__
> +
> + .. _amdgpu-dwarf-GDB:
> +
> +9. [GDB] `GDB: The GNU Project Debugger <https://www.gnu.org/software/gdb/>`__
> +
> + .. _amdgpu-dwarf-HIP:
> +
> +10. [HIP] `HIP Programming Guide <https://rocm-documentation.readthedocs.io/en/latest/Programming_Guides/Programming-Guides.html#hip-programing-guide>`__
> +
> + .. _amdgpu-dwarf-HSA:
> +
> +11. [HSA] `Heterogeneous System Architecture (HSA) Foundation <http://www.hsafoundation.com/>`__
> +
> + .. _amdgpu-dwarf-LLVM:
> +
> +12. [LLVM] `The LLVM Compiler Infrastructure <https://llvm.org/>`__
> +
> + .. _amdgpu-dwarf-OpenCL:
> +
> +13. [OpenCL] `The OpenCL Specification Version 2.0 <http://www.khronos.org/registry/cl/specs/opencl-2.0.pdf>`__
> +
> + .. _amdgpu-dwarf-Perforce-TotalView:
> +
> +14. [Perforce-TotalView] `Perforce TotalView HPC Debugging Software <https://totalview.io/products/totalview>`__
> +
> + .. _amdgpu-dwarf-SEMVER:
> +
> +15. [SEMVER] `Semantic Versioning <https://semver.org/>`__
> \ No newline at end of file
>
> diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
> index 7f7bd17bbf10..aafe97d1c595 100644
> --- a/llvm/docs/AMDGPUUsage.rst
> +++ b/llvm/docs/AMDGPUUsage.rst
> @@ -5,6 +5,24 @@ User Guide for AMDGPU Backend
> .. contents::
> :local:
>
> +.. toctree::
> + :hidden:
> +
> + AMDGPU/AMDGPUAsmGFX7
> + AMDGPU/AMDGPUAsmGFX8
> + AMDGPU/AMDGPUAsmGFX9
> + AMDGPU/AMDGPUAsmGFX900
> + AMDGPU/AMDGPUAsmGFX904
> + AMDGPU/AMDGPUAsmGFX906
> + AMDGPU/AMDGPUAsmGFX908
> + AMDGPU/AMDGPUAsmGFX10
> + AMDGPU/AMDGPUAsmGFX1011
> + AMDGPUModifierSyntax
> + AMDGPUOperandSyntax
> + AMDGPUInstructionSyntax
> + AMDGPUInstructionNotation
> + AMDGPUDwarfProposalForHeterogeneousDebugging
> +
> Introduction
> ============
>
> @@ -824,3959 +842,258 @@ if needed.
>
> ``.debug``\ *\**
> The standard DWARF sections. See :ref:`amdgpu-dwarf-debug-information` for
> - information on the DWARF produced by the AMDGPU backend.
> -
> -``.dynamic``, ``.dynstr``, ``.dynsym``, ``.hash``
> - The standard sections used by a dynamic loader.
> -
> -``.note``
> - See :ref:`amdgpu-note-records` for the note records supported by the AMDGPU
> - backend.
> -
> -``.rela``\ *name*, ``.rela.dyn``
> - For relocatable code objects, *name* is the name of the section to which the
> - relocation records apply. For example, ``.rela.text`` is the section name for
> - relocation records associated with the ``.text`` section.
> -
> - For linked shared code objects, ``.rela.dyn`` contains all the relocation
> - records from each of the relocatable code object's ``.rela``\ *name* sections.
> -
> - See :ref:`amdgpu-relocation-records` for the relocation records supported by
> - the AMDGPU backend.
> -
> -``.text``
> - The executable machine code for the kernels and functions they call. Generated
> - as position independent code. See :ref:`amdgpu-code-conventions` for
> - information on conventions used in the isa generation.
> -
> -.. _amdgpu-note-records:
> -
> -Note Records
> -------------
> -
> -The AMDGPU backend code object contains ELF note records in the ``.note``
> -section. The set of generated notes and their semantics depend on the code
> -object version; see :ref:`amdgpu-note-records-v2` and
> -:ref:`amdgpu-note-records-v3`.
> -
> -As required by ``ELFCLASS32`` and ``ELFCLASS64``, minimal zero-byte padding
> -must be generated after the ``name`` field to ensure the ``desc`` field is 4
> -byte aligned. In addition, minimal zero-byte padding must be generated to
> -ensure the ``desc`` field size is a multiple of 4 bytes. The ``sh_addralign``
> -field of the ``.note`` section must be at least 4 to indicate at least 8 byte
> -alignment.
> -
> -.. _amdgpu-note-records-v2:
> -
> -Code Object V2 Note Records (-mattr=-code-object-v3)
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> -
> -.. warning:: Code Object V2 is not the default code object version emitted by
> - this version of LLVM. For a description of the notes generated with the
> - default configuration (Code Object V3) see :ref:`amdgpu-note-records-v3`.
> -
> -The AMDGPU backend code object uses the following ELF note record in the
> -``.note`` section when compiling for Code Object V2 (-mattr=-code-object-v3).
> -
> -Additional note records may be present, but any which are not documented here
> -are deprecated and should not be used.
> -
> - .. table:: AMDGPU Code Object V2 ELF Note Records
> - :name: amdgpu-elf-note-records-table-v2
> -
> - ===== ============================== ======================================
> - Name Type Description
> - ===== ============================== ======================================
> - "AMD" ``NT_AMD_AMDGPU_HSA_METADATA`` <metadata null terminated string>
> - ===== ============================== ======================================
> -
> -..
> -
> - .. table:: AMDGPU Code Object V2 ELF Note Record Enumeration Values
> - :name: amdgpu-elf-note-record-enumeration-values-table-v2
> -
> - ============================== =====
> - Name Value
> - ============================== =====
> - *reserved* 0-9
> - ``NT_AMD_AMDGPU_HSA_METADATA`` 10
> - *reserved* 11
> - ============================== =====
> -
> -``NT_AMD_AMDGPU_HSA_METADATA``
> - Specifies extensible metadata associated with the code objects executed on HSA
> - [HSA]_ compatible runtimes such as AMD's ROCm [AMD-ROCm]_. It is required when
> - the target triple OS is ``amdhsa`` (see :ref:`amdgpu-target-triples`). See
> - :ref:`amdgpu-amdhsa-code-object-metadata-v2` for the syntax of the code
> - object metadata string.
> -
> -.. _amdgpu-note-records-v3:
> -
> -Code Object V3 Note Records (-mattr=+code-object-v3)
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> -
> -The AMDGPU backend code object uses the following ELF note record in the
> -``.note`` section when compiling for Code Object V3 (-mattr=+code-object-v3).
> -
> -Additional note records may be present, but any which are not documented here
> -are deprecated and should not be used.
> -
> - .. table:: AMDGPU Code Object V3 ELF Note Records
> - :name: amdgpu-elf-note-records-table-v3
> -
> - ======== ============================== ======================================
> - Name Type Description
> - ======== ============================== ======================================
> - "AMDGPU" ``NT_AMDGPU_METADATA`` Metadata in Message Pack [MsgPack]_
> - binary format.
> - ======== ============================== ======================================
> -
> -..
> -
> - .. table:: AMDGPU Code Object V3 ELF Note Record Enumeration Values
> - :name: amdgpu-elf-note-record-enumeration-values-table-v3
> -
> - ============================== =====
> - Name Value
> - ============================== =====
> - *reserved* 0-31
> - ``NT_AMDGPU_METADATA`` 32
> - ============================== =====
> -
> -``NT_AMDGPU_METADATA``
> - Specifies extensible metadata associated with an AMDGPU code
> - object. It is encoded as a map in the Message Pack [MsgPack]_ binary
> - data format. See :ref:`amdgpu-amdhsa-code-object-metadata-v3` for the
> - map keys defined for the ``amdhsa`` OS.
> -
> -.. _amdgpu-symbols:
> -
> -Symbols
> --------
> -
> -Symbols include the following:
> -
> - .. table:: AMDGPU ELF Symbols
> - :name: amdgpu-elf-symbols-table
> -
> - ===================== ================== ================ ==================
> - Name Type Section Description
> - ===================== ================== ================ ==================
> - *link-name* ``STT_OBJECT`` - ``.data`` Global variable
> - - ``.rodata``
> - - ``.bss``
> - *link-name*\ ``.kd`` ``STT_OBJECT`` - ``.rodata`` Kernel descriptor
> - *link-name* ``STT_FUNC`` - ``.text`` Kernel entry point
> - *link-name* ``STT_OBJECT`` - SHN_AMDGPU_LDS Global variable in LDS
> - ===================== ================== ================ ==================
> -
> -Global variable
> - Global variables both used and defined by the compilation unit.
> -
> - If the symbol is defined in the compilation unit then it is allocated in the
> - appropriate section according to whether it has initialized data or is
> - read-only.
> -
> - If the symbol is external then its section is ``SHN_UNDEF`` and the loader
> - will resolve relocations using the definition provided by another code object
> - or explicitly defined by the runtime.
> -
> - If the symbol resides in local/group memory (LDS) then its section is the
> - special processor specific section name ``SHN_AMDGPU_LDS``, and the
> - ``st_value`` field describes alignment requirements as it does for common
> - symbols.
> -
> - .. TODO::
> -
> - Add description of linked shared object symbols. Seems undefined symbols
> - are marked as STT_NOTYPE.
> -
> -Kernel descriptor
> - Every HSA kernel has an associated kernel descriptor. It is the address of the
> - kernel descriptor that is used in the AQL dispatch packet used to invoke the
> - kernel, not the kernel entry point. The layout of the HSA kernel descriptor is
> - defined in :ref:`amdgpu-amdhsa-kernel-descriptor`.
> -
> -Kernel entry point
> - Every HSA kernel also has a symbol for its machine code entry point.
> -
> -.. _amdgpu-relocation-records:
> -
> -Relocation Records
> -------------------
> -
> -The AMDGPU backend generates ``Elf64_Rela`` relocation records. Supported
> -relocatable fields are:
> -
> -``word32``
> - This specifies a 32-bit field occupying 4 bytes with arbitrary byte
> - alignment. These values use the same byte order as other word values in the
> - AMDGPU architecture.
> -
> -``word64``
> - This specifies a 64-bit field occupying 8 bytes with arbitrary byte
> - alignment. These values use the same byte order as other word values in the
> - AMDGPU architecture.
> -
> -The following notations are used for specifying relocation calculations:
> -
> -**A**
> - Represents the addend used to compute the value of the relocatable field.
> -
> -**G**
> - Represents the offset into the global offset table at which the relocation
> - entry's symbol will reside during execution.
> -
> -**GOT**
> - Represents the address of the global offset table.
> -
> -**P**
> - Represents the place (section offset for ``et_rel`` or address for ``et_dyn``)
> - of the storage unit being relocated (computed using ``r_offset``).
> -
> -**S**
> - Represents the value of the symbol whose index resides in the relocation
> - entry. Relocations not using this must specify a symbol index of
> - ``STN_UNDEF``.
> -
> -**B**
> - Represents the base address of a loaded executable or shared object which is
> - the difference between the ELF address and the actual load address.
> - Relocations using this are only valid in executable or shared objects.
> -
> -The following relocation types are supported:
> -
> - .. table:: AMDGPU ELF Relocation Records
> -    :name: amdgpu-elf-relocation-records-table
> -
> -    ========================== ======= ===== ========== ==============================
> -    Relocation Type            Kind    Value Field      Calculation
> -    ========================== ======= ===== ========== ==============================
> -    ``R_AMDGPU_NONE``                  0     *none*     *none*
> -    ``R_AMDGPU_ABS32_LO``      Static, 1     ``word32`` (S + A) & 0xFFFFFFFF
> -                               Dynamic
> -    ``R_AMDGPU_ABS32_HI``      Static, 2     ``word32`` (S + A) >> 32
> -                               Dynamic
> -    ``R_AMDGPU_ABS64``         Static, 3     ``word64`` S + A
> -                               Dynamic
> -    ``R_AMDGPU_REL32``         Static  4     ``word32`` S + A - P
> -    ``R_AMDGPU_REL64``         Static  5     ``word64`` S + A - P
> -    ``R_AMDGPU_ABS32``         Static, 6     ``word32`` S + A
> -                               Dynamic
> -    ``R_AMDGPU_GOTPCREL``      Static  7     ``word32`` G + GOT + A - P
> -    ``R_AMDGPU_GOTPCREL32_LO`` Static  8     ``word32`` (G + GOT + A - P) & 0xFFFFFFFF
> -    ``R_AMDGPU_GOTPCREL32_HI`` Static  9     ``word32`` (G + GOT + A - P) >> 32
> -    ``R_AMDGPU_REL32_LO``      Static  10    ``word32`` (S + A - P) & 0xFFFFFFFF
> -    ``R_AMDGPU_REL32_HI``      Static  11    ``word32`` (S + A - P) >> 32
> -    *reserved*                         12
> -    ``R_AMDGPU_RELATIVE64``    Dynamic 13    ``word64`` B + A
> -    ========================== ======= ===== ========== ==============================
> -
> -``R_AMDGPU_ABS32_LO`` and ``R_AMDGPU_ABS32_HI`` are only supported by
> -the ``mesa3d`` OS, which does not support ``R_AMDGPU_ABS64``.
> -
> -There is no current OS loader support for 32-bit programs and so
> -``R_AMDGPU_ABS32`` is not used.
> -
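For readers skimming the removed relocation table, the calculations can be
sketched in Python as below. This is only an illustration of the arithmetic
in the table: the function name ``relocate``, the 64-bit wraparound of
intermediate results, and the truncation of ``word32`` results to 32 bits
are assumptions of this sketch, not statements from the documentation.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def relocate(rtype, S=0, A=0, P=0, G=0, GOT=0, B=0):
    """Return the field value for one relocation record (hypothetical helper).

    S, A, P, G, GOT and B follow the notation defined above the table.
    """
    # Assumption: intermediate results wrap modulo 2**64.
    abs_value = (S + A) & MASK64
    rel_value = (S + A - P) & MASK64
    got_value = (G + GOT + A - P) & MASK64
    # Assumption: word32 fields keep only the low 32 bits of the result.
    results = {
        "R_AMDGPU_ABS32_LO": abs_value & MASK32,
        "R_AMDGPU_ABS32_HI": abs_value >> 32,
        "R_AMDGPU_ABS64": abs_value,
        "R_AMDGPU_REL32": rel_value & MASK32,
        "R_AMDGPU_REL64": rel_value,
        "R_AMDGPU_ABS32": abs_value & MASK32,
        "R_AMDGPU_GOTPCREL": got_value & MASK32,
        "R_AMDGPU_GOTPCREL32_LO": got_value & MASK32,
        "R_AMDGPU_GOTPCREL32_HI": got_value >> 32,
        "R_AMDGPU_REL32_LO": rel_value & MASK32,
        "R_AMDGPU_REL32_HI": rel_value >> 32,
        "R_AMDGPU_RELATIVE64": (B + A) & MASK64,
    }
    return results[rtype]


# A 64-bit absolute address split into its two 32-bit halves.
lo = relocate("R_AMDGPU_ABS32_LO", S=0x123456789, A=0x10)
hi = relocate("R_AMDGPU_ABS32_HI", S=0x123456789, A=0x10)
print(hex(lo), hex(hi))  # 0x23456799 0x1
```

The ``_LO``/``_HI`` pairs exist so that a 64-bit result can be patched into
two separate 32-bit fields.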
> -.. _amdgpu-dwarf-6-proposal-for-heterogeneous-debugging:
> -
> -DWARF Version 6 Proposal For Heterogeneous Debugging
> -====================================================
> -
> -.. warning::
> -
> - This section describes a **provisional proposal** for DWARF Version 6
> - [DWARF]_ to support heterogeneous debugging. It is not currently fully
> - implemented and is subject to change.
> -
> -.. note::
> -
> - This section proposes a set of backwards compatible extensions to DWARF
> - Version 5 [DWARF]_ for consideration of inclusion into a future DWARF Version
> - 6 standard to support heterogeneous debugging.
> -
> - The remainder of this note provides motivation for each proposed feature in
> - terms of heterogeneous debugging on commercially available AMD GPU hardware
> - (AMDGPU). However, the proposal is intended to be vendor and architecture
> - neutral. It is believed to apply to other heterogeneous hardware devices
> - including GPUs, DSPs, FPGAs, and other specialized hardware. These
> - collectively include similar characteristics and requirements as AMDGPUs.
> - Parts of the proposal can also apply to traditional CPU hardware that supports
> - large vector registers. Compilers can map source languages and extensions that
> - describe large scale parallel execution onto the lanes of the vector
> - registers. This is common in programming languages used in ML and HPC. The
> - proposal also includes improved support for optimized code on any
> - architecture. Some of the generalizations may also benefit other issues that
> - have been raised.
> -
> - The proposal has evolved through collaboration with many individuals and active
> - prototyping within the gdb debugger and LLVM compiler. Input has also been
> - very much appreciated from the developers working on the Totalview debugger
> - and gcc compiler.
> -
> - The AMDGPU has several features that require additional DWARF functionality in
> - order to support optimized code.
> -
> - AMDGPU optimized code may spill vector registers to non-global address space
> - memory, and this spilling may be done only for lanes that are active on entry
> - to the subprogram. To support this, a location description that can be created
> - as a masked select is required. See ``DW_OP_LLVM_select_bit_piece``.
> -
> - Since the active lane mask may be held in a register, a way to get the value
> - of a register on entry to a subprogram is required. To support this an
> - operation that returns the caller value of a register as specified by the Call
> - Frame Information (CFI) is required. See ``DW_OP_LLVM_call_frame_entry_reg``
> - and :ref:`amdgpu-dwarf-call-frame-information`.
> -
> - Current DWARF uses an empty expression to indicate an undefined location
> - description. Since the masked select composite location description operation
> - takes more than one location description, it is necessary to have an explicit
> - way to specify an undefined location description. Otherwise it is not possible
> - to specify that a particular one of the input location descriptions is
> - undefined. See ``DW_OP_LLVM_undefined``.
> -
> - CFI describes restoring callee saved registers that are spilled. Currently CFI
> - only allows a location description that is a register, memory address, or
> - implicit location description. AMDGPU optimized code may spill scalar
> - registers into portions of vector registers. This requires extending CFI to
> - allow any location description. See
> - :ref:`amdgpu-dwarf-call-frame-information`.
> -
> - The vector registers of the AMDGPU are represented as their full wavefront
> - size, meaning the wavefront size times the dword size. This reflects the
> - actual hardware and allows the compiler to generate DWARF for languages that
> - map a thread to the complete wavefront. It also allows more efficient DWARF to
> - be generated to describe the CFI as only a single expression is required for
> - the whole vector register, rather than a separate expression for each lane's
> - dword of the vector register. It also allows the compiler to produce DWARF
> - that indexes the vector register if it spills scalar registers into portions
> - of a vector register.
> -
> - Since DWARF stack value entries have a base type and AMDGPU registers are a
> - vector of dwords, the ability to specify that a base type is a vector is
> - required. See ``DW_AT_LLVM_vector_size``.
> -
> - If the source language is mapped onto the AMDGPU wavefronts in a SIMT manner,
> - then the variable DWARF location expressions must compute the location for a
> - single lane of the wavefront. Therefore, a DWARF operation is required to
> - denote the current lane, much like ``DW_OP_push_object_address`` denotes the
> - current object. The ``DW_OP_*piece`` operations only allow literal indices.
> - Therefore, a way to use a computed offset of an arbitrary location description
> - (such as a vector register) is required. See ``DW_OP_LLVM_push_lane``,
> - ``DW_OP_LLVM_offset``, ``DW_OP_LLVM_offset_constu``, and
> - ``DW_OP_LLVM_bit_offset``.
> -
> - If the source language is mapped onto the AMDGPU wavefronts in a SIMT manner
> - the compiler can use the AMDGPU execution mask register to control which lanes
> - are active. To describe the conceptual location of non-active lanes a DWARF
> - expression is needed that can compute a per lane PC. For efficiency, this is
> - done for the wavefront as a whole. This expression benefits by having a masked
> - select composite location description operation. This requires an attribute
> - for source location of each lane. The AMDGPU may update the execution mask for
> - whole wavefront operations and so needs an attribute that computes the current
> - active lane mask. See ``DW_OP_LLVM_select_bit_piece``, ``DW_OP_LLVM_extend``,
> - ``DW_AT_LLVM_lane_pc``, and ``DW_AT_LLVM_active_lane``.
> -
> - AMDGPU needs to be able to describe addresses that are in different kinds of
> - memory. Optimized code may need to describe a variable that resides in pieces
> - that are in different kinds of storage which may include parts of registers,
> - memory that is in a mixture of memory kinds, implicit values, or be undefined.
> - DWARF has the concept of segment addresses. However, the segment cannot be
> - specified within a DWARF expression, which is only able to specify the offset
> - portion of a segment address. The segment index is only provided by the entity
> - that specifies the DWARF expression. Therefore, the segment index is a
> - property that can only be put on complete objects, such as a variable. That
> - makes it only suitable for describing an entity (such as variable or
> - subprogram code) that is in a single kind of memory. Therefore, AMDGPU uses
> - the DWARF concept of address spaces. For example, a variable may be allocated
> - in a register that is partially spilled to the call stack which is in the
> - private address space, and partially spilled to the local address space.
> -
> - DWARF uses the concept of an address in many expression operations but does not
> - define how it relates to address spaces. For example,
> - ``DW_OP_push_object_address`` pushes the address of an object. Other contexts
> - implicitly push an address on the stack before evaluating an expression. For
> - example, the ``DW_AT_use_location`` attribute of the
> - ``DW_TAG_ptr_to_member_type``. The expression that uses the address needs to
> - do so in a general way and not need to be dependent on the address space of
> - the address. For example, a pointer to member value may want to be applied to
> - an object that may reside in any address space.
> -
> - The number of registers and the cost of memory operations is much higher for
> - AMDGPU than a typical CPU. The compiler attempts to optimize whole variables
> - and arrays into registers. Currently DWARF only allows
> - ``DW_OP_push_object_address`` and related operations to work with a global
> - memory location. To support AMDGPU optimized code it is required to generalize
> - DWARF to allow any location description to be used. This allows registers, or
> - composite location descriptions that may be a mixture of memory, registers, or
> - even implicit values.
> -
> - DWARF Version 5 does not allow location descriptions to be entries on the
> - DWARF stack. They can only be the final result of the evaluation of a DWARF
> - expression. However, by allowing a location description to be a first-class
> - entry on the DWARF stack it becomes possible to compose expressions containing
> - both values and location descriptions naturally. It allows objects to be
> - located in any kind of memory address space, in registers, be implicit values,
> - be undefined, or a composite of any of these. By extending DWARF carefully,
> - all existing DWARF expressions can retain their current semantic meaning.
> - DWARF has implicit conversions that convert from a value that represents an
> - address in the default address space to a memory location description. This
> - can be extended to allow a default address space memory location description
> - to be implicitly converted back to its address value. This allows all DWARF
> - Version 5 expressions to retain their same meaning, while adding the ability
> - to explicitly create memory location descriptions in non-default address
> - spaces and generalizing the power of composite location descriptions to any
> - kind of location description. See :ref:`amdgpu-dwarf-operation-expressions`.
> -
> - To allow composition of composite location descriptions, an explicit operation
> - that indicates the end of the definition of a composite location description
> - is required. This can be implied if the end of a DWARF expression is reached,
> - allowing current DWARF expressions to remain legal. See
> - ``DW_OP_LLVM_piece_end``.
> -
> - The ``DW_OP_plus`` and ``DW_OP_minus`` can be defined to operate on a memory
> - location description in the default target architecture specific address space
> - and a generic type value to produce an updated memory location description.
> - This allows them to continue to be used to offset an address. To generalize
> - offsetting to any location description, including location descriptions that
> - describe when bytes are in registers, are implicit, or a composite of these,
> - the ``DW_OP_LLVM_offset``, ``DW_OP_LLVM_offset_constu`` and
> - ``DW_OP_LLVM_bit_offset`` operations are added. These do not perform wrapping
> - which would be hard to define for location descriptions of non-memory kinds.
> - This allows ``DW_OP_push_object_address`` to push a location description that
> - may be in a register, or be an implicit value, and the DWARF expression of
> - ``DW_TAG_ptr_to_member_type`` can contain ``DW_OP_LLVM_offset`` to offset
> - within it. ``DW_OP_LLVM_bit_offset`` generalizes DWARF to work with bit fields
> - which is not possible in DWARF Version 5.
> -
> - The DWARF ``DW_OP_xderef*`` operations allow a value to be converted into an
> - address of a specified address space which is then read. But it provides no
> - way to create a memory location description for an address in the non-default
> - address space. For example, AMDGPU variables can be allocated in the local
> - address space at a fixed address. It is required to have an operation to
> - create an address in a specific address space that can be used to define the
> - location description of the variable. Defining this operation to produce a
> - location description allows the size of addresses in an address space to be
> - larger than the generic type. See ``DW_OP_LLVM_form_aspace_address``.
> -
> - If the ``DW_OP_LLVM_form_aspace_address`` operation had to produce a value
> - that can be implicitly converted to a memory location description, then it
> - would be limited to the size of the generic type which matches the size of the
> - default address space. Its value would be unspecified and likely not match any
> - value in the actual program. By making the result a location description, it
> - allows a consumer great freedom in how it implements it. The implicit
> - conversion back to a value can be limited only to the default address space to
> - maintain compatibility with DWARF Version 5. For other address spaces the
> - producer can use the new operations that explicitly specify the address space.
> -
> - ``DW_OP_breg*`` treats the register as containing an address in the default
> - address space. It is required to be able to specify the address space of the
> - register value. See ``DW_OP_LLVM_aspace_bregx``.
> -
> - Similarly, ``DW_OP_implicit_pointer`` treats its implicit pointer value as
> - being in the default address space. It is required to be able to specify the
> - address space of the pointer value. See
> - ``DW_OP_LLVM_aspace_implicit_pointer``.
> -
> - Almost all uses of addresses in DWARF are limited to defining location
> - descriptions, or to be dereferenced to read memory. The exception is
> - ``DW_CFA_val_offset`` which uses the address to set the value of a register.
> - By defining the CFA DWARF expression as being a memory location description,
> - it can maintain what address space it is, and that can be used to convert the
> - offset address back to an address in that address space. See
> - :ref:`amdgpu-dwarf-call-frame-information`.
> -
> - This approach allows all existing DWARF to have the identical semantics. It
> - allows the compiler to explicitly specify the address space it is using. For
> - example, a compiler could choose to access private memory in a swizzled manner
> - when mapping a source language to a wavefront in a SIMT manner, or to access
> - it in an unswizzled manner if mapping the same language with the wavefront
> - being the thread. It also allows the compiler to mix the address space it uses
> - to access private memory. For example, for SIMT it can still spill entire
> - vector registers in an unswizzled manner, while using a swizzled private
> - memory for SIMT variable access. This approach allows memory location
> - descriptions for different address spaces to be combined using the regular
> - ``DW_OP_*piece`` operations.
> -
> - Location descriptions are an abstraction of storage, they give freedom to the
> - consumer on how to implement them. They allow the address space to encode lane
> - information so they can be used to read memory with only the memory
> - description and no extra arguments. The same set of operations can operate on
> - locations independent of their kind of storage. The ``DW_OP_deref*`` therefore
> - can be used on any storage kind. ``DW_OP_xderef*`` is unnecessary except to
> - become a more compact way to convert a non-default address space address
> - followed by dereferencing it.
> -
> - In DWARF Version 5 a location description is defined as a single location
> - description or a location list. A location list is defined as either
> - effectively an undefined location description or as one or more single
> - location descriptions to describe an object with multiple places. The
> - ``DW_OP_push_object_address`` and ``DW_OP_call*`` operations can put a
> - location description on the stack. Furthermore, debugger information entry
> - attributes such as ``DW_AT_data_member_location``, ``DW_AT_use_location``, and
> - ``DW_AT_vtable_elem_location`` are defined as pushing a location description
> - on the expression stack before evaluating the expression. However, DWARF
> - Version 5 only allows the stack to contain values and so only a single memory
> - address can be on the stack which makes these incapable of handling location
> - descriptions with multiple places, or places other than memory. Since this
> - proposal allows the stack to contain location descriptions, the operations are
> - generalized to support location descriptions that can have multiple places.
> - This is backwards compatible with DWARF Version 5 and allows objects with
> - multiple places to be supported. For example, the expression that describes
> - how to access the field of an object can be evaluated with a location
> - description that has multiple places and will result in a location description
> - with multiple places as expected. With this change, the separate DWARF Version
> - 5 sections that described DWARF expressions and location lists have been
> - unified into a single section that describes DWARF expressions in general.
> - This unification seems to be a natural consequence and a necessity of allowing
> - location descriptions to be part of the evaluation stack.
> -
> - For those familiar with the definition of location descriptions in DWARF
> - Version 5, the definition in this proposal is presented differently, but does
> - in fact define the same concept with the same fundamental semantics. However,
> - it does so in a way that allows the concept to extend to support address
> - spaces, bit addressing, the ability for composite location descriptions to be
> - composed of any kind of location description, and the ability to support
> - objects located at multiple places. Collectively these changes expand the set
> - of processors that can be supported and improves support for optimized code.
> -
> - Several approaches were considered, and the one proposed appears to be the
> - cleanest and offers the greatest improvement of DWARF's ability to support
> - optimized code. Examining the gdb debugger and LLVM compiler, it appears only
> - to require modest changes as they both already have to support general use of
> - location descriptions. It is anticipated that will also be the case for other
> - debuggers and compilers.
> -
> - As an experiment, gdb was modified to evaluate DWARF Version 5 expressions
> - with location descriptions as stack entries and implicit conversions. All gdb
> - tests have passed, except one that turned out to be an invalid test by DWARF
> - Version 5 rules. The code in gdb actually became simpler as all evaluation was
> - on the stack and there was no longer a need to maintain a separate structure
> - for the location description result. This gives confidence of the backwards
> - compatibility.
> -
> - Since the AMDGPU supports languages such as OpenCL, there is a need to define
> - source language address classes so they can be used in a consistent way by
> - consumers. It would also be desirable to add support for using them in
> - defining language types rather than the current target architecture specific
> - address spaces. See :ref:`amdgpu-dwarf-segment_addresses`.
> -
> - A ``DW_AT_LLVM_augmentation`` attribute is added to a compilation unit
> - debugger information entry to indicate that there is additional target
> - architecture specific information in the debugging information entries of that
> - compilation unit. This allows a consumer to know what extensions are present
> - in the debugger information entries as is possible with the augmentation
> - string of other sections. The format that should be used for the augmentation
> - string in the lookup by name table and CFI Common Information Entry is also
> - recommended to allow a consumer to parse the string when it contains
> - information from multiple vendors.
> -
> - The AMDGPU supports programming languages that include online compilation
> - where the source text may be created at runtime. Therefore, a way to embed the
> - source text in the debug information is required. For example, the OpenCL
> - language runtime supports online compilation. See
> - :ref:`amdgpu-dwarf-line-number-information`.
> -
> - Support to allow MD5 checksums to be optionally present in the line table is
> - added. This allows linking together compilation units where some have MD5
> - checksums and some do not. In DWARF Version 5 the file timestamp and file size
> - can be optional, but if the MD5 checksum is present it must be valid for all
> - files. See :ref:`amdgpu-dwarf-line-number-information`.
> -
> - Support is added for the HIP programming language which is supported by the
> - AMDGPU. See :ref:`amdgpu-dwarf-language-names`.
> -
> - The following sections provide the definitions for the additional operations,
> - as well as clarifying how existing expression operations, CFI operations, and
> - attributes behave with respect to generalized location descriptions that
> - support address spaces and location descriptions that support multiple places.
> - It has been defined such that it is backwards compatible with DWARF Version 5.
> - The definitions are intended to fully define well-formed DWARF in a consistent
> - style based on the DWARF Version 5 specification. Non-normative text is shown
> - in *italics*.
> -
> - The names for the new operations, attributes, and constants include "\
> - ``LLVM``\ " and are encoded with vendor specific codes so this proposal can be
> - implemented as an LLVM vendor extension to DWARF Version 5. If accepted these
> - names would not include the "\ ``LLVM``\ " and would not use encodings in the
> - vendor range.
> -
> - The proposal is organized to follow the section ordering of DWARF Version 5.
> - It includes notes to indicate the corresponding DWARF Version 5 sections to
> - which they pertain. Other notes describe additional changes that may be worth
> - considering, and to raise questions.
> -
> -General Description
> --------------------
> -
> -Attribute Types
> -~~~~~~~~~~~~~~~
> -
> -.. note::
> -
> - This augments DWARF Version 5 section 2.2 and Table 2.2.
> -
> -The following table provides the additional attributes. See
> -:ref:`amdgpu-dwarf-debugging-information-entry-attributes`.
> -
> -.. table:: Attribute names
> -   :name: amdgpu-dwarf-attribute-names-table
> -
> -   =========================== ====================================
> -   Attribute                   Usage
> -   =========================== ====================================
> -   ``DW_AT_LLVM_active_lane``  SIMD or SIMT active lanes
> -   ``DW_AT_LLVM_augmentation`` Compilation unit augmentation string
> -   ``DW_AT_LLVM_lane_pc``      SIMD or SIMT lane program location
> -   ``DW_AT_LLVM_lanes``        SIMD or SIMT thread lane count
> -   ``DW_AT_LLVM_vector_size``  Base type vector size
> -   =========================== ====================================
> -
> -.. _amdgpu-dwarf-expressions:
> -
> -DWARF Expressions
> -~~~~~~~~~~~~~~~~~
> -
> -.. note::
> -
> - This section, and its nested sections, replaces DWARF Version 5 section 2.5 and
> - section 2.6. The new proposed DWARF expression operations are defined as well
> - as clarifying the extensions to already existing DWARF Version 5 operations. It is
> - based on the text of the existing DWARF Version 5 standard.
> -
> -DWARF expressions describe how to compute a value or specify a location.
> -
> -*The evaluation of a DWARF expression can provide the location of an object, the
> -value of an array bound, the length of a dynamic string, the desired value
> -itself, and so on.*
> -
> -The evaluation of a DWARF expression can either result in a value or a location
> -description:
> -
> -*value*
> -
> - A value has a type and a literal value. It can represent a literal value of
> - any supported base type of the target architecture. The base type specifies
> - the size and encoding of the literal value.
> -
> - .. note::
> -
> - It may be desirable to add an implicit pointer base type encoding. It would
> - be used for the type of the value that is produced when the ``DW_OP_deref*``
> - operation retrieves the full contents of an implicit pointer location
> - storage created by the ``DW_OP_implicit_pointer`` or
> - ``DW_OP_LLVM_aspace_implicit_pointer`` operations. The literal value would
> - record the debugging information entry and byte displacement specified by the
> - associated ``DW_OP_implicit_pointer`` or
> - ``DW_OP_LLVM_aspace_implicit_pointer`` operations.
> -
> - Instead of a base type, a value can have a distinguished generic type, which
> - is an integral type that has the size of an address in the target architecture
> - default address space and unspecified signedness.
> -
> - *The generic type is the same as the unspecified type used for stack
> - operations defined in DWARF Version 4 and before.*
> -
> - An integral type is a base type that has an encoding of ``DW_ATE_signed``,
> - ``DW_ATE_signed_char``, ``DW_ATE_unsigned``, ``DW_ATE_unsigned_char``,
> - ``DW_ATE_boolean``, or any target architecture defined integral encoding in
> - the inclusive range ``DW_ATE_lo_user`` to ``DW_ATE_hi_user``.
> -
> - .. note::
> -
> - It is unclear if ``DW_ATE_address`` is an integral type. Gdb does not seem
> - to consider it as integral.
> -
> -*location description*
> -
> - *Debugging information must provide consumers a way to find the location of
> - program variables, determine the bounds of dynamic arrays and strings, and
> - possibly to find the base address of a subprogram’s stack frame or the return
> - address of a subprogram. Furthermore, to meet the needs of recent computer
> - architectures and optimization techniques, debugging information must be able
> - to describe the location of an object whose location changes over the object’s
> - lifetime, and may reside at multiple locations simultaneously during parts of
> - an object's lifetime.*
> -
> - Information about the location of program objects is provided by location
> - descriptions.
> -
> - Location descriptions can consist of one or more single location descriptions.
> -
> - A single location description specifies the location storage that holds a
> - program object and a position within the location storage where the program
> - object starts. The position within the location storage is expressed as a bit
> - offset relative to the start of the location storage.
> -
> - A location storage is a linear stream of bits that can hold values. Each
> - location storage has a size in bits and can be accessed using a zero-based bit
> - offset. The ordering of bits within a location storage uses the bit numbering
> - and direction conventions that are appropriate to the current language on the
> - target architecture.
> -
> - There are five kinds of location storage:
> -
> - *memory location storage*
> - Corresponds to the target architecture memory address spaces.
> -
> - *register location storage*
> - Corresponds to the target architecture registers.
> -
> - *implicit location storage*
> - Corresponds to fixed values that can only be read.
> -
> - *undefined location storage*
> - Indicates no value is available and therefore cannot be read or written.
> -
> - *composite location storage*
> - Allows a mixture of these where some bits come from one location storage and
> - some from another location storage, or from disjoint parts of the same
> - location storage.
> -
> - .. note::
> -
> - It may be better to add an implicit pointer location storage kind used by
> - the ``DW_OP_implicit_pointer`` and ``DW_OP_LLVM_aspace_implicit_pointer``
> - operations. It would specify the debugger information entry and byte offset
> - provided by the operations.
> -
> - *Location descriptions are a language independent representation of addressing
> - rules. They are created using DWARF operation expressions of arbitrary
> - complexity. They can be the result of evaluating a debugger information entry
> - attribute that specifies an operation expression. In this usage they can
> - describe the location of an object as long as its lifetime is either static or
> - the same as the lexical block (see DWARF Version 5 section 3.5) that owns it,
> - and it does not move during its lifetime. They can be the result of evaluating
> - a debugger information entry attribute that specifies a location list
> - expression. In this usage they can describe the location of an object that has
> - a limited lifetime, changes its location during its lifetime, or has multiple
> - locations over part or all of its lifetime.*
> -
> - If a location description has more than one single location description, the
> - DWARF expression is ill-formed if the object value held in each single
> - location description's position within the associated location storage is not
> - the same value, except for the parts of the value that are uninitialized.
> -
> - *A location description that has more than one single location description can
> - only be created by a location list expression that has overlapping program
> - location ranges, or certain expression operations that act on a location
> - description that has more than one single location description. There are no
> - operation expression operations that can directly create a location
> - description with more than one single location description.*
> -
> - *A location description with more than one single location description can be
> - used to describe objects that reside in more than one piece of storage at the
> - same time. An object may have more than one location as a result of
> - optimization. For example, a value that is only read may be promoted from
> - memory to a register for some region of code, but later code may revert to
> - reading the value from memory as the register may be used for other purposes.
> - For the code region where the value is in a register, any change to the object
> - value must be made in both the register and the memory so both regions of code
> - will read the updated value.*
> -
> - *A consumer of a location description with more than one single location
> - description can read the object's value from any of the single location
> - descriptions (since they all refer to location storage that has the same
> - value), but must write any changed value to all the single location
> - descriptions.*
> -
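The "read from any, write to all" rule quoted above can be sketched in
Python as follows. The class names ``SingleLocation`` and
``LocationDescription`` are hypothetical, a ``bytearray`` stands in for a
location storage, and the sketch assumes byte-aligned bit offsets; none of
this comes from gdb or LLVM.

```python
from dataclasses import dataclass


@dataclass
class SingleLocation:
    storage: bytearray   # stands in for one location storage
    bit_offset: int = 0  # position within the storage, in bits


class LocationDescription:
    """One or more single location descriptions for the same object."""

    def __init__(self, singles):
        self.singles = list(singles)

    def read(self, nbytes):
        # Any single location may be read; they must all hold the same value.
        loc = self.singles[0]
        start = loc.bit_offset // 8  # assumes a byte-aligned offset
        return bytes(loc.storage[start:start + nbytes])

    def write(self, data):
        # A changed value must be written to every single location.
        for loc in self.singles:
            start = loc.bit_offset // 8  # assumes a byte-aligned offset
            loc.storage[start:start + len(data)] = data


# An object promoted to a register while its memory copy stays live.
register = bytearray(b"\x2a\x00\x00\x00")
memory = bytearray(b"\x00\x00\x00\x00\x2a\x00\x00\x00")
obj = LocationDescription(
    [SingleLocation(register), SingleLocation(memory, bit_offset=32)]
)
obj.write(b"\x07\x00\x00\x00")
print(obj.read(4) == bytes(memory[4:8]))  # True
```

Writing through only one of the two locations would leave the other stale,
which is the situation the rule is meant to exclude.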
> -A DWARF expression can either be encoded as an operation expression (see
> -:ref:`amdgpu-dwarf-operation-expressions`), or as a location list expression
> -(see :ref:`amdgpu-dwarf-location-list-expressions`).
> -
> -A DWARF expression is evaluated in the context of:
> -
> -*A current subprogram*
> - This may be used in the evaluation of register access operations to support
> - virtual unwinding of the call stack (see
> - :ref:`amdgpu-dwarf-call-frame-information`).
> -
> -*A current program location*
> - This may be used in the evaluation of location list expressions to select
> - amongst multiple program location ranges. It should be the program location
> - corresponding to the current subprogram. If the current subprogram was reached
> - by virtual call stack unwinding, then the program location will correspond to
> - the associated call site.
> -
> -*An initial stack*
> - This is a list of values or location descriptions that will be pushed on the
> - operation expression evaluation stack in the order provided before evaluation
> - of an operation expression starts.
> -
> - Some debugger information entries have attributes that evaluate their DWARF
> - expression value with initial stack entries. In all other cases the initial
> - stack is empty.
> -
> -When a DWARF expression is evaluated, it may be specified whether a value or
> -location description is required as the result kind.
> -
> -If a result kind is specified, and the result of the evaluation does not match
> -the specified result kind, then the implicit conversions described in
> -:ref:`amdgpu-dwarf-memory-location-description-operations` are performed if
> -valid. Otherwise, the DWARF expression is ill-formed.
> -
> -.. _amdgpu-dwarf-operation-expressions:
> -
> -DWARF Operation Expressions
> -+++++++++++++++++++++++++++
> -
> -An operation expression is comprised of a stream of operations, each consisting
> -of an opcode followed by zero or more operands. The number of operands is
> -implied by the opcode.
> -
> -Operations represent a postfix operation on a simple stack machine. Each stack
> -entry can hold either a value or a location description. Operations can act on
> -entries on the stack, including adding entries and removing entries. If the kind
> -of a stack entry does not match the kind required by the operation and is not
> -implicitly convertible to the required kind (see
> -:ref:`amdgpu-dwarf-memory-location-description-operations`), then the DWARF
> -operation expression is ill-formed.
> -
> -Evaluation of an operation expression starts with an empty stack on which the
> -entries from the initial stack provided by the context are pushed in the order
> -provided. Then the operations are evaluated, starting with the first operation
> -of the stream, until one past the last operation of the stream is reached. The
> -result of the evaluation is:
> -
> -* If evaluation of the DWARF expression requires a location description, then:
> -
> - * If the stack is empty, the result is a location description with one
> - undefined location description.
> -
> - *This rule is for backwards compatibility with DWARF Version 5 which has no
> - explicit operation to create an undefined location description, and uses an
> - empty operation expression for this purpose.*
> -
> - * If the top stack entry is a location description, or can be converted
> - to one, then the result is that, possibly converted, location description.
> - Any other entries on the stack are discarded.
> -
> - * Otherwise the DWARF expression is ill-formed.
> -
> - .. note::
> -
> - Could define this case as returning an implicit location description as
> - if the ``DW_OP_implicit`` operation is performed.
> -
> -* If evaluation of the DWARF expression requires a value, then:
> -
> - * If the top stack entry is a value, or can be converted to one, then the
> - result is that, possibly converted, value. Any other entries on the stack
> - are discarded.
> -
> - * Otherwise the DWARF expression is ill-formed.
> -
> -* If evaluation of the DWARF expression does not specify if a value or location
> - description is required, then:
> -
> - * If the stack is empty, the result is a location description with one
> - undefined location description.
> -
> - *This rule is for backwards compatibility with DWARF Version 5 which has no
> - explicit operation to create an undefined location description, and uses an
> - empty operation expression for this purpose.*
> -
> - .. note::
> -
> - This rule is consistent with the rule above for when a location
> - description is requested. However, gdb appears to report this as an error
> - and no gdb tests appear to cause an empty stack for this case.
> -
> - * Otherwise, the top stack entry is returned. Any other entries on the stack
> - are discarded.
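
The result rules quoted above can be sketched in Python. This is only an illustration, not how any consumer implements it: the `Value`, `LocationDescription`, `IllFormed` and `final_result` names are hypothetical, and the implicit value/location conversions that the text allows are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Value:
    bits: int


@dataclass
class LocationDescription:
    kind: str  # e.g. "memory", "register", "undefined"


class IllFormed(Exception):
    """Raised when the DWARF expression is ill-formed."""


# Hypothetical stand-in for "a location description with one undefined
# location description".
UNDEFINED = LocationDescription("undefined")


def final_result(stack, required=None):
    """Pick the result of an operation expression.

    stack[-1] is the top of the stack. required is None, "location" or
    "value". Implicit conversions are NOT modelled (simplifying assumption).
    """
    if required == "value":
        if stack and isinstance(stack[-1], Value):
            return stack[-1]
        raise IllFormed("a value was required")
    # A location description is required, or no result kind was specified.
    if not stack:
        return UNDEFINED  # DWARF Version 5 compatibility rule
    top = stack[-1]
    if required == "location" and not isinstance(top, LocationDescription):
        raise IllFormed("a location description was required")
    return top  # any other entries on the stack are discarded
```

With no result kind specified, an empty stack yields the undefined location description and a non-empty stack yields its top entry, whatever its kind.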
> -
> -An operation expression is encoded as a byte block with some form of prefix that
> -specifies the byte count. It can be used:
> -
> -* as the value of a debugging information entry attribute that is encoded using
> - class ``exprloc`` (see DWARF Version 5 section 7.5.5),
> -
> -* as the operand to certain operation expression operations,
> -
> -* as the operand to certain call frame information operations (see
> - :ref:`amdgpu-dwarf-call-frame-information`),
> -
> -* and in location list entries (see
> - :ref:`amdgpu-dwarf-location-list-expressions`).
> -
> -.. _amdgpu-dwarf-stack-operations:
> -
> -Stack Operations
> -################
> -
> -The following operations manipulate the DWARF stack. Operations that index the
> -stack assume that the top of the stack (most recently added entry) has index 0.
> -They allow the stack entries to be either a value or location description.
> -
> -If any stack entry accessed by a stack operation is an incomplete composite
> -location description, then the DWARF expression is ill-formed.
> -
> -.. note::
> -
> - These operations now support stack entries that are values and location
> - descriptions.
> -
> -.. note::
> -
> - If it is desired to also make them work with incomplete composite location
> - descriptions, then it would be necessary to define that the composite location storage
> - specified by the incomplete composite location description is also replicated
> - when a copy is pushed. This ensures that each copy of the incomplete composite
> - location description can update the composite location storage they specify
> - independently.
> -
> -1. ``DW_OP_dup``
> -
> - ``DW_OP_dup`` duplicates the stack entry at the top of the stack.
> -
> -2. ``DW_OP_drop``
> -
> - ``DW_OP_drop`` pops the stack entry at the top of the stack and discards it.
> -
> -3. ``DW_OP_pick``
> -
> - ``DW_OP_pick`` has a single unsigned 1-byte operand that represents an index
> - I. A copy of the stack entry with index I is pushed onto the stack.
> -
> -4. ``DW_OP_over``
> -
> - ``DW_OP_over`` pushes a copy of the entry with index 1.
> -
> - *This is equivalent to a* ``DW_OP_pick 1`` *operation.*
> -
> -5. ``DW_OP_swap``
> -
> - ``DW_OP_swap`` swaps the top two stack entries. The entry at the top of the
> - stack becomes the second stack entry, and the second stack entry becomes the
> - top of the stack.
> -
> -6. ``DW_OP_rot``
> -
> - ``DW_OP_rot`` rotates the first three stack entries. The entry at the top of
> - the stack becomes the third stack entry, the second entry becomes the top of
> - the stack, and the third entry becomes the second entry.
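
As a rough illustration of the six stack operations quoted above, here is a sketch that models the DWARF stack as a Python list whose last element is the top (index 0 in the text). The function names are made up for the example, and entries are treated as opaque objects, so the check for incomplete composite location descriptions is not modelled.

```python
def dw_op_dup(stack):
    # Duplicate the entry at the top of the stack.
    stack.append(stack[-1])


def dw_op_drop(stack):
    # Pop the top entry and discard it.
    stack.pop()


def dw_op_pick(stack, index):
    # Push a copy of the entry with the given index (0 is the top).
    stack.append(stack[-1 - index])


def dw_op_over(stack):
    # Equivalent to DW_OP_pick 1.
    dw_op_pick(stack, 1)


def dw_op_swap(stack):
    # Exchange the top two entries.
    stack[-1], stack[-2] = stack[-2], stack[-1]


def dw_op_rot(stack):
    # top -> third, second -> top, third -> second.
    top, second, third = stack[-1], stack[-2], stack[-3]
    stack[-1], stack[-2], stack[-3] = second, third, top
```

For example, applying `dw_op_rot` to a stack whose entries from top to bottom are C, B, A leaves B on top, then A, then C.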
> -
> -.. _amdgpu-dwarf-control-flow-operations:
> -
> -Control Flow Operations
> -#######################
> -
> -The following operations provide simple control of the flow of a DWARF operation
> -expression.
> -
> -1. ``DW_OP_nop``
> -
> - ``DW_OP_nop`` is a placeholder. It has no effect on the DWARF stack
> - entries.
> -
> -2. ``DW_OP_le``, ``DW_OP_ge``, ``DW_OP_eq``, ``DW_OP_lt``, ``DW_OP_gt``,
> - ``DW_OP_ne``
> -
> - .. note::
> -
> - The same as in DWARF Version 5 section 2.5.1.5.
> -
> -3. ``DW_OP_skip``
> -
> - ``DW_OP_skip`` is an unconditional branch. Its single operand is a 2-byte
> - signed integer constant. The 2-byte constant is the number of bytes of the
> - DWARF expression to skip forward or backward from the current operation,
> - beginning after the 2-byte constant.
> -
> - If the updated position is at one past the end of the last operation, then
> - the operation expression evaluation is complete.
> -
> - Otherwise, the DWARF expression is ill-formed if the updated operation
> - position is not in the range of the first to last operation inclusive, or
> - not at the start of an operation.
> -
> -4. ``DW_OP_bra``
> -
> - ``DW_OP_bra`` is a conditional branch. Its single operand is a 2-byte signed
> - integer constant. This operation pops the top of stack. If the value popped
> - is not the constant 0, the 2-byte constant operand is the number of bytes of
> - the DWARF operation expression to skip forward or backward from the current
> - operation, beginning after the 2-byte constant.
> -
> - If the updated position is at one past the end of the last operation, then
> - the operation expression evaluation is complete.
> -
> - Otherwise, the DWARF expression is ill-formed if the updated operation
> - position is not in the range of the first to last operation inclusive, or
> - not at the start of an operation.
> -
> -5. ``DW_OP_call2``, ``DW_OP_call4``, ``DW_OP_call_ref``
> -
> - ``DW_OP_call2``, ``DW_OP_call4``, and ``DW_OP_call_ref`` perform DWARF
> - procedure calls during evaluation of a DWARF expression.
> -
> - ``DW_OP_call2`` and ``DW_OP_call4`` have one operand that is a 2- or 4-byte
> - unsigned offset, respectively, of a debugging information entry D in the
> - current compilation unit.
> -
> - ``DW_OP_call_ref`` has one operand that is a 4-byte unsigned value in
> - the 32-bit DWARF format, or an 8-byte unsigned value in the 64-bit DWARF
> - format, that represents an offset of a debugging information entry D in a
> - ``.debug_info`` section, which may be contained in an executable or shared
> - object file other than that containing the operation. For references from one
> - executable or shared object file to another, the relocation must be
> - performed by the consumer.
> -
> - *Operand interpretation of* ``DW_OP_call2``\ *,* ``DW_OP_call4``\ *, and*
> - ``DW_OP_call_ref`` *is exactly like that for* ``DW_FORM_ref2``\ *,*
> - ``DW_FORM_ref4``\ *, and* ``DW_FORM_ref_addr``\ *, respectively.*
> -
> - The call operation is evaluated by:
> -
> - * If D has a ``DW_AT_location`` attribute that is encoded as an ``exprloc``
> - that specifies an operation expression E, then execution of the current
> - operation expression continues from the first operation of E. Execution
> - continues until one past the last operation of E is reached, at which
> - point execution continues with the operation following the call operation.
> - Since E is evaluated on the same stack as the call, E can use, add, and/or
> - remove entries already on the stack.
> -
> - *Values on the stack at the time of the call may be used as parameters by
> - the called expression and values left on the stack by the called expression
> - may be used as return values by prior agreement between the calling and
> - called expressions.*
> -
> - * If D has a ``DW_AT_location`` attribute that is encoded as a ``loclist`` or
> - ``loclistsptr``, then the specified location list expression E is
> - evaluated, and the resulting location description is pushed on the stack.
> - The evaluation of E uses a context that has the same current frame and
> - current program location as the current operation expression, but an empty
> - initial stack.
> -
> - .. note::
> -
> - This rule avoids having to define how to execute a matched location list
> - entry operation expression on the same stack as the call when there are
> - multiple matches. But it allows the call to obtain the location
> - description for a variable or formal parameter which may use a location
> - list expression.
> -
> - An alternative is to treat the case when D has a ``DW_AT_location``
> - attribute that is encoded as a ``loclist`` or ``loclistsptr``, and the
> - specified location list expression E' matches a single location list
> - entry with operation expression E, the same as the ``exprloc`` case and
> - evaluate on the same stack.
> -
> - But this is not attractive as if the attribute is for a variable that
> - happens to end with a non-singleton stack, it will not simply put a
> - location description on the stack. Presumably the intent of using
> - ``DW_OP_call*`` on a variable or formal parameter debugger information
> - entry is to push just one location description on the stack. That
> - location description may have more than one single location description.
> -
> - The previous rule for ``exprloc`` also has the same problem as normally
> - a variable or formal parameter location expression may leave multiple
> - entries on the stack and only return the top entry.
> -
> - Gdb implements ``DW_OP_call*`` by always executing E on the same stack.
> - If the location list has multiple matching entries, it simply picks the
> - first one and ignores the rest. This seems fundamentally at odds with
> - the desire to support multiple places for variables.
> -
> - So, it feels like ``DW_OP_call*`` should both support pushing a location
> - description on the stack for a variable or formal parameter, and also
> - support being able to execute an operation expression on the same stack.
> - Being able to specify a different operation expression for different
> - program locations seems a desirable feature to retain.
> -
> - A solution to that is to have a distinct ``DW_AT_LLVM_proc`` attribute
> - for the ``DW_TAG_dwarf_procedure`` debugging information entry. Then the
> - ``DW_AT_location`` attribute expression is always executed separately
> - and pushes a location description (that may have multiple single
> - location descriptions), and the ``DW_AT_LLVM_proc`` attribute expression
> - is always executed on the same stack and can leave anything on the
> - stack.
> -
> - The ``DW_AT_LLVM_proc`` attribute could have the new classes
> - ``exprproc``, ``loclistproc``, and ``loclistsptrproc`` to indicate that
> - the expression is executed on the same stack. ``exprproc`` is the same
> - encoding as ``exprloc``. ``loclistproc`` and ``loclistsptrproc`` are the
> - same encoding as their non-\ ``proc`` counterparts except the DWARF is
> - ill-formed if the location list does not match exactly one location list
> - entry and a default entry is required. These forms indicate explicitly
> - that the matched single operation expression must be executed on the
> - same stack. This is better than ad hoc special rules for ``loclistproc``
> - and ``loclistsptrproc`` which are currently clearly defined to always
> - return a location description. The producer then explicitly indicates
> - the intent through the attribute classes.
> -
> - Such a change would be a breaking change for how gdb implements
> - ``DW_OP_call*``. However, are the breaking cases actually occurring in
> - practice? gdb could implement the current approach for DWARF Version 5,
> - and the new semantics for DWARF Version 6 which has been done for some
> - other features.
> -
> - Another option is to limit the execution to be on the same stack only to
> - the evaluation of an expression E that is the value of a
> - ``DW_AT_location`` attribute of a ``DW_TAG_dwarf_procedure`` debugging
> - information entry. The DWARF would be ill-formed if E is a location list
> - expression that does not match exactly one location list entry. In all
> - other cases the evaluation of an expression E that is the value of a
> - ``DW_AT_location`` attribute would evaluate E with a context that has
> - the same current frame and current program location as the current
> - operation expression, but an empty initial stack, and push the resulting
> - location description on the stack.
> -
> - * If D has a ``DW_AT_const_value`` attribute with a value V, then it is as
> - if a ``DW_OP_implicit_value V`` operation was executed.
> -
> - *This allows a call operation to be used to compute the location
> - description for any variable or formal parameter regardless of whether the
> - producer has optimized it to a constant. This is consistent with the*
> - ``DW_OP_implicit_pointer`` *operation.*
> -
> - .. note::
> -
> - Alternatively, could deprecate using ``DW_AT_const_value`` for
> - ``DW_TAG_variable`` and ``DW_TAG_formal_parameter`` debugger information
> - entries that are constants and instead use ``DW_AT_location`` with an
> - operation expression that results in a location description with one
> - implicit location description. Then this rule would not be required.
> -
> - * Otherwise, there is no effect and no changes are made to the stack.
> -
> - .. note::
> -
> - In DWARF Version 5, if D does not have a ``DW_AT_location`` then
> - ``DW_OP_call*`` is defined to have no effect. It is unclear that this is
> - the right definition as a producer should be able to rely on using
> - ``DW_OP_call*`` to get a location description for any non-\
> - ``DW_TAG_dwarf_procedure`` debugging information entries. Also, the
> - producer should not be creating DWARF with ``DW_OP_call*`` to a
> - ``DW_TAG_dwarf_procedure`` that does not have a ``DW_AT_location``
> - attribute. So, should this case be defined as an ill-formed DWARF
> - expression?
> -
> - *The* ``DW_TAG_dwarf_procedure`` *debugging information entry can be used to
> - define DWARF procedures that can be called.*
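
The branch-target rule shared by ``DW_OP_skip`` and ``DW_OP_bra`` above can be sketched as follows. This is a minimal sketch under stated assumptions: the `branch_target` helper and `IllFormed` exception are hypothetical names, the 2-byte operand is assumed to be little-endian (it actually follows the target byte order), and the caller is assumed to have already computed the set of byte offsets at which operations start.

```python
import struct


class IllFormed(Exception):
    """Raised when the DWARF expression is ill-formed."""


def branch_target(expr, op_offset, op_starts):
    """Return the byte offset at which evaluation continues after a taken branch.

    expr      -- the encoded operation expression (bytes)
    op_offset -- offset of the DW_OP_skip/DW_OP_bra opcode within expr
    op_starts -- set of offsets at which an operation begins (assumed known)

    A return value equal to len(expr) means evaluation is complete.
    """
    # 2-byte signed operand; little-endian is an assumption of this sketch.
    (delta,) = struct.unpack_from("<h", expr, op_offset + 1)
    # The skip is counted from just after the 2-byte constant.
    target = op_offset + 3 + delta
    if target == len(expr):
        return target  # one past the last operation
    if target not in op_starts:
        raise IllFormed("branch target is out of range or inside an operation")
    return target
```

For ``DW_OP_bra`` the caller would first pop the top of stack and only use this target when the popped value is non-zero; otherwise evaluation continues at `op_offset + 3`.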
> -
> -.. _amdgpu-dwarf-value-operations:
> -
> -Value Operations
> -################
> -
> -This section describes the operations that push values on the stack.
> -
> -Each value stack entry has a type and a literal value and can represent a
> -literal value of any supported base type of the target architecture. The base
> -type specifies the size and encoding of the literal value.
> -
> -Instead of a base type, value stack entries can have a distinguished generic
> -type, which is an integral type that has the size of an address in the target
> -architecture default address space and unspecified signedness.
> -
> -*The generic type is the same as the unspecified type used for stack operations
> -defined in DWARF Version 4 and before.*
> -
> -An integral type is a base type that has an encoding of ``DW_ATE_signed``,
> -``DW_ATE_signed_char``, ``DW_ATE_unsigned``, ``DW_ATE_unsigned_char``,
> -``DW_ATE_boolean``, or any target architecture defined integral encoding in the
> -inclusive range ``DW_ATE_lo_user`` to ``DW_ATE_hi_user``.
> -
> -.. note::
> -
> - Unclear if ``DW_ATE_address`` is an integral type. Gdb does not seem to
> - consider it as integral.
> -
> -.. _amdgpu-dwarf-literal-operations:
> -
> -Literal Operations
> -^^^^^^^^^^^^^^^^^^
> -
> -The following operations all push a literal value onto the DWARF stack.
> -
> -Operations other than ``DW_OP_const_type`` push a value V with the generic type.
> -If V is larger than the generic type, then V is truncated to the generic type
> -size and the low-order bits used.
> -
> -1. ``DW_OP_lit0``, ``DW_OP_lit1``, ..., ``DW_OP_lit31``
> -
> - ``DW_OP_lit<N>`` operations encode an unsigned literal value N from 0
> - through 31, inclusive. They push the value N with the generic type.
> -
> -2. ``DW_OP_const1u``, ``DW_OP_const2u``, ``DW_OP_const4u``, ``DW_OP_const8u``
> -
> - ``DW_OP_const<N>u`` operations have a single operand that is a 1, 2, 4, or
> - 8-byte unsigned integer constant U, respectively. They push the value U with
> - the generic type.
> -
> -3. ``DW_OP_const1s``, ``DW_OP_const2s``, ``DW_OP_const4s``, ``DW_OP_const8s``
> -
> - ``DW_OP_const<N>s`` operations have a single operand that is a 1, 2, 4, or
> - 8-byte signed integer constant S, respectively. They push the value S with
> - the generic type.
> -
> -4. ``DW_OP_constu``
> -
> - ``DW_OP_constu`` has a single unsigned LEB128 integer operand N. It pushes
> - the value N with the generic type.
> -
> -5. ``DW_OP_consts``
> -
> - ``DW_OP_consts`` has a single signed LEB128 integer operand N. It pushes the
> - value N with the generic type.
> -
> -6. ``DW_OP_constx``
> -
> - ``DW_OP_constx`` has a single unsigned LEB128 integer operand that
> - represents a zero-based index into the ``.debug_addr`` section relative to
> - the value of the ``DW_AT_addr_base`` attribute of the associated compilation
> - unit. The value N in the ``.debug_addr`` section has the size of the generic
> - type. It pushes the value N with the generic type.
> -
> - *The* ``DW_OP_constx`` *operation is provided for constants that require
> - link-time relocation but should not be interpreted by the consumer as a
> - relocatable address (for example, offsets to thread-local storage).*
> -
> -7. ``DW_OP_const_type``
> -
> - ``DW_OP_const_type`` has three operands. The first is an unsigned LEB128
> - integer that represents the offset of a debugging information entry D in the
> - current compilation unit, that provides the type of the constant value. The
> - second is a 1-byte unsigned integral constant S. The third is a block of
> - bytes B, with a length equal to S.
> -
> - T is the bit size of the type D. The least significant T bits of B are
> - interpreted as a value V of the type D. It pushes the value V with the type
> - D.
> -
> - The DWARF is ill-formed if D is not a ``DW_TAG_base_type`` debugging
> - information entry, or if T divided by 8 (the byte size) and rounded up to a
> - whole number is not equal to S.
> -
> - *While the size of the byte block B can be inferred from the type D
> - definition, it is encoded explicitly into the operation so that the
> - operation can be parsed easily without reference to the* ``.debug_info``
> - *section.*
> -
> -8. ``DW_OP_LLVM_push_lane`` *New*
> -
> - ``DW_OP_LLVM_push_lane`` pushes a value with the generic type that is the
> - target architecture specific lane identifier of the thread of execution for
> - which a user presented expression is currently being evaluated.
> -
> - *For languages that are implemented using a SIMD or SIMT execution model,
> - this is the lane number that corresponds to the source language thread of
> - execution upon which the user is focused.*
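
Since ``DW_OP_constu`` and ``DW_OP_consts`` above take LEB128 operands and push them with the generic type, a small decoding sketch may help. The helper names are invented for the example, and the 64-bit default for the generic type is an assumption (the text defines it as the size of an address in the target's default address space).

```python
def decode_uleb128(data, offset=0):
    """Decode an unsigned LEB128 integer; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry payload
        shift += 7
        if not (byte & 0x80):  # high bit clear marks the last byte
            return result, offset


def decode_sleb128(data, offset=0):
    """Decode a signed LEB128 integer; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):
            if byte & 0x40:  # sign bit of the final byte
                result -= 1 << shift
            return result, offset


def to_generic(value, generic_bits=64):
    """Truncate to the generic type size, keeping the low-order bits."""
    return value & ((1 << generic_bits) - 1)
```

A ``DW_OP_consts`` operand decoded with `decode_sleb128` and then passed through `to_generic` gives the bit pattern that would be pushed under these assumptions.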
> -
> -.. _amdgpu-dwarf-arithmetic-logical-operations:
> -
> -Arithmetic and Logical Operations
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -.. note::
> -
> - This section is the same as DWARF Version 5 section 2.5.1.4.
> -
> -.. _amdgpu-dwarf-type-conversions-operations:
> -
> -Type Conversion Operations
> -^^^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -.. note::
> -
> - This section is the same as DWARF Version 5 section 2.5.1.6.
> -
> -.. _amdgpu-dwarf-general-operations:
> -
> -Special Value Operations
> -^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -There are these special value operations currently defined:
> -
> -1. ``DW_OP_regval_type``
> -
> - ``DW_OP_regval_type`` has two operands. The first is an unsigned LEB128
> - integer that represents a register number R. The second is an unsigned
> - LEB128 integer that represents the offset of a debugging information entry D
> - in the current compilation unit, that provides the type of the register
> - value.
> -
> - The contents of register R are interpreted as a value V of the type D. The
> - value V is pushed on the stack with the type D.
> -
> - The DWARF is ill-formed if D is not a ``DW_TAG_base_type`` debugging
> - information entry, or if the size of type D is not the same as the size of
> - register R.
> -
> - .. note::
> -
> - Should DWARF allow the type D to be a different size to the size of the
> - register R? Requiring them to be the same bit size avoids any issue of
> - conversion as the bit contents of the register is simply interpreted as a
> - value of the specified type. If a conversion is wanted it can be done
> - explicitly using a ``DW_OP_convert`` operation.
> -
> - Gdb has a per register hook that allows a target specific conversion on a
> - register by register basis. It defaults to truncation of bigger registers,
> - and to actually reading bytes from the next register (or reads out of
> - bounds for the last register) for smaller registers. There are no gdb
> - tests that read a register out of bounds (except an illegal hand written
> - assembly test).
> -
> -2. ``DW_OP_deref``
> -
> - The ``DW_OP_deref`` operation pops one stack entry that must be a location
> - description L.
> -
> - A value of the bit size of the generic type is retrieved from the location
> - storage specified by L. The value V retrieved is pushed on the stack with
> - the generic type.
> -
> - If any bit of the value is retrieved from the undefined location storage, or
> - the offset of any bit exceeds the size of the location storage specified by
> - L, then the DWARF expression is ill-formed.
> -
> - See :ref:`amdgpu-dwarf-implicit-location-descriptions` for special rules
> - concerning implicit location descriptions created by the
> - ``DW_OP_implicit_pointer`` and ``DW_OP_LLVM_implicit_aspace_pointer``
> - operations.
> -
> - *If L, or the location description of any composite location description
> - part that is a subcomponent of L, has more than one single location
> - description, then any one of them can be selected as they are required to
> - all have the same value. For any single location description SL, bits are
> - retrieved from the associated storage location starting at the bit offset
> - specified by SL. For a composite location description, the retrieved bits
> - are the concatenation of the N bits from each composite location part PL,
> - where N is limited to the size of PL.*
> -
> -3. ``DW_OP_deref_size``
> -
> - ``DW_OP_deref_size`` has a single 1-byte unsigned integral constant that
> - represents a byte result size S.
> -
> - It pops one stack entry that must be a location description L.
> -
> - T is the smaller of the generic type size and S scaled by 8 (the byte size).
> - A value V of T bits is retrieved from the location storage specified by L.
> - If V is smaller than the size of the generic type, V is zero-extended to the
> - generic type size. V is pushed onto the stack with the generic type.
> -
> - The DWARF expression is ill-formed if any bit of the value is retrieved from
> - the undefined location storage, or if the offset of any bit exceeds the size
> - of the location storage specified by L.
> -
> - .. note::
> -
> - Truncating the value when S is larger than the generic type matches what
> - gdb does. This allows the generic type size to not be an integral byte
> - size. It does allow S to be arbitrarily large. Should S be restricted to
> - the size of the generic type rounded up to a multiple of 8?
> -
> - See :ref:`amdgpu-dwarf-implicit-location-descriptions` for special rules
> - concerning implicit location descriptions created by the
> - ``DW_OP_implicit_pointer`` and ``DW_OP_LLVM_implicit_aspace_pointer``
> - operations.
> -
> -4. ``DW_OP_deref_type``
> -
> - ``DW_OP_deref_type`` has two operands. The first is a 1-byte unsigned
> - integral constant S. The second is an unsigned LEB128 integer that
> - represents the offset of a debugging information entry D in the current
> - compilation unit, that provides the type of the result value.
> -
> - It pops one stack entry that must be a location description L. T is the bit
> - size of the type D. A value V of T bits is retrieved from the location
> - storage specified by L. V is pushed on the stack with the type D.
> -
> - The DWARF is ill-formed if D is not a ``DW_TAG_base_type`` debugging
> - information entry, if T divided by 8 (the byte size) and rounded up to a
> - whole number is not equal to S, if any bit of the value is retrieved from the
> - undefined location storage, or if the offset of any bit exceeds the size of
> - the location storage specified by L.
> -
> - See :ref:`amdgpu-dwarf-implicit-location-descriptions` for special rules
> - concerning implicit location descriptions created by the
> - ``DW_OP_implicit_pointer`` and ``DW_OP_LLVM_implicit_aspace_pointer``
> - operations.
> -
> - *While the size of the pushed value V can be inferred from the type D
> - definition, it is encoded explicitly into the operation so that the
> - operation can be parsed easily without reference to the* ``.debug_info``
> - *section.*
> -
> - .. note::
> -
> - It is unclear why the operand S is needed. Unlike ``DW_OP_const_type``,
> - the size is not needed for parsing. Any evaluation needs to get the base
> - type to record with the value to know its encoding and bit size.
> -
> - This definition allows the base type to be a bit size since there seems no
> - reason to restrict it.
> -
> -5. ``DW_OP_xderef`` *Deprecated*
> -
> - ``DW_OP_xderef`` pops two stack entries. The first must be an integral type
> - value that represents an address A. The second must be an integral type
> - value that represents a target architecture specific address space
> - identifier AS.
> -
> - The operation is equivalent to performing ``DW_OP_swap;
> - DW_OP_LLVM_form_aspace_address; DW_OP_deref``. The value V retrieved is left
> - on the stack with the generic type.
> -
> - *This operation is deprecated as the* ``DW_OP_LLVM_form_aspace_address``
> - *operation can be used and provides greater expressiveness.*
> -
> -6. ``DW_OP_xderef_size`` *Deprecated*
> -
> - ``DW_OP_xderef_size`` has a single 1-byte unsigned integral constant that
> - represents a byte result size S.
> -
> - It pops two stack entries. The first must be an integral type value that
> - represents an address A. The second must be an integral type value that
> - represents a target architecture specific address space identifier AS.
> -
> - The operation is equivalent to performing ``DW_OP_swap;
> - DW_OP_LLVM_form_aspace_address; DW_OP_deref_size S``. The zero-extended
> - value V retrieved is left on the stack with the generic type.
> -
> - *This operation is deprecated as the* ``DW_OP_LLVM_form_aspace_address``
> - *operation can be used and provides greater expressiveness.*
> -
> -7. ``DW_OP_xderef_type`` *Deprecated*
> -
> - ``DW_OP_xderef_type`` has two operands. The first is a 1-byte unsigned
> - integral constant S. The second operand is an unsigned LEB128
> - integer R that represents the offset of a debugging information entry D in
> - the current compilation unit, that provides the type of the result value.
> -
> - It pops two stack entries. The first must be an integral type value that
> - represents an address A. The second must be an integral type value that
> - represents a target architecture specific address space identifier AS.
> -
> - The operation is equivalent to performing ``DW_OP_swap;
> - DW_OP_LLVM_form_aspace_address; DW_OP_deref_type S R``. The value V
> - retrieved is left on the stack with the type D.
> -
> - *This operation is deprecated as the* ``DW_OP_LLVM_form_aspace_address``
> - *operation can be used and provides greater expressiveness.*
> -
> -8. ``DW_OP_entry_value`` *Deprecated*
> -
> - ``DW_OP_entry_value`` pushes the value that the described location held upon
> - entering the current subprogram.
> -
> - It has two operands. The first is an unsigned LEB128 integer S. The second
> - is a block of bytes, with a length equal to S, interpreted as a DWARF
> - operation expression E.
> -
> - E is evaluated as if it had been evaluated upon entering the current
> - subprogram with an empty initial stack.
> -
> - .. note::
> -
> - It is unclear what this means. What is the current program location and
> - current frame that must be used? Does this require reverse execution so
> - the register and memory state are as it was on entry to the current
> - subprogram?
> -
> - The DWARF expression is ill-formed if the evaluation of E executes a
> - ``DW_OP_push_object_address`` operation.
> -
> - If the result of E is a location description with one register location
> - description (see :ref:`amdgpu-dwarf-register-location-descriptions`),
> - ``DW_OP_entry_value`` pushes the value that register had upon entering the
> - current subprogram. The value entry type is the target architecture register
> - base type. If the register value is undefined or the register location
> - description bit offset is not 0, then the DWARF expression is ill-formed.
> -
> - *The register location description provides a more compact form for the case
> - where the value was in a register on entry to the subprogram.*
> -
> - If the result of E is a value V, ``DW_OP_entry_value`` pushes V on the
> - stack.
> -
> - Otherwise, the DWARF expression is ill-formed.
> -
> - *The values needed to evaluate* ``DW_OP_entry_value`` *could be obtained in
> - several ways. The consumer could suspend execution on entry to the
> - subprogram, record values needed by* ``DW_OP_entry_value`` *expressions
> - within the subprogram, and then continue. When evaluating*
> - ``DW_OP_entry_value``\ *, the consumer would use these recorded values
> - rather than the current values. Or, when evaluating* ``DW_OP_entry_value``\
> - *, the consumer could virtually unwind using the Call Frame Information
> - (see* :ref:`amdgpu-dwarf-call-frame-information`\ *) to recover register
> - values that might have been clobbered since the subprogram entry point.*
> -
> - *The* ``DW_OP_entry_value`` *operation is deprecated as its main usage is
> - provided by other means. DWARF Version 5 added the*
> - ``DW_TAG_call_site_parameter`` *debugger information entry for call sites
> - that has* ``DW_AT_call_value``\ *,* ``DW_AT_call_data_location``\ *, and*
> - ``DW_AT_call_data_value`` *attributes that provide DWARF expressions to
> - compute actual parameter values at the time of the call, and requires the
> - producer to ensure the expressions are valid to evaluate even when virtually
> - unwound. The* ``DW_OP_LLVM_call_frame_entry_reg`` *operation provides access
> - to registers in the virtually unwound calling frame.*
> -
> - .. note::
> -
> - It is unclear why this operation is defined this way. How would a consumer
> - know what values have to be saved on entry to the subprogram? Does it have
> - to parse every expression of every ``DW_OP_entry_value`` operation to
> - capture all the possible results needed? Or does it have to implement
> - reverse execution so it can evaluate the expression in the context of the
> - entry of the subprogram so it can obtain the entry point register and
> - memory values? Or does the compiler somehow instruct the consumer how to
> - create the saved copies of the variables on entry?
> -
> - If the expression is simply using existing variables, then it is just a
> - regular expression and no special operation is needed. If the main purpose
> - is only to read the entry value of a register using CFI then it would be
> - better to have an operation that explicitly does just that such as the
> - proposed ``DW_OP_LLVM_call_frame_entry_reg`` operation.
> -
> - Gdb only seems to implement ``DW_OP_entry_value`` when E is exactly
> - ``DW_OP_reg*`` or ``DW_OP_breg*; DW_OP_deref*``. It evaluates E in the
> - context of the calling subprogram and the calling call site program
> - location. But the wording suggests that is not the intention.
> -
> - Given these issues it is suggested ``DW_OP_entry_value`` is deprecated in
> - favor of using the new facilities that have well defined semantics and
> - implementations.
> -
> -.. _amdgpu-dwarf-location-description-operations:
> -
> -Location Description Operations
> -###############################
> -
> -This section describes the operations that push location descriptions on the
> -stack.
> -
> -General Location Description Operations
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -1. ``DW_OP_LLVM_offset`` *New*
> -
> - ``DW_OP_LLVM_offset`` pops two stack entries. The first must be an integral
> - type value that represents a byte displacement B. The second must be a
> - location description L.
> -
> - It adds the value of B scaled by 8 (the byte size) to the bit offset of each
> - single location description SL of L, and pushes the updated L.
> -
> - If the updated bit offset of any SL is less than 0 or greater than or equal
> - to the size of the location storage specified by SL, then the DWARF
> - expression is ill-formed.
> -
> -2. ``DW_OP_LLVM_offset_constu`` *New*
> -
> - ``DW_OP_LLVM_offset_constu`` has a single unsigned LEB128 integer operand
> - that represents a byte displacement B.
> -
> - The operation is equivalent to performing ``DW_OP_constu B;
> - DW_OP_LLVM_offset``.
> -
> - *This operation is supplied specifically to be able to encode more field
> - displacements in two bytes than can be done with* ``DW_OP_lit*;
> - DW_OP_LLVM_offset``\ *.*
> -
> -3. ``DW_OP_LLVM_bit_offset`` *New*
> -
> - ``DW_OP_LLVM_bit_offset`` pops two stack entries. The first must be an
> - integral type value that represents a bit displacement B. The second must be
> - a location description L.
> -
> - It adds the value of B to the bit offset of each single location description
> - SL of L, and pushes the updated L.
> -
> - If the updated bit offset of any SL is less than 0 or greater than or equal
> - to the size of the location storage specified by SL, then the DWARF
> - expression is ill-formed.
> -
> -4. ``DW_OP_push_object_address``
> -
> - ``DW_OP_push_object_address`` pushes the location description L of the
> - object currently being evaluated as part of evaluation of a user presented
> - expression.
> -
> - This object may correspond to an independent variable described by its own
> - debugging information entry or it may be a component of an array, structure,
> - or class whose address has been dynamically determined by an earlier step
> - during user expression evaluation.
> -
> - *This operation provides explicit functionality (especially for arrays
> - involving descriptions) that is analogous to the implicit push of the base
> - location description of a structure prior to evaluation of a*
> - ``DW_AT_data_member_location`` *to access a data member of a structure.*
> -
> -5. ``DW_OP_LLVM_call_frame_entry_reg`` *New*
> -
> - ``DW_OP_LLVM_call_frame_entry_reg`` has a single unsigned LEB128 integer
> - operand that represents a target architecture register number R.
> -
> - It pushes a location description L that holds the value of register R on
> - entry to the current subprogram as defined by the Call Frame Information
> - (see :ref:`amdgpu-dwarf-call-frame-information`).
> -
> - *If there is no Call Frame Information defined, then the default rules for
> - the target architecture are used. If the register rule is* undefined\ *, then
> - the undefined location description is pushed. If the register rule is* same
> - value\ *, then a register location description for R is pushed.*
> -
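The bit offset arithmetic of ``DW_OP_LLVM_offset`` and ``DW_OP_LLVM_bit_offset`` quoted above can be sketched as follows. This is only an illustrative model, not code from any consumer: the ``SingleLocation`` class and both function names are hypothetical, a location description L is modelled as a plain list of single location descriptions, and the rule that undefined location descriptions are left unchanged is omitted.

```python
from dataclasses import dataclass, replace

BYTE_BITS = 8  # the byte size used for scaling in the quoted text


@dataclass(frozen=True)
class SingleLocation:
    """Hypothetical model of a single location description SL."""
    storage_bits: int  # size of the location storage that SL specifies
    bit_offset: int    # bit offset into that storage


def bit_offset_op(location, displacement_bits):
    """Model DW_OP_LLVM_bit_offset on L, given as a list of SingleLocation."""
    updated = []
    for sl in location:
        new_offset = sl.bit_offset + displacement_bits
        # The expression is ill-formed if the offset leaves the storage.
        if new_offset < 0 or new_offset >= sl.storage_bits:
            raise ValueError("ill-formed DWARF expression: offset out of range")
        updated.append(replace(sl, bit_offset=new_offset))
    return updated


def offset_op(location, displacement_bytes):
    """Model DW_OP_LLVM_offset: the byte displacement B is scaled by 8."""
    return bit_offset_op(location, displacement_bytes * BYTE_BITS)
```

With this model, ``DW_OP_LLVM_offset_constu B`` would simply be ``offset_op(L, B)`` with B taken from the operand instead of the stack.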
> -Undefined Location Description Operations
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -*The undefined location storage represents a piece or all of an object that is
> -present in the source but not in the object code (perhaps due to optimization).
> -Neither reading nor writing to the undefined location storage is meaningful.*
> -
> -An undefined location description specifies the undefined location storage.
> -There is no concept of the size of the undefined location storage, nor of a bit
> -offset for an undefined location description. The ``DW_OP_LLVM_*offset``
> -operations leave an undefined location description unchanged. The
> -``DW_OP_*piece`` operations can explicitly or implicitly specify an undefined
> -location description, allowing any size and offset to be specified, and results
> -in a part with all undefined bits.
> -
> -1. ``DW_OP_LLVM_undefined`` *New*
> -
> - ``DW_OP_LLVM_undefined`` pushes a location description L that comprises one
> - undefined location description SL.
> -
> -.. _amdgpu-dwarf-memory-location-description-operations:
> -
> -Memory Location Description Operations
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -Each of the target architecture specific address spaces has a corresponding
> -memory location storage that denotes the linear addressable memory of that
> -address space. The size of each memory location storage corresponds to the range
> -of the addresses in the corresponding address space.
> -
> -*It is target architecture defined how address space location storage maps to
> -target architecture physical memory. For example, they may be independent
> -memory, or more than one location storage may alias the same physical memory
> -possibly at different offsets and with different interleaving. The mapping may
> -also be dictated by the source language address classes.*
> -
> -A memory location description specifies a memory location storage. The bit
> -offset corresponds to a bit position within a byte of the memory. Bits accessed
> -using a memory location description access the corresponding target
> -architecture memory starting at the bit position within the byte specified by
> -the bit offset.
> -
> -A memory location description that has a bit offset that is a multiple of 8 (the
> -byte size) is defined to be a byte address memory location description. It has a
> -memory byte address A that is equal to the bit offset divided by 8.
> -
> -A memory location description that does not have a bit offset that is a multiple
> -of 8 (the byte size) is defined to be a bit field memory location description.
> -It has a bit position B equal to the bit offset modulo 8, and a memory byte
> -address A equal to the bit offset minus B that is then divided by 8.
> -
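As a small worked sketch of the two definitions quoted above, the byte address A and bit position B can be derived from a bit offset like this. The function names are hypothetical and chosen only for illustration.

```python
BYTE_BITS = 8  # the byte size from the quoted text


def is_byte_address(bit_offset):
    """True for a byte address memory location description."""
    return bit_offset % BYTE_BITS == 0


def decompose_bit_offset(bit_offset):
    """Return (A, B): the memory byte address and the bit position in it."""
    bit_position = bit_offset % BYTE_BITS
    byte_address = (bit_offset - bit_position) // BYTE_BITS
    return byte_address, bit_position
```

For a byte address memory location description B is 0, so A is just the bit offset divided by 8.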
> -The address space AS of a memory location description is defined to be the
> -address space that corresponds to the memory location storage associated with
> -the memory location description.
> -
> -A location description that is comprised of one byte address memory location
> -description SL is defined to be a memory byte address location description. It
> -has a byte address equal to A and an address space equal to AS of the
> -corresponding SL.
> -
> -``DW_ASPACE_none`` is defined as the target architecture default address space.
> -
> -If a stack entry is required to be a location description, but it is a value V
> -with the generic type, then it is implicitly converted to a location description
> -L with one memory location description SL. SL specifies the memory location
> -storage that corresponds to the target architecture default address space with a
> -bit offset equal to V scaled by 8 (the byte size).
> -
> -.. note::
> -
> - If it is wanted to allow any integral type value to be implicitly converted to
> - a memory location description in the target architecture default address
> - space:
> -
> - If a stack entry is required to be a location description, but is a value V
> - with an integral type, then it is implicitly converted to a location
> - description L with one memory location description SL. If the type size of
> - V is less than the generic type size, then the value V is zero extended to
> - the size of the generic type. The least significant generic type size bits
> - are treated as a twos-complement unsigned value to be used as an address A.
> - SL specifies memory location storage corresponding to the target
> - architecture default address space with a bit offset equal to A scaled by 8
> - (the byte size).
> -
> - The implicit conversion could also be defined as target architecture specific.
> - For example, gdb checks if V is an integral type. If it is not it gives an
> - error. Otherwise, gdb zero-extends V to 64 bits. If the gdb target defines a
> - hook function, then it is called. The target specific hook function can modify
> - the 64-bit value, possibly sign extending based on the original value type.
> - Finally, gdb treats the 64-bit value V as a memory location address.
> -
> -If a stack entry is required to be a location description, but it is an implicit
> -pointer value IPV with the target architecture default address space, then it is
> -implicitly converted to a location description with one single location
> -description specified by IPV. See
> -:ref:`amdgpu-dwarf-implicit-location-descriptions`.
> -
> -.. note::
> -
> - Is this rule required for DWARF Version 5 backwards compatibility? If not, it
> - can be eliminated, and the producer can use
> - ``DW_OP_LLVM_form_aspace_address``.
> -
> -If a stack entry is required to be a value, but it is a location description L
> -with one memory location description SL in the target architecture default
> -address space with a bit offset B that is a multiple of 8, then it is implicitly
> -converted to a value equal to B divided by 8 (the byte size) with the generic
> -type.
> -
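The two implicit conversions quoted above (generic type value to memory location description, and back) are inverses on byte-aligned offsets in the default address space. A minimal sketch, assuming a memory location description can be modelled as an ``(address_space, bit_offset)`` tuple and that a conversion this rule does not cover is reported as an error; both are assumptions made for illustration, as are the function names.

```python
DEFAULT_ASPACE = "DW_ASPACE_none"  # the target architecture default address space


def value_to_location(value):
    """Convert a generic type value V to (address space, bit offset)."""
    return (DEFAULT_ASPACE, value * 8)


def location_to_value(address_space, bit_offset):
    """Convert a default address space, byte-aligned location to a value."""
    if address_space != DEFAULT_ASPACE or bit_offset % 8 != 0:
        # Assumption: anything else has no implicit conversion under this rule.
        raise ValueError("no implicit conversion to a value for this location")
    return bit_offset // 8
```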
> -1. ``DW_OP_addr``
> -
> - ``DW_OP_addr`` has a single byte constant value operand, which has the size
> - of the generic type, that represents an address A.
> -
> - It pushes a location description L with one memory location description SL
> - on the stack. SL specifies the memory location storage corresponding to the
> - target architecture default address space with a bit offset equal to A
> - scaled by 8 (the byte size).
> -
> - *If the DWARF is part of a code object, then A may need to be relocated. For
> - example, in the ELF code object format, A must be adjusted by the difference
> - between the ELF segment virtual address and the virtual address at which the
> - segment is loaded.*
> -
> -2. ``DW_OP_addrx``
> -
> - ``DW_OP_addrx`` has a single unsigned LEB128 integer operand that represents
> - a zero-based index into the ``.debug_addr`` section relative to the value of
> - the ``DW_AT_addr_base`` attribute of the associated compilation unit. The
> - address value A in the ``.debug_addr`` section has the size of the generic
> - type.
> -
> - It pushes a location description L with one memory location description SL
> - on the stack. SL specifies the memory location storage corresponding to the
> - target architecture default address space with a bit offset equal to A
> - scaled by 8 (the byte size).
> -
> - *If the DWARF is part of a code object, then A may need to be relocated. For
> - example, in the ELF code object format, A must be adjusted by the difference
> - between the ELF segment virtual address and the virtual address at which the
> - segment is loaded.*
> -
> -3. ``DW_OP_LLVM_form_aspace_address`` *New*
> -
> - ``DW_OP_LLVM_form_aspace_address`` pops the top two stack entries. The first
> - must be an integral type value that represents a target architecture
> - specific address space identifier AS. The second must be an integral type
> - value that represents an address A.
> -
> - The address size S is defined as the address bit size of the target
> - architecture specific address space that corresponds to AS.
> -
> - A is adjusted to S bits by zero extending if necessary, and then treating the
> - least significant S bits as a twos-complement unsigned value A'.
> -
> - It pushes a location description L with one memory location description SL
> - on the stack. SL specifies the memory location storage that corresponds to
> - AS with a bit offset equal to A' scaled by 8 (the byte size).
> -
> - The DWARF expression is ill-formed if AS is not one of the values defined by
> - the target architecture specific ``DW_ASPACE_*`` values.
> -
> - See :ref:`amdgpu-dwarf-implicit-location-descriptions` for special rules
> - concerning implicit pointer values produced by dereferencing implicit
> - location descriptions created by the ``DW_OP_implicit_pointer`` and
> - ``DW_OP_LLVM_aspace_implicit_pointer`` operations.
> -
> -4. ``DW_OP_form_tls_address``
> -
> - ``DW_OP_form_tls_address`` pops one stack entry that must be an integral
> - type value and treats it as a thread-local storage address T.
> -
> - It pushes a location description L with one memory location description SL
> - on the stack. SL is the target architecture specific memory location
> - description that corresponds to the thread-local storage address T.
> -
> - The meaning of the thread-local storage address T is defined by the run-time
> - environment. If the run-time environment supports multiple thread-local
> - storage blocks for a single thread, then the block corresponding to the
> - executable or shared library containing this DWARF expression is used.
> -
> - *Some implementations of C, C++, Fortran, and other languages support a
> - thread-local storage class. Variables with this storage class have distinct
> - values and addresses in distinct threads, much as automatic variables have
> - distinct values and addresses in each subprogram invocation. Typically,
> - there is a single block of storage containing all thread-local variables
> - declared in the main executable, and a separate block for the variables
> - declared in each shared library. Each thread-local variable can then be
> - accessed in its block using an identifier. This identifier is typically a
> - byte offset into the block and pushed onto the DWARF stack by one of the*
> - ``DW_OP_const*`` *operations prior to the* ``DW_OP_form_tls_address``
> - *operation. Computing the address of the appropriate block can be complex
> - (in some cases, the compiler emits a function call to do it), and difficult
> - to describe using ordinary DWARF location descriptions. Instead of forcing
> - complex thread-local storage calculations into the DWARF expressions, the*
> - ``DW_OP_form_tls_address`` *allows the consumer to perform the computation
> - based on the target architecture specific run-time environment.*
> -
> -5. ``DW_OP_call_frame_cfa``
> -
> - ``DW_OP_call_frame_cfa`` pushes the location description L of the Canonical
> - Frame Address (CFA) of the current subprogram, obtained from the Call Frame
> - Information on the stack. See :ref:`amdgpu-dwarf-call-frame-information`.
> -
> - *Although the value of the* ``DW_AT_frame_base`` *attribute of the debugger
> - information entry corresponding to the current subprogram can be computed
> - using a location list expression, in some cases this would require an
> - extensive location list because the values of the registers used in
> - computing the CFA change during a subprogram execution. If the Call Frame
> - Information is present, then it already encodes such changes, and it is
> - space efficient to reference that using the* ``DW_OP_call_frame_cfa``
> - *operation.*
> -
> -6. ``DW_OP_fbreg``
> -
> - ``DW_OP_fbreg`` has a single signed LEB128 integer operand that represents a
> - byte displacement B.
> -
> - The location description L for the *frame base* of the current subprogram is
> - obtained from the ``DW_AT_frame_base`` attribute of the debugger information
> - entry corresponding to the current subprogram as described in
> - :ref:`amdgpu-dwarf-debugging-information-entry-attributes`.
> -
> - The location description L is updated as if the ``DW_OP_LLVM_offset_constu
> - B`` operation was applied. The updated L is pushed on the stack.
> -
> -7. ``DW_OP_breg0``, ``DW_OP_breg1``, ..., ``DW_OP_breg31``
> -
> - The ``DW_OP_breg<N>`` operations encode the numbers of up to 32 registers,
> - numbered from 0 through 31, inclusive. The register number R corresponds to
> - the N in the operation name.
> -
> - They have a single signed LEB128 integer operand that represents a byte
> - displacement B.
> -
> - The address space identifier AS is defined as the one corresponding to the
> - target architecture specific default address space.
> -
> - The address size S is defined as the address bit size of the target
> - architecture specific address space corresponding to AS.
> -
> - The contents of the register specified by R are retrieved as a
> - twos-complement unsigned value and zero extended to S bits. B is added and
> - the least significant S bits are treated as a twos-complement unsigned value
> - to be used as an address A.
> -
> - They push a location description L comprising one memory location
> - description SL on the stack. SL specifies the memory location storage that
> - corresponds to AS with a bit offset equal to A scaled by 8 (the byte size).
> -
> -8. ``DW_OP_bregx``
> -
> - ``DW_OP_bregx`` has two operands. The first is an unsigned LEB128 integer
> - that represents a register number R. The second is a signed LEB128
> - integer that represents a byte displacement B.
> -
> - The action is the same as for ``DW_OP_breg<N>`` except that R is used as the
> - register number and B is used as the byte displacement.
> -
> -9. ``DW_OP_LLVM_aspace_bregx`` *New*
> -
> - ``DW_OP_LLVM_aspace_bregx`` has two operands. The first is an unsigned
> - LEB128 integer that represents a register number R. The second is a signed
> - LEB128 integer that represents a byte displacement B. It pops one stack
> - entry that is required to be an integral type value that represents a target
> - architecture specific address space identifier AS.
> -
> - The action is the same as for ``DW_OP_breg<N>`` except that R is used as the
> - register number, B is used as the byte displacement, and AS is used as the
> - address space identifier.
> -
> - The DWARF expression is ill-formed if AS is not one of the values defined by
> - the target architecture specific ``DW_ASPACE_*`` values.
> -
> - .. note::
> -
> - Could also consider adding ``DW_OP_aspace_breg0, DW_OP_aspace_breg1, ...,
> - DW_OP_aspace_breg31`` which would save encoding size.
> -
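The address arithmetic shared by ``DW_OP_LLVM_form_aspace_address`` and the ``DW_OP_breg*`` family quoted above can be sketched as below. The address bit size S is passed in directly, because the mapping from an address space identifier to its size is target specific and not given here; the function names are hypothetical.

```python
def form_aspace_address(address, address_bits):
    """Return the bit offset pushed for address A in a space of S bits.

    A is reduced to its least significant S bits (A') and scaled by 8.
    """
    a_prime = address & ((1 << address_bits) - 1)
    return a_prime * 8


def breg_bit_offset(register_value, displacement, address_bits):
    """Return the bit offset pushed by DW_OP_breg<N> and DW_OP_bregx.

    The unsigned register contents plus the signed byte displacement B are
    reduced to the least significant S bits to form the address A.
    """
    mask = (1 << address_bits) - 1
    address = (register_value + displacement) & mask
    return address * 8
```

Under these assumptions ``DW_OP_LLVM_aspace_bregx`` differs only in where S comes from: it is looked up from the popped AS rather than from the default address space.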
> -.. _amdgpu-dwarf-register-location-descriptions:
> -
> -Register Location Description Operations
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -There is a register location storage that corresponds to each of the target
> -architecture registers. The size of each register location storage corresponds
> -to the size of the corresponding target architecture register.
> -
> -A register location description specifies a register location storage. The bit
> -offset corresponds to a bit position within the register. Bits accessed using a
> -register location description access the corresponding target architecture
> -register starting at the specified bit offset.
> -
> -1. ``DW_OP_reg0``, ``DW_OP_reg1``, ..., ``DW_OP_reg31``
> -
> - ``DW_OP_reg<N>`` operations encode the numbers of up to 32 registers,
> - numbered from 0 through 31, inclusive. The target architecture register
> - number R corresponds to the N in the operation name.
> -
> - They push a location description L that specifies one register location
> - description SL on the stack. SL specifies the register location storage that
> - corresponds to R with a bit offset of 0.
> -
> -2. ``DW_OP_regx``
> -
> - ``DW_OP_regx`` has a single unsigned LEB128 integer operand that represents
> - a target architecture register number R.
> -
> - It pushes a location description L that specifies one register location
> - description SL on the stack. SL specifies the register location storage that
> - corresponds to R with a bit offset of 0.
> -
> -*These operations obtain a register location. To fetch the contents of a
> -register, it is necessary to use* ``DW_OP_regval_type``\ *, use one of the*
> -``DW_OP_breg*`` *register-based addressing operations, or use* ``DW_OP_deref*``
> -*on a register location description.*
> -
> -.. _amdgpu-dwarf-implicit-location-descriptions:
> -
> -Implicit Location Description Operations
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -Implicit location storage represents a piece or all of an object which has no
> -actual location in the program but whose contents are nonetheless known, either
> -as a constant or can be computed from other locations and values in the program.
> -
> -An implicit location description specifies an implicit location storage. The bit
> -offset corresponds to a bit position within the implicit location storage. Bits
> -accessed using an implicit location description access the corresponding
> -implicit storage value starting at the bit offset.
> -
> -1. ``DW_OP_implicit_value``
> -
> - ``DW_OP_implicit_value`` has two operands. The first is an unsigned LEB128
> - integer that represents a byte size S. The second is a block of bytes with a
> - length equal to S treated as a literal value V.
> -
> - An implicit location storage LS is created with the literal value V and a
> - size of S.
> -
> - It pushes a location description L with one implicit location description SL
> - on the stack. SL specifies LS with a bit offset of 0.
> -
> -2. ``DW_OP_stack_value``
> -
> - ``DW_OP_stack_value`` pops one stack entry that must be a value V.
> -
> - An implicit location storage LS is created with the literal value V and a
> - size equal to V's base type size.
> -
> - It pushes a location description L with one implicit location description SL
> - on the stack. SL specifies LS with a bit offset of 0.
> -
> - *The* ``DW_OP_stack_value`` *operation specifies that the object does not
> - exist in memory, but its value is nonetheless known. In this form, the
> - location description specifies the actual value of the object, rather than
> - specifying the memory or register storage that holds the value.*
> -
> - See :ref:`amdgpu-dwarf-implicit-location-descriptions` for special rules
> - concerning implicit pointer values produced by dereferencing implicit
> - location descriptions created by the ``DW_OP_implicit_pointer`` and
> - ``DW_OP_LLVM_aspace_implicit_pointer`` operations.
> -
> - .. note::
> -
> - Since location descriptions are allowed on the stack, the
> - ``DW_OP_stack_value`` operation no longer terminates the DWARF operation
> - expression execution as in DWARF Version 5.
> -
> -3. ``DW_OP_implicit_pointer``
> -
> - *An optimizing compiler may eliminate a pointer, while still retaining the
> - value that the pointer addressed.* ``DW_OP_implicit_pointer`` *allows a
> - producer to describe this value.*
> -
> - ``DW_OP_implicit_pointer`` *specifies an object is a pointer to the target
> - architecture default address space that cannot be represented as a real
> - pointer, even though the value it would point to can be described. In this
> - form, the location description specifies a debugging information entry that
> - represents the actual location description of the object to which the
> - pointer would point. Thus, a consumer of the debug information would be able
> - to access the dereferenced pointer, even when it cannot access the pointer
> - itself.*
> -
> - ``DW_OP_implicit_pointer`` has two operands. The first is a 4-byte unsigned
> - value in the 32-bit DWARF format, or an 8-byte unsigned value in the 64-bit
> - DWARF format, that represents a debugging information entry reference R. The
> - second is a signed LEB128 integer that represents a byte displacement B.
> -
> - R is used as the offset of a debugging information entry D in a
> - ``.debug_info`` section, which may be contained in an executable or shared
> - object file other than that containing the operation. For references from one
> - executable or shared object file to another, the relocation must be
> - performed by the consumer.
> -
> - *The first operand interpretation is exactly like that for*
> - ``DW_FORM_ref_addr``\ *.*
> -
> - The address space identifier AS is defined as the one corresponding to the
> - target architecture specific default address space.
> -
> - The address size S is defined as the address bit size of the target
> - architecture specific address space corresponding to AS.
> -
> - An implicit location storage LS is created with the debugging information
> - entry D, address space AS, and size of S.
> -
> - It pushes a location description L that comprises one implicit location
> - description SL on the stack. SL specifies LS with a bit offset of 0.
> -
> - If a ``DW_OP_deref*`` operation pops a location description L', and
> - retrieves S bits where both:
> -
> - 1. All retrieved bits come from an implicit location description that
> - refers to an implicit location storage that is the same as LS.
> -
> - *Note that all bits do not have to come from the same implicit location
> - description, as L' may involve composite location descriptors.*
> -
> - 2. The bits come from consecutive ascending offsets within their respective
> - implicit location storage.
> -
> - *These rules are equivalent to retrieving the complete contents of LS.*
> -
> - Then the value V pushed by the ``DW_OP_deref*`` operation is an implicit
> - pointer value IPV with a target architecture specific address space of AS, a
> - debugging information entry of D, and a base type of T. If AS is the target
> - architecture default address space, then T is the generic type. Otherwise, T
> - is a target architecture specific integral type with a bit size equal to S.
> -
> - Otherwise, if a ``DW_OP_deref*`` operation is applied to a location
> - description such that some retrieved bits come from an implicit location
> - storage that is the same as LS, then the DWARF expression is ill-formed.
> -
> - If IPV is either implicitly converted to a location description (only done
> - if AS is the target architecture default address space) or used by
> - ``DW_OP_LLVM_form_aspace_address`` (only done if the address space specified
> - is AS), then the resulting location description RL is:
> -
> - * If D has a ``DW_AT_location`` attribute, the DWARF expression E from the
> - ``DW_AT_location`` attribute is evaluated as a location description. The
> - current subprogram and current program location of the evaluation context
> - that is accessing IPV is used for the evaluation context of E, together
> - with an empty initial stack. RL is the expression result.
> -
> - * If D has a ``DW_AT_const_value`` attribute, then an implicit location
> - storage RLS is created from the ``DW_AT_const_value`` attribute's value
> - with a size matching the size of the ``DW_AT_const_value`` attribute's
> - value. RL comprises one implicit location description SRL. SRL specifies
> - RLS with a bit offset of 0.
> -
> - .. note::
> -
> - If using ``DW_AT_const_value`` for variables and formal parameters is
> - deprecated and instead ``DW_AT_location`` is used with an implicit
> - location description, then this rule would not be required.
> -
> - * Otherwise the DWARF expression is ill-formed.
> -
> - The bit offset of RL is updated as if the ``DW_OP_LLVM_offset_constu B``
> - operation was applied.
> -
> - If a ``DW_OP_stack_value`` operation pops a value that is the same as IPV,
> - then it pushes a location description that is the same as L.
> -
> - The DWARF expression is ill-formed if it accesses LS or IPV in any other
> - manner.
> -
> - *The restrictions on how an implicit pointer location description created
> - by* ``DW_OP_implicit_pointer`` *and* ``DW_OP_LLVM_aspace_implicit_pointer``
> - *can be used are to simplify the DWARF consumer. Similarly, for an implicit
> - pointer value created by* ``DW_OP_deref*`` *and* ``DW_OP_stack_value``\ *.*
> -
> -4. ``DW_OP_LLVM_aspace_implicit_pointer`` *New*
> -
> - ``DW_OP_LLVM_aspace_implicit_pointer`` has two operands that are the same as
> - for ``DW_OP_implicit_pointer``.
> -
> - It pops one stack entry that must be an integral type value that represents
> - a target architecture specific address space identifier AS.
> -
> - The location description L that is pushed on the stack is the same as for
> - ``DW_OP_implicit_pointer`` except that the address space identifier used is
> - AS.
> -
> - The DWARF expression is ill-formed if AS is not one of the values defined by
> - the target architecture specific ``DW_ASPACE_*`` values.
> -
> -*Typically a* ``DW_OP_implicit_pointer`` *or*
> -``DW_OP_LLVM_aspace_implicit_pointer`` *operation is used in a DWARF expression
> -E*\ :sub:`1` *of a* ``DW_TAG_variable`` *or* ``DW_TAG_formal_parameter``
> -*debugging information entry D*\ :sub:`1`\ *'s* ``DW_AT_location`` *attribute.
> -The debugging information entry referenced by the* ``DW_OP_implicit_pointer``
> -*or* ``DW_OP_LLVM_aspace_implicit_pointer`` *operations is typically itself a*
> -``DW_TAG_variable`` *or* ``DW_TAG_formal_parameter`` *debugging information
> -entry D*\ :sub:`2` *whose* ``DW_AT_location`` *attribute gives a second DWARF
> -expression E*\ :sub:`2`\ *.*
> -
> -*D*\ :sub:`1` *and E*\ :sub:`1` *are describing the location of a pointer type
> -object. D*\ :sub:`2` *and E*\ :sub:`2` *are describing the location of the
> -object pointed to by that pointer object.*
> -
> -*However, D*\ :sub:`2` *may be any debugging information entry that contains a*
> -``DW_AT_location`` *or* ``DW_AT_const_value`` *attribute (for example,*
> -``DW_TAG_dwarf_procedure``\ *). By using E*\ :sub:`2`\ *, a consumer can
> -reconstruct the value of the object when asked to dereference the pointer
> -described by E*\ :sub:`1` *which contains the* ``DW_OP_implicit_pointer`` *or*
> -``DW_OP_LLVM_aspace_implicit_pointer`` *operation.*
> -
> -Composite Location Description Operations
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -A composite location storage represents an object or value which may be
> -contained in part of another location storage or contained in parts of more
> -than one location storage.
> -
> -Each part has a part location description L and a part bit size S. L can have
> -one or more single location descriptions SL. If there is more than one SL, then
> -the part is located in more than one place. The bits of each
> -place of the part comprise S contiguous bits from the location storage LS
> -specified by SL starting at the bit offset specified by SL. All the bits must
> -be within the size of LS or the DWARF expression is ill-formed.
> -
> -A composite location storage can have zero or more parts. The parts are
> -contiguous such that the zero-based location storage bit index will range over
> -each part with no gaps between them. Therefore, the size of a composite location
> -storage is the sum of the size of its parts. The DWARF expression is ill-formed
> -if the size of the contiguous location storage is larger than the size of the
> -memory location storage corresponding to the largest target architecture
> -specific address space.
> -
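A small sketch of the size rule quoted above, in case it helps review: the
parts are laid end to end, so the size of the composite storage is the sum of
the part sizes and a zero-based bit index resolves to exactly one part. The
names `Part`, `composite_bit_size` and `find_part` are hypothetical and are not
taken from the proposal; a location description is reduced to a plain string.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Part:
    location: str  # hypothetical stand-in for the part location description L
    bit_size: int  # the part bit size S


def composite_bit_size(parts: List[Part]) -> int:
    """Size of a composite location storage: the sum of its part sizes."""
    return sum(part.bit_size for part in parts)


def find_part(parts: List[Part], bit_index: int) -> Tuple[Part, int]:
    """Map a zero-based bit index to (part, bit offset within that part)."""
    if bit_index < 0:
        raise IndexError("bit index must be non-negative")
    start = 0
    for part in parts:
        if bit_index < start + part.bit_size:
            return part, bit_index - start
        start += part.bit_size
    raise IndexError("bit index is past the end of the composite storage")
```

With a 32-bit part followed by a 16-bit part, bit index 40 falls 8 bits into
the second part. A composite with zero parts has size 0.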
> -A composite location description specifies a composite location storage. The bit
> -offset corresponds to a bit position within the composite location storage.
> -
> -There are operations that create a composite location storage.
> -
> -There are other operations that allow a composite location storage to be
> -incrementally created. Each part is created by a separate operation. There may
> -be one or more operations to create the final composite location storage. A
> -series of such operations describes the parts of the composite location storage
> -in the order in which the associated part operations are executed.
> -
> -To support incremental creation, a composite location storage can be in an
> -incomplete state. When an incremental operation operates on an incomplete
> -composite location storage, it adds a new part, otherwise it creates a new
> -composite location storage. The ``DW_OP_LLVM_piece_end`` operation explicitly
> -makes an incomplete composite location storage complete.
> -
> -A composite location description that specifies a composite location storage
> -that is incomplete is termed an incomplete composite location description. A
> -composite location description that specifies a composite location storage that
> -is complete is termed a complete composite location description.
> -
> -If the top stack entry is a location description that has one incomplete
> -composite location description SL after the execution of an operation expression
> -has completed, SL is converted to a complete composite location description.
> -
> -*Note that this conversion does not happen after the completion of an operation
> -expression that is evaluated on the same stack by the* ``DW_OP_call*``
> -*operations. Such executions are not a separate evaluation of an operation
> -expression, but rather the continued evaluation of the same operation expression
> -that contains the* ``DW_OP_call*`` *operation.*
> -
> -If a stack entry is required to be a location description L, but L has an
> -incomplete composite location description, then the DWARF expression is
> -ill-formed. The exception is for the operations involved in incrementally
> -creating a composite location description as described below.
> -
> -*Note that a DWARF operation expression may arbitrarily compose composite
> -location descriptions from any other location description, including those that
> -have multiple single location descriptions, and those that have composite
> -location descriptions.*
> -
> -*The incremental composite location description operations are defined to be
> -compatible with the definitions in DWARF Version 5.*
> -
> -1. ``DW_OP_piece``
> -
> - ``DW_OP_piece`` has a single unsigned LEB128 integer that represents a byte
> - size S.
> -
> - The action is based on the context:
> -
> - * If the stack is empty, then a location description L comprised of one
> - incomplete composite location description SL is pushed on the stack.
> -
> - An incomplete composite location storage LS is created with a single part
> - P. P specifies a location description PL and has a bit size of S scaled by
> - 8 (the byte size). PL is comprised of one undefined location description
> - PSL.
> -
> - SL specifies LS with a bit offset of 0.
> -
> - * Otherwise, if the top stack entry is a location description L comprised of
> - one incomplete composite location description SL, then the incomplete
> - composite location storage LS that SL specifies is updated to append a new
> - part P. P specifies a location description PL and has a bit size of S
> - scaled by 8 (the byte size). PL is comprised of one undefined location
> - description PSL. L is left on the stack.
> -
> - * Otherwise, if the top stack entry is a location description or can be
> - converted to one, then it is popped and treated as a part location
> - description PL. Then:
> -
> - * If the top stack entry (after popping PL) is a location description L
> - comprised of one incomplete composite location description SL, then the
> - incomplete composite location storage LS that SL specifies is updated to
> - append a new part P. P specifies the location description PL and has a
> - bit size of S scaled by 8 (the byte size). L is left on the stack.
> -
> - * Otherwise, a location description L comprised of one incomplete
> - composite location description SL is pushed on the stack.
> -
> - An incomplete composite location storage LS is created with a single
> - part P. P specifies the location description PL and has a bit size of S
> - scaled by 8 (the byte size).
> -
> - SL specifies LS with a bit offset of 0.
> -
> - * Otherwise, the DWARF expression is ill-formed.
> -
> - *Many compilers store a single variable in sets of registers or store a
> - variable partially in memory and partially in registers.* ``DW_OP_piece``
> - *provides a way of describing where a part of a variable is located.*
> -
> - *If a non-0 byte displacement is required, the* ``DW_OP_LLVM_offset``
> - *operation can be used to update the location description before using it as
> - the part location description of a* ``DW_OP_piece`` *operation.*
> -
> - *The evaluation rules for the* ``DW_OP_piece`` *operation allow it to be
> - compatible with the DWARF Version 5 definition.*
> -
> - .. note::
> -
> - Since this proposal allows location descriptions to be entries on the
> - stack, a simpler operation to create composite location descriptions could
> - be defined. For example, just one operation that specifies how many parts,
> - and pops pairs
> - of stack entries for the part size and location description. Not only
> - would this be a simpler operation and avoid the complexities of incomplete
> - composite location descriptions, but it may also have a smaller encoding
> - in practice. However, the desire for compatibility with DWARF Version 5 is
> - likely a stronger consideration.
> -
> -2. ``DW_OP_bit_piece``
> -
> - ``DW_OP_bit_piece`` has two operands. The first is an unsigned LEB128
> - integer that represents the part bit size S. The second is an unsigned
> - LEB128 integer that represents a bit displacement B.
> -
> - The action is the same as for ``DW_OP_piece`` except that any part created
> - has the bit size S, and the location description PL of any created part is
> - updated as if the ``DW_OP_constu B; DW_OP_LLVM_bit_offset`` operations were
> - applied.
> -
> - ``DW_OP_bit_piece`` *is used instead of* ``DW_OP_piece`` *when the piece to
> - be assembled is not byte-sized or is not at the start of the part location
> - description.*
> -
> - *If a computed bit displacement is required, the* ``DW_OP_LLVM_bit_offset``
> - *operation can be used to update the location description before using it as
> - the part location description of a* ``DW_OP_bit_piece`` *operation.*
> -
> - .. note::
> -
> - The bit offset operand is not needed as ``DW_OP_LLVM_bit_offset`` can be
> - used on the part's location description.
> -
> -3. ``DW_OP_LLVM_piece_end`` *New*
> -
> - If the top stack entry is not a location description L comprised of one
> - incomplete composite location description SL, then the DWARF expression is
> - ill-formed.
> -
> - Otherwise, the incomplete composite location storage LS specified by SL is
> - updated to be a complete composite location storage with the same parts.
> -
> -4. ``DW_OP_LLVM_extend`` *New*
> -
> - ``DW_OP_LLVM_extend`` has two operands. The first is an unsigned LEB128
> - integer that represents the element bit size S. The second is an unsigned
> - LEB128 integer that represents a count C.
> -
> - It pops one stack entry that must be a location description and is treated
> - as the part location description PL.
> -
> - A location description L comprised of one complete composite location
> - description SL is pushed on the stack.
> -
> - A complete composite location storage LS is created with C identical parts
> - P. Each P specifies PL and has a bit size of S.
> -
> - SL specifies LS with a bit offset of 0.
> -
> - The DWARF expression is ill-formed if the element bit size or count is 0.
> -
> -5. ``DW_OP_LLVM_select_bit_piece`` *New*
> -
> - ``DW_OP_LLVM_select_bit_piece`` has two operands. The first is an unsigned
> - LEB128 integer that represents the element bit size S. The second is an
> - unsigned LEB128 integer that represents a count C.
> -
> - It pops three stack entries. The first must be an integral type value that
> - represents a bit mask value M. The second must be a location description
> - that represents the one-location description L1. The third must be a
> - location description that represents the zero-location description L0.
> -
> - A complete composite location storage LS is created with C parts P\ :sub:`N`
> - ordered in ascending N from 0 to C-1 inclusive. Each P\ :sub:`N` specifies
> - location description PL\ :sub:`N` and has a bit size of S.
> -
> - PL\ :sub:`N` is as if the ``DW_OP_LLVM_bit_offset N*S`` operation was
> - applied to PLX\ :sub:`N`\ .
> -
> - PLX\ :sub:`N` is the same as L0 if the N\ :sup:`th` least significant bit of
> - M is a zero, otherwise it is the same as L1.
> -
> - A location description L comprised of one complete composite location
> - description SL is pushed on the stack. SL specifies LS with a bit offset of
> - 0.
> -
> - The DWARF expression is ill-formed if S or C is 0, or if the bit size of M
> - is less than C.
> -
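To check my reading of the incremental rules quoted above, here is a minimal
model of ``DW_OP_piece``, ``DW_OP_LLVM_piece_end`` and
``DW_OP_LLVM_select_bit_piece``. It is a sketch under assumptions, not the
proposal's definition: the stack is a Python list, a location description is
any Python object (a string here), every non-composite entry is assumed to be
convertible to a location description, and the check on the bit size of M is
omitted because a Python integer has no fixed size. All function and class
names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple

UNDEFINED = "undefined"  # stand-in for an undefined location description


@dataclass
class Composite:
    parts: List[Tuple[Any, int]] = field(default_factory=list)  # (PL, bit size)
    complete: bool = False


def _is_incomplete(entry: Any) -> bool:
    return isinstance(entry, Composite) and not entry.complete


def op_piece(stack: List[Any], byte_size: int) -> None:
    bits = byte_size * 8  # S scaled by 8
    if not stack:
        # Empty stack: new incomplete composite with one undefined part.
        stack.append(Composite([(UNDEFINED, bits)]))
    elif _is_incomplete(stack[-1]):
        # Top is an incomplete composite: append an undefined part.
        stack[-1].parts.append((UNDEFINED, bits))
    else:
        # Top is treated as the part location description PL.
        part_location = stack.pop()
        if stack and _is_incomplete(stack[-1]):
            stack[-1].parts.append((part_location, bits))
        else:
            stack.append(Composite([(part_location, bits)]))


def op_piece_end(stack: List[Any]) -> None:
    if not stack or not _is_incomplete(stack[-1]):
        raise ValueError("ill-formed: no incomplete composite on top of stack")
    stack[-1].complete = True


def op_select_bit_piece(stack: List[Any], bit_size: int, count: int) -> None:
    if bit_size == 0 or count == 0 or len(stack) < 3:
        raise ValueError("ill-formed DW_OP_LLVM_select_bit_piece")
    mask = stack.pop()           # M
    one_location = stack.pop()   # L1
    zero_location = stack.pop()  # L0
    parts = []
    for n in range(count):
        chosen = one_location if (mask >> n) & 1 else zero_location
        # Model "DW_OP_LLVM_bit_offset N*S" as a (location, bit offset) pair.
        parts.append(((chosen, n * bit_size), bit_size))
    stack.append(Composite(parts, complete=True))
```

Under this model the DWARF Version 5 style sequence "push reg3, piece 4, push
reg10, piece 2" builds a single incomplete composite with a 32-bit and a 16-bit
part, which is the compatibility the quoted text aims for.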
> -.. _amdgpu-dwarf-location-list-expressions:
> -
> -DWARF Location List Expressions
> -+++++++++++++++++++++++++++++++
> -
> -*To meet the needs of recent computer architectures and optimization techniques,
> -debugging information must be able to describe the location of an object whose
> -location changes over the object’s lifetime, and may reside at multiple
> -locations during parts of an object's lifetime. Location list expressions are
> -used in place of operation expressions whenever the object whose location is
> -being described has these requirements.*
> -
> -A location list expression consists of a series of location list entries. Each
> -location list entry is one of the following kinds:
> -
> -*Bounded location description*
> -
> - This kind of location list entry provides an operation expression that
> - evaluates to the location description of an object that is valid over a
> - lifetime bounded by a starting and ending address. The starting address is the
> - lowest address of the address range over which the location is valid. The
> - ending address is the address of the first location past the highest address
> - of the address range.
> -
> - The location list entry matches when the current program location is within
> - the given range.
> -
> - There are several kinds of bounded location description entries which differ
> - in the way that they specify the starting and ending addresses.
> -
> -*Default location description*
> -
> - This kind of location list entry provides an operation expression that
> - evaluates to the location description of an object that is valid when no
> - bounded location description entry applies.
> -
> - The location list entry matches when the current program location is not
> - within the range of any bounded location description entry.
> -
> -*Base address*
> -
> - This kind of location list entry provides an address to be used as the base
> - address for beginning and ending address offsets given in certain kinds of
> - bounded location description entries. The applicable base address of a bounded
> - location description entry is the address specified by the closest preceding
> - base address entry in the same location list. If there is no preceding base
> - address entry, then the applicable base address defaults to the base address
> - of the compilation unit (see DWARF Version 5 section 3.1.1).
> -
> - In the case of a compilation unit where all of the machine code is contained
> - in a single contiguous section, no base address entry is needed.
> -
> -*End-of-list*
> -
> - This kind of location list entry marks the end of the location list
> - expression.
> -
> -The address ranges defined by the bounded location description entries of a
> -location list expression may overlap. When they do, they describe a situation in
> -which an object exists simultaneously in more than one place.
> -
> -If all of the address ranges in a given location list expression do not
> -collectively cover the entire range over which the object in question is
> -defined, and there is no following default location description entry, it is
> -assumed that the object is not available for the portion of the range that is
> -not covered.
> -
> -The operation expression of each matching location list entry is evaluated as a
> -location description and its result is returned as the result of the location
> -list entry. The operation expression is evaluated with the same context as the
> -location list expression, including the same current frame, current program
> -location, and initial stack.
> -
> -The result of the evaluation of a DWARF location list expression is a location
> -description that is comprised of the union of the single location descriptions
> -of the location description result of each matching location list entry. If
> -there are no matching location list entries, then the result is a location
> -description that comprises one undefined location description.
> -
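The matching and union rules quoted above can be sketched as follows. This is
an illustration under assumptions: each entry's operation expression is
replaced by the list of single location descriptions it would produce
(strings here), base address and end-of-list entries are left out, and the
names `Entry` and `evaluate_location_list` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    locations: List[str]         # single location descriptions of the result
    start: Optional[int] = None  # None for a default location description
    end: Optional[int] = None    # first address past the range (exclusive)


def evaluate_location_list(entries: List[Entry], pc: int) -> List[str]:
    """Return the union of the results of all matching entries at pc."""
    bounded = [e for e in entries if e.start is not None]
    matching = [e for e in bounded if e.start <= pc < e.end]
    if not matching:
        # Default entries match only when no bounded entry covers pc.
        matching = [e for e in entries if e.start is None]
    result: List[str] = []
    for entry in matching:
        for location in entry.locations:
            if location not in result:
                result.append(location)
    # No matching entry at all: one undefined location description.
    return result or ["undefined"]
```

Overlapping bounded ranges therefore yield more than one single location
description, which is how an object that lives in two places at once is
reported.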
> -A location list expression can only be used as the value of a debugger
> -information entry attribute that is encoded using class ``loclist`` or
> -``loclistsptr`` (see DWARF Version 5 section 7.5.5). The value of the attribute
> -provides an index into a separate object file section called ``.debug_loclists``
> -or ``.debug_loclists.dwo`` (for split DWARF object files) that contains the
> -location list entries.
> -
> -The ``DW_OP_call*`` and ``DW_OP_implicit_pointer`` operations can be used to
> -specify a debugger information entry attribute that has a location list
> -expression. Several debugger information entry attributes allow DWARF
> -expressions that are evaluated with an initial stack that includes a location
> -description that may originate from the evaluation of a location list
> -expression.
> -
> -*This location list representation, the* ``loclist`` *and* ``loclistsptr``
> -*class, and the related* ``DW_AT_loclists_base`` *attribute are new in DWARF
> -Version 5. Together they eliminate most, or all of the code object relocations
> -previously needed for location list expressions.*
> -
> -.. note::
> -
> - The rest of this section is the same as DWARF Version 5 section 2.6.2.
> -
> -.. _amdgpu-dwarf-segment_addresses:
> -
> -Segmented Addresses
> -~~~~~~~~~~~~~~~~~~~
> -
> -.. note::
> -
> - This augments DWARF Version 5 section 2.12.
> -
> -DWARF address classes are used for source languages that have the concept of
> -memory spaces. They are used in the ``DW_AT_address_class`` attribute for
> -pointer type, reference type, subprogram, and subprogram type debugger
> -information entries.
> -
> -Each DWARF address class is conceptually a separate source language memory space
> -with its own lifetime and aliasing rules. DWARF address classes are used to
> -specify the source language memory spaces to which pointer type and reference
> -type values refer, and to specify the source language memory space in which
> -variables
> -are allocated.
> -
> -The set of currently defined source language DWARF address classes, together
> -with source language mappings, is given in
> -:ref:`amdgpu-dwarf-address-class-table`.
> -
> -Vendor defined source language address classes may be defined using codes in the
> -range ``DW_ADDR_LLVM_lo_user`` to ``DW_ADDR_LLVM_hi_user``.
> -
> -.. table:: Address class
> - :name: amdgpu-dwarf-address-class-table
> -
> - ========================= ============ ========= ========= =========
> - Address Class Name        Meaning      C/C++     OpenCL    CUDA/HIP
> - ========================= ============ ========= ========= =========
> - ``DW_ADDR_none``          generic      *default* generic   *default*
> - ``DW_ADDR_LLVM_global``   global                 global
> - ``DW_ADDR_LLVM_constant`` constant               constant  constant
> - ``DW_ADDR_LLVM_group``    thread-group           local     shared
> - ``DW_ADDR_LLVM_private``  thread                 private
> - ``DW_ADDR_LLVM_lo_user``
> - ``DW_ADDR_LLVM_hi_user``
> - ========================= ============ ========= ========= =========
> -
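For reference, the language columns of the table above can be read as a lookup
from a source language memory space to an address class name. The sketch below
assumes the column layout OpenCL `global`/`constant`/`local`/`private` and
CUDA/HIP `constant`/`shared`, since the archive has collapsed the column
alignment. The function name `address_class` is hypothetical, and no numeric
encodings are given because the quoted text does not state them.

```python
# Keys are (language, memory space keyword); "default" means no qualifier.
ADDRESS_CLASS = {
    ("C/C++", "default"): "DW_ADDR_none",
    ("OpenCL", "generic"): "DW_ADDR_none",
    ("OpenCL", "global"): "DW_ADDR_LLVM_global",
    ("OpenCL", "constant"): "DW_ADDR_LLVM_constant",
    ("OpenCL", "local"): "DW_ADDR_LLVM_group",
    ("OpenCL", "private"): "DW_ADDR_LLVM_private",
    ("CUDA/HIP", "default"): "DW_ADDR_none",
    ("CUDA/HIP", "constant"): "DW_ADDR_LLVM_constant",
    ("CUDA/HIP", "shared"): "DW_ADDR_LLVM_group",
}


def address_class(language: str, memory_space: str = "default") -> str:
    """Look up the address class name for a source language memory space."""
    try:
        return ADDRESS_CLASS[(language, memory_space)]
    except KeyError:
        raise KeyError(f"no address class listed for {language} {memory_space}")
```

Note that OpenCL `local` and CUDA/HIP `shared` both map to
``DW_ADDR_LLVM_group``.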
> -DWARF address spaces correspond to target architecture specific linear
> -addressable memory areas. They are used in DWARF expression location
> -descriptions to describe in which target architecture specific memory area data
> -resides.
> -
> -*Target architecture specific DWARF address spaces may correspond to hardware
> -supported facilities such as memory utilizing base address registers, scratchpad
> -memory, and memory with special interleaving. The size of addresses in these
> -address spaces may vary. Their access and allocation may be hardware managed
> -with each thread or group of threads having access to independent storage. For
> -these reasons they may have properties that do not allow them to be viewed as
> -part of the unified global virtual address space accessible by all threads.*
> -
> -*It is target architecture specific whether multiple DWARF address spaces are
> -supported and how source language DWARF address classes map to target
> -architecture specific DWARF address spaces. A target architecture may map
> -multiple source language DWARF address classes to the same target architecture
> -specific DWARF address space. Optimization may determine that variable lifetime
> -and access pattern allows them to be allocated in faster scratchpad memory
> -represented by a different DWARF address space.*
> -
> -Although DWARF address space identifiers are target architecture specific,
> -``DW_ASPACE_none`` is a common address space supported by all target
> -architectures.
> -
> -DWARF address space identifiers are used by:
> -
> -* The DWARF expression operations: ``DW_OP_LLVM_aspace_bregx``,
> - ``DW_OP_LLVM_form_aspace_address``, ``DW_OP_LLVM_aspace_implicit_pointer``,
> - and ``DW_OP_xderef*``.
> -
> -* The CFI instructions: ``DW_CFA_def_aspace_cfa`` and
> - ``DW_CFA_def_aspace_cfa_sf``.
> -
> -.. note::
> -
> - With the definition of DWARF address classes and DWARF address spaces in this
> - proposal, DWARF Version 5 table 2.7 needs to be updated. It seems it is an
> - example of DWARF address spaces and not DWARF address classes.
> -
> -.. note::
> -
> - With the expanded support for DWARF address spaces in this proposal, it may be
> - worth examining if DWARF segments can be eliminated and DWARF address spaces
> - used instead.
> -
> - That may involve extending DWARF address spaces to also be used to specify
> - code locations. In target architectures that use different memory areas for
> - code and data this would seem a natural use for DWARF address spaces. This
> - would allow DWARF expression location descriptions to be used to describe the
> - location of subprograms and entry points that are used in expressions
> - involving subprogram pointer type values.
> -
> - Currently, DWARF expressions assume data and code reside in the same default
> - DWARF address space, and only the address ranges in DWARF location list
> - entries and in the ``.debug_aranges`` section for accelerated access for
> - addresses allow DWARF segments to be used to distinguish them.
> -
> -.. note::
> -
> - Currently, DWARF defines address class values as being target architecture
> - specific. It is unclear how language specific memory spaces are intended to be
> - represented in DWARF using these.
> -
> - For example, OpenCL defines memory spaces (called address spaces in OpenCL)
> - for ``global``, ``local``, ``constant``, and ``private``. These are part of
> - the type system and are modifiers to pointer types. In addition, OpenCL
> - defines ``generic`` pointers that can reference either the ``global``,
> - ``local``, or ``private`` memory spaces. To support the OpenCL language the
> - debugger would want to support casting pointers between the ``generic`` and
> - other memory spaces, querying what memory space a ``generic`` pointer value is
> - currently referencing, and possibly using pointer casting to form an address
> - for a specific memory space out of an integral value.
> -
> - The method to use to dereference a pointer type or reference type value is
> - defined in DWARF expressions using ``DW_OP_xderef*`` which uses a target
> - architecture specific address space.
> -
> - DWARF defines the ``DW_AT_address_class`` attribute on pointer type and
> - reference type debugger information entries. It specifies the method to use to
> - dereference them. Why is the value of this not the same as the address space
> - value used in ``DW_OP_xderef*``? In both cases it is target architecture
> - specific and the architecture presumably will use the same set of methods to
> - dereference pointers in both cases.
> -
> - Since ``DW_AT_address_class`` uses a target architecture specific value, it
> - cannot in general capture the source language memory space type modifier
> - concept. On some architectures all source language memory space modifiers may
> - actually use the same method for dereferencing pointers.
> -
> - One possibility is for DWARF to add a ``DW_TAG_LLVM_address_class_type``
> - debugger information entry type modifier that can be applied to a pointer type
> - and reference type. The ``DW_AT_address_class`` attribute could be re-defined
> - to not be target architecture specific and instead define generalized language
> - values (as is proposed above for DWARF address classes in the table
> - :ref:`amdgpu-dwarf-address-class-table`) that will support OpenCL and other
> - languages using memory spaces. The ``DW_AT_address_class`` attribute could be
> - defined to not be applied to pointer types or reference types, but instead
> - only to the new ``DW_TAG_LLVM_address_class_type`` type modifier debugger
> - information entry.
> -
> - If a pointer type or reference type is not modified by
> - ``DW_TAG_LLVM_address_class_type`` or if ``DW_TAG_LLVM_address_class_type``
> - has no ``DW_AT_address_class`` attribute, then the pointer type or reference
> - type would be defined to use the ``DW_ADDR_none`` address class as currently.
> - Since modifiers can be chained, it would need to be defined if multiple
> - ``DW_TAG_LLVM_address_class_type`` modifiers were legal, and if so if the
> - outermost one is the one that takes precedence.
> -
> - A target architecture implementation that supports multiple address spaces
> - would need to map ``DW_ADDR_none`` appropriately to support CUDA-like
> - languages that have no address classes in the type system but do support
> - variable allocation in address classes. Such variable allocation would result
> - in the variable's location description needing an address space.
> -
> - The approach proposed in :ref:`amdgpu-dwarf-address-class-table` is to define
> - the default ``DW_ADDR_none`` to be the generic address class and not the
> - global address class. This matches how CLANG and LLVM have added support for
> - CUDA-like languages on top of existing C++ language support. This allows all
> - addresses to be generic by default which matches CUDA-like languages.
> -
> - An alternative approach is to define ``DW_ADDR_none`` as being the global
> - address class and then change ``DW_ADDR_LLVM_global`` to
> - ``DW_ADDR_LLVM_generic``. This would match the reality that languages that do
> - not support multiple memory spaces only have one default global memory space.
> - Generally, in these languages if they expose that the target architecture
> - supports multiple address spaces, the default one is still the global memory
> - space. Then a language that does support multiple memory spaces has to
> - explicitly indicate which pointers have the added ability to reference more
> - than the global memory space. However, compilers generating DWARF for
> - CUDA-like languages would then have to define every CUDA-like language pointer
> - type or reference type using ``DW_TAG_LLVM_address_class_type`` with a
> - ``DW_AT_address_class`` attribute of ``DW_ADDR_LLVM_generic`` to match the
> - language semantics.
> -
> - A new ``DW_AT_LLVM_address_space`` attribute could be defined that can be
> - applied to pointer type, reference type, subprogram, and subprogram type to
> - describe how objects having the given type are dereferenced or called (the
> - role that ``DW_AT_address_class`` currently provides). The values of
> - ``DW_AT_LLVM_address_space`` would be target architecture specific and the same as
> - used in ``DW_OP_xderef*``.
> -
> -.. _amdgpu-dwarf-debugging-information-entry-attributes:
> -
> -Debugging Information Entry Attributes
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> -
> -.. note::
> -
> - This section provides changes to existing debugger information entry
> - attributes and defines attributes added by the proposal. These would be
> - incorporated into the appropriate DWARF Version 5 chapter 2 sections.
> -
> -1. ``DW_AT_location``
> -
> - Any debugging information entry describing a data object (which includes
> - variables and parameters) or common blocks may have a ``DW_AT_location``
> - attribute, whose value is a DWARF expression E.
> -
> - The result of the attribute is obtained by evaluating E as a location
> - description in the context of the current subprogram, current program
> - location, and with an empty initial stack. See
> - :ref:`amdgpu-dwarf-expressions`.
> -
> - See :ref:`amdgpu-dwarf-control-flow-operations` for special evaluation rules
> - used by the ``DW_OP_call*`` operations.
> -
> - .. note::
> -
> - Delete the description of how the ``DW_OP_call*`` operations evaluate a
> - ``DW_AT_location`` attribute as that is now described in the operations.
> -
> - .. note::
> -
> - See the discussion about the ``DW_AT_location`` attribute in the
> - ``DW_OP_call*`` operation. Having each attribute only have a single
> - purpose and single execution semantics seems desirable. It makes it easier
> - for the consumer, which no longer has to track the context. It makes it
> - easier for the producer as it can rely on a single semantics for each
> - attribute.
> -
> - For that reason, limiting the ``DW_AT_location`` attribute to only
> - supporting evaluating the location description of an object, and using a
> - different attribute and encoding class for the evaluation of DWARF
> - expression *procedures* on the same operation expression stack seems
> - desirable.
> -
> -2. ``DW_AT_const_value``
> -
> - .. note::
> -
> - Could deprecate using the ``DW_AT_const_value`` attribute for
> - ``DW_TAG_variable`` or ``DW_TAG_formal_parameter`` debugger information
> - entries that have been optimized to a constant. Instead,
> - ``DW_AT_location`` could be used with a DWARF expression that produces an
> - implicit location description now that any location description can be
> - used within a DWARF expression. This allows the ``DW_OP_call*`` operations
> - to be used to push the location description of any variable regardless of
> - how it is optimized.
> -
> -3. ``DW_AT_frame_base``
> -
> - A ``DW_TAG_subprogram`` or ``DW_TAG_entry_point`` debugger information entry
> - may have a ``DW_AT_frame_base`` attribute, whose value is a DWARF expression
> - E.
> -
> - The result of the attribute is obtained by evaluating E as a location
> - description in the context of the current subprogram, current program
> - location, and with an empty initial stack.
> -
> - The DWARF is ill-formed if E contains a ``DW_OP_fbreg`` operation, or the
> - resulting location description L is not comprised of one single location
> - description SL.
> -
> - If SL is a register location description for register R, then L is replaced
> - with the result of evaluating a ``DW_OP_bregx R, 0`` operation. This
> - computes the frame base memory location description in the target
> - architecture default address space.
> -
> - *This allows the more compact* ``DW_OP_reg*`` *to be used instead of*
> - ``DW_OP_breg* 0``\ *.*
> -
> - .. note::
> -
> - This rule could be removed and require the producer to create the required
> - location description directly using ``DW_OP_call_frame_cfa``,
> - ``DW_OP_breg*``, or ``DW_OP_LLVM_aspace_bregx``. This would also then
> - allow a target to implement the call frames within a large register.
> -
> - Otherwise, the DWARF is ill-formed if SL is not a memory location
> - description in any of the target architecture specific address spaces.
> -
> - The resulting L is the *frame base* for the subprogram or entry point.
> -
> - *Typically, E will use the* ``DW_OP_call_frame_cfa`` *operation or be a
> - stack pointer register plus or minus some offset.*
> -
> -4. ``DW_AT_data_member_location``
> -
> - For a ``DW_AT_data_member_location`` attribute there are two cases:
> -
> - 1. If the attribute is an integer constant B, it provides the offset in
> - bytes from the beginning of the containing entity.
> -
> - The result of the attribute is obtained by evaluating a
> - ``DW_OP_LLVM_offset B`` operation with an initial stack comprising the
> - location description of the beginning of the containing entity. The
> - result of the evaluation is the location description of the base of the
> - member entry.
> -
> - *If the beginning of the containing entity is not byte aligned, then the
> - beginning of the member entry has the same bit displacement within a
> - byte.*
> -
> - 2. Otherwise, the attribute must be a DWARF expression E which is evaluated
> - with a context of the current frame, current program location, and an
> - initial stack comprising the location description of the beginning of
> - the containing entity. The result of the evaluation is the location
> - description of the base of the member entry.
> -
> - .. note::
> -
> - The beginning of the containing entity can now be any location
> - description, including those with more than one single location
> - description, and those with single location descriptions that are of any
> - kind and have any bit offset.
> -
> -5. ``DW_AT_use_location``
> -
> - The ``DW_TAG_ptr_to_member_type`` debugging information entry has a
> - ``DW_AT_use_location`` attribute whose value is a DWARF expression E. It is
> - used to compute the location description of the member of the class to which
> - the pointer to member entry points.
> -
> - *The method used to find the location description of a given member of a
> - class, structure, or union is common to any instance of that class,
> - structure, or union and to any instance of the pointer to member type. The
> - method is thus associated with the pointer to member type, rather than with
> - each object that has a pointer to member type.*
> -
> - The ``DW_AT_use_location`` DWARF expression is used in conjunction with the
> - location description for a particular object of the given pointer to member
> - type and for a particular structure or class instance.
> -
> - The result of the attribute is obtained by evaluating E as a location
> - description with the context of the current subprogram, current program
> - location, and an initial stack comprising two entries. The first entry is
> - the value of the pointer to member object itself. The second entry is the
> - location description of the base of the entire class, structure, or union
> - instance containing the member whose location is being calculated.
> -
> -6. ``DW_AT_data_location``
> -
> - The ``DW_AT_data_location`` attribute may be used with any type that
> - provides one or more levels of hidden indirection and/or run-time parameters
> - in its representation. Its value is a DWARF operation expression E which
> - computes the location description of the data for an object. When this
> - attribute is omitted, the location description of the data is the same as
> - the location description of the object.
> -
> - The result of the attribute is obtained by evaluating E as a location
> - description with the context of the current subprogram, current program
> - location, and an empty initial stack.
> -
> - *E will typically involve an operation expression that begins with a*
> - ``DW_OP_push_object_address`` *operation which loads the location
> - description of the object which can then serve as a description in
> - subsequent calculation.*
> -
> - .. note::
> -
> - Since ``DW_AT_data_member_location``, ``DW_AT_use_location``, and
> - ``DW_AT_vtable_elem_location`` allow both operation expressions and
> - location list expressions, why does ``DW_AT_data_location`` not allow
> - both? In all cases they apply to data objects so less likely that
> - optimization would cause different operation expressions for different
> - program location ranges. But if supporting for some then should be for
> - all.
> -
> - It seems odd that this attribute, unlike
> - ``DW_AT_data_member_location``, does not have an initial stack with the
> - location description of the object, since the expression needs it.
> -
> -7. ``DW_AT_vtable_elem_location``
> -
> - An entry for a virtual function also has a ``DW_AT_vtable_elem_location``
> - attribute whose value is a DWARF expression E.
> -
> - The result of the attribute is obtained by evaluating E as a location
> - description with the context of the current subprogram, current program
> - location, and an initial stack comprising the location description of the
> - object of the enclosing type.
> -
> - The resulting location description is the slot for the function within the
> - virtual function table for the enclosing class.
> -
> -8. ``DW_AT_static_link``
> -
> - If a ``DW_TAG_subprogram`` or ``DW_TAG_entry_point`` debugger information
> - entry is lexically nested, it may have a ``DW_AT_static_link`` attribute,
> - whose value is a DWARF expression E.
> -
> - The result of the attribute is obtained by evaluating E as a location
> - description with the context of the current subprogram, current program
> - location, and an empty initial stack.
> -
> - The DWARF is ill-formed if the resulting location description L is not
> - comprised of one memory location description in any of the target
> - architecture specific address spaces.
> -
> - The resulting L is the *frame base* of the relevant instance of the
> - subprogram that immediately lexically encloses the subprogram or entry
> - point.
> -
> -9. ``DW_AT_return_addr``
> -
> - A ``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or
> - ``DW_TAG_entry_point`` debugger information entry may have a
> - ``DW_AT_return_addr`` attribute, whose value is a DWARF expression E.
> -
> - The result of the attribute is obtained by evaluating E as a location
> - description with the context of the current subprogram, current program
> - location, and an empty initial stack.
> -
> - The DWARF is ill-formed if the resulting location description L is not
> - comprised of one memory location description in any of the target
> - architecture specific address spaces.
> -
> - The resulting L is the place where the return address for the subprogram or
> - entry point is stored.
> -
> - .. note::
> -
> - It is unclear why ``DW_TAG_inlined_subroutine`` has a
> - ``DW_AT_return_addr`` attribute but not a ``DW_AT_frame_base`` or
> - ``DW_AT_static_link`` attribute. Seems it would either have all of them or
> - none. Since inlined subprograms do not have a frame it seems they would
> - have none of these attributes.
> -
> -10. ``DW_AT_call_value``, ``DW_AT_call_data_location``, and ``DW_AT_call_data_value``
> -
> - A ``DW_TAG_call_site_parameter`` debugger information entry may have a
> - ``DW_AT_call_value`` attribute, whose value is a DWARF operation expression
> - E\ :sub:`1`\ .
> -
> - The result of the ``DW_AT_call_value`` attribute is obtained by evaluating
> - E\ :sub:`1` as a value with the context of the call site subprogram, call
> - site program location, and an empty initial stack.
> -
> - The call site subprogram is the subprogram containing the
> - ``DW_TAG_call_site_parameter`` debugger information entry. The call site
> - program location is the location of the call site in the call site subprogram.
> -
> - *The consumer may have to virtually unwind to the call site in order to
> - evaluate the attribute. This will provide both the call site subprogram and
> - call site program location needed to evaluate the expression.*
> -
> - The resulting value V\ :sub:`1` is the value of the parameter at the time of
> - the call made by the call site.
> -
> - For parameters passed by reference, where the code passes a pointer to a
> - location which contains the parameter, or for reference type parameters, the
> - ``DW_TAG_call_site_parameter`` debugger information entry may also have a
> - ``DW_AT_call_data_location`` attribute whose value is a DWARF operation
> - expression E\ :sub:`2`\ , and a ``DW_AT_call_data_value`` attribute whose
> - value is a DWARF operation expression E\ :sub:`3`\ .
> -
> - The value of the ``DW_AT_call_data_location`` attribute is obtained by
> - evaluating E\ :sub:`2` as a location description with the context of the
> - call site subprogram, call site program location, and an empty initial
> - stack.
> -
> - The resulting location description L\ :sub:`2` is the location where the
> - referenced parameter lives during the call made by the call site. If E\
> - :sub:`2` would just be a ``DW_OP_push_object_address``, then the
> - ``DW_AT_call_data_location`` attribute may be omitted.
> -
> - The value of the ``DW_AT_call_data_value`` attribute is obtained by
> - evaluating E\ :sub:`3` as a value with the context of the call site
> - subprogram, call site program location, and an empty initial stack.
> -
> - The resulting value V\ :sub:`3` is the value in L\ :sub:`2` at the time of
> - the call made by the call site.
> -
> - If the expressions of these attributes cannot avoid accessing registers or
> - memory locations that might be clobbered by the subprogram being called by
> - the call site, then the associated attribute should not be provided.
> -
> - *The reason for the restriction is that the parameter may need to be
> - accessed during the execution of the callee. The consumer may virtually
> - unwind from the called subprogram back to the caller and then evaluate the
> - attribute expressions. The call frame information (see*
> - :ref:`amdgpu-dwarf-call-frame-information`\ *) will not be able to restore
> - registers that have been clobbered, and clobbered memory will no longer have
> - the value at the time of the call.*
> -
> -11. ``DW_AT_LLVM_lanes`` *New*
> -
> - For languages that are implemented using a SIMD or SIMT execution model, a
> - ``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or
> - ``DW_TAG_entry_point`` debugger information entry may have a
> - ``DW_AT_LLVM_lanes`` attribute whose value is an integer constant that is
> - the number of lanes per thread. This is the static number of lanes per
> - thread. It is not the dynamic number of lanes with which the thread was
> - initiated, for example, due to smaller or partial work-groups.
> -
> - If not present, the default value of 1 is used.
> -
> - The DWARF is ill-formed if the value is 0.
> -
> -12. ``DW_AT_LLVM_lane_pc`` *New*
> -
> - For languages that are implemented using a SIMD or SIMT execution model, a
> - ``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or
> - ``DW_TAG_entry_point`` debugging information entry may have a
> - ``DW_AT_LLVM_lane_pc`` attribute whose value is a DWARF expression E.
> -
> - The result of the attribute is obtained by evaluating E as a location
> - description with the context of the current subprogram, current program
> - location, and an empty initial stack.
> -
> - The resulting location description L is for a thread lane count sized vector
> - of generic type elements. The thread lane count is the value of the
> - ``DW_AT_LLVM_lanes`` attribute. Each element holds the conceptual program
> - location of the corresponding lane, where the least significant element
> - corresponds to the first target architecture specific lane identifier and so
> - forth. If the lane was not active when the current subprogram was called,
> - its element is an undefined location description.
> -
> - ``DW_AT_LLVM_lane_pc`` *allows the compiler to indicate conceptually where
> - each lane of a SIMT thread is positioned even when it is in divergent
> - control flow that is not active.*
> -
> - *Typically, the result is a location description with one composite location
> - description with each part being a location description with either one
> - undefined location description or one memory location description.*
> -
> - If not present, the thread is not being used in a SIMT manner, and the
> - thread's current program location is used.
> -
> -13. ``DW_AT_LLVM_active_lane`` *New*
> -
> - For languages that are implemented using a SIMD or SIMT execution model, a
> - ``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or
> - ``DW_TAG_entry_point`` debugger information entry may have a
> - ``DW_AT_LLVM_active_lane`` attribute whose value is a DWARF expression E.
> -
> - The result of the attribute is obtained by evaluating E as a value with the
> - context of the current subprogram, current program location, and an empty
> - initial stack.
> -
> - The DWARF is ill-formed if the resulting value V is not an integral value.
> -
> - The resulting V is a bit mask of active lanes for the current program
> - location. The N\ :sup:`th` least significant bit of the mask corresponds to
> - the N\ :sup:`th` lane. If the bit is 1 the lane is active, otherwise it is
> - inactive.
> -
> - *Some targets may update the target architecture execution mask for regions
> - of code that must execute with different sets of lanes than the current
> - active lanes. For example, some code must execute with all lanes made
> - temporarily active.* ``DW_AT_LLVM_active_lane`` *allows the compiler to
> - provide the means to determine the source language active lanes.*
> -
> - If not present and ``DW_AT_LLVM_lanes`` is greater than 1, then the target
> - architecture execution mask is used.
> -
> -14. ``DW_AT_LLVM_vector_size`` *New*
> -
> - A ``DW_TAG_base_type`` debugger information entry for a base type T may have
> - a ``DW_AT_LLVM_vector_size`` attribute whose value is an integer constant
> - that is the vector type size N.
> -
> - The representation of a vector base type is as N contiguous elements, each
> - one having the representation of a base type T' that is the same as T
> - without the ``DW_AT_LLVM_vector_size`` attribute.
> -
> - If a ``DW_TAG_base_type`` debugger information entry does not have a
> - ``DW_AT_LLVM_vector_size`` attribute, then the base type is not a vector
> - type.
> -
> - The DWARF is ill-formed if N is not greater than 0.
> -
> - .. note::
> -
> - LLVM has mention of a non-upstreamed debugger information entry that is
> - intended to support vector types. However, that was not for a base type so
> - would not be suitable as the type of a stack value entry. But perhaps that
> - could be replaced by using this attribute.
> -
> -15. ``DW_AT_LLVM_augmentation`` *New*
> -
> - A ``DW_TAG_compile_unit`` debugger information entry for a compilation unit
> - may have a ``DW_AT_LLVM_augmentation`` attribute, whose value is an
> - augmentation string.
> -
> - *The augmentation string allows producers to indicate that there is
> - additional vendor or target specific information in the debugging
> - information entries. For example, this might be information about the
> - version of vendor specific extensions that are being used.*
> -
> - If not present, or if the string is empty, then the compilation unit has no
> - augmentation string.
> -
> - The format for the augmentation string is:
> -
> - | ``[``\ *vendor*\ ``:v``\ *X*\ ``.``\ *Y*\ [\ ``:``\ *options*\ ]\ ``]``\ *
> -
> - Where *vendor* is the producer, ``vX.Y`` specifies the major X and minor Y
> - version number of the extensions used, and *options* is an optional string
> - providing additional information about the extensions. The version number
> - must conform to [SEMVER]_. The *options* string must not contain the "\
> - ``]``\ " character.
> -
> - For example:
> -
> - ::
> -
> - [abc:v0.0][def:v1.2:feature-a=on,feature-b=3]
> -
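To illustrate the ``DW_AT_LLVM_active_lane`` bit mask described in item 13
above, here is a minimal Python sketch of how a consumer might decode it. The
function name ``active_lanes`` is hypothetical, and the sketch assumes lanes
are numbered from 0 so that lane N maps to bit N of the mask; the quoted text
does not state the numbering base.

```python
def active_lanes(mask: int, lane_count: int) -> list[int]:
    """Return the identifiers of the lanes whose bit is set in ``mask``.

    ``lane_count`` plays the role of DW_AT_LLVM_lanes. Numbering lanes
    from 0 (lane N <-> bit N) is an assumption of this sketch.
    """
    if lane_count < 1:
        # The quoted text makes a DW_AT_LLVM_lanes value of 0 ill-formed.
        raise ValueError("lane count must be at least 1")
    # Bits at or above lane_count are ignored.
    return [lane for lane in range(lane_count) if (mask >> lane) & 1]


print(active_lanes(0b1011, 4))  # [0, 1, 3]
```

When the attribute is absent and the lane count is greater than 1, the same
decoding would be applied to the target architecture execution mask instead.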
> -Program Scope Entities
> -----------------------
> -
> -.. _amdgpu-dwarf-language-names:
> -
> -Unit Entities
> -~~~~~~~~~~~~~
> -
> -.. note::
> -
> - This augments DWARF Version 5 section 3.1.1 and Table 3.1.
> -
> -Additional language codes defined for use with the ``DW_AT_language`` attribute
> -are defined in :ref:`amdgpu-dwarf-language-names-table`.
> -
> -.. table:: Language Names
> - :name: amdgpu-dwarf-language-names-table
> -
> - ==================== =============================
> - Language Name Meaning
> - ==================== =============================
> - ``DW_LANG_LLVM_HIP`` HIP Language.
> - ==================== =============================
> -
> -The ``DW_LANG_LLVM_HIP`` language can be supported by extending the C++
> -language. See [HIP]_.
> -
> -Other Debugger Information
> ---------------------------
> -
> -Accelerated Access
> -~~~~~~~~~~~~~~~~~~
> -
> -.. _amdgpu-dwarf-lookup-by-name:
> -
> -Lookup By Name
> -++++++++++++++
> -
> -Contents of the Name Index
> -##########################
> -
> -.. note::
> -
> - The following provides changes to DWARF Version 5 section 6.1.1.1.
> -
> - The rule for debugger information entries included in the name index in the
> - optional ``.debug_names`` section is extended to also include named
> - ``DW_TAG_variable`` debugging information entries with a ``DW_AT_location``
> - attribute that includes a ``DW_OP_LLVM_form_aspace_address`` operation.
> -
> -The name index must contain an entry for each debugging information entry that
> -defines a named subprogram, label, variable, type, or namespace, subject to the
> -following rules:
> -
> -* ``DW_TAG_variable`` debugging information entries with a ``DW_AT_location``
> - attribute that includes a ``DW_OP_addr``, ``DW_OP_LLVM_form_aspace_address``,
> - or ``DW_OP_form_tls_address`` operation are included; otherwise, they are
> - excluded.
> -
> -Data Representation of the Name Index
> -#####################################
> -
> -Section Header
> -^^^^^^^^^^^^^^
> -
> -.. note::
> -
> - The following provides an addition to DWARF Version 5 section 6.1.1.4.1 item
> - 14 ``augmentation_string``.
> -
> -A null-terminated UTF-8 vendor specific augmentation string, which provides
> -additional information about the contents of this index. If provided, the
> -recommended format for augmentation string is:
> -
> - | ``[``\ *vendor*\ ``:v``\ *X*\ ``.``\ *Y*\ [\ ``:``\ *options*\ ]\ ``]``\ *
> -
> -Where *vendor* is the producer, ``vX.Y`` specifies the major X and minor Y
> -version number of the extensions used in the DWARF of the compilation unit, and
> -*options* is an optional string providing additional information about the
> -extensions. The version number must conform to [SEMVER]_. The *options* string
> -must not contain the "\ ``]``\ " character.
> -
> -For example:
> -
> - ::
> -
> - [abc:v0.0][def:v1.2:feature-a=on,feature-b=3]
> -
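As a rough illustration of the recommended augmentation string format, the
following Python sketch splits a string such as the example above into its
bracketed entries. The names ``Augmentation`` and ``parse_augmentation`` are
hypothetical. The sketch assumes the vendor name is separated from the version
by a colon (as in the example), that the vendor contains no ``[``, ``]`` or
``:`` characters, and that X and Y are plain decimal integers.

```python
import re
from typing import NamedTuple, Optional


class Augmentation(NamedTuple):
    vendor: str
    major: int
    minor: int
    options: Optional[str]  # None when no ":options" part is present


# One "[vendor:vX.Y[:options]]" entry; options may not contain "]".
_ENTRY = re.compile(r"\[([^\[\]:]+):v(\d+)\.(\d+)(?::([^\]]*))?\]")


def parse_augmentation(text: str) -> list[Augmentation]:
    """Parse zero or more concatenated augmentation entries."""
    entries = []
    pos = 0
    while pos < len(text):
        match = _ENTRY.match(text, pos)
        if match is None:
            raise ValueError(f"malformed augmentation string at offset {pos}")
        vendor, major, minor, options = match.groups()
        entries.append(Augmentation(vendor, int(major), int(minor), options))
        pos = match.end()
    return entries


for entry in parse_augmentation("[abc:v0.0][def:v1.2:feature-a=on,feature-b=3]"):
    print(entry)
```

An empty string yields an empty list, which matches the "no augmentation"
case.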
> -.. note::
> -
> - This is different to the definition in DWARF Version 5 but is consistent with
> - the other augmentation strings and allows multiple vendor extensions to be
> - supported.
> -
> -.. _amdgpu-dwarf-line-number-information:
> -
> -Line Number Information
> -~~~~~~~~~~~~~~~~~~~~~~~
> -
> -The Line Number Program Header
> -++++++++++++++++++++++++++++++
> -
> -Standard Content Descriptions
> -#############################
> -
> -.. note::
> -
> - This augments DWARF Version 5 section 6.2.4.1.
> -
> -.. _amdgpu-dwarf-line-number-information-dw-lnct-llvm-source:
> -
> -1. ``DW_LNCT_LLVM_source``
> -
> - The component is a null-terminated UTF-8 source text string with "\ ``\n``\
> - " line endings. This content code is paired with the same forms as
> - ``DW_LNCT_path``. It can be used for file name entries.
> -
> - The value is an empty null-terminated string if no source is available. If
> - the source is available but is an empty file then the value is a
> - null-terminated single "\ ``\n``\ ".
> -
> - *When the source field is present, consumers can use the embedded source
> - instead of attempting to discover the source on disk using the file path
> - provided by the* ``DW_LNCT_path`` *field. When the source field is absent,
> - consumers can access the file to get the source text.*
> -
> - *This is particularly useful for programming languages that support runtime
> - compilation and runtime generation of source text. In these cases, the
> - source text does not reside in any permanent file. For example, the OpenCL
> - language supports online compilation.*
> -
> -2. ``DW_LNCT_LLVM_is_MD5``
> -
> - ``DW_LNCT_LLVM_is_MD5`` indicates if the ``DW_LNCT_MD5`` content kind, if
> - present, is valid: when 0 it is not valid and when 1 it is valid. If
> - ``DW_LNCT_LLVM_is_MD5`` content kind is not present, and ``DW_LNCT_MD5``
> - content kind is present, then the MD5 checksum is valid.
> -
> - ``DW_LNCT_LLVM_is_MD5`` is always paired with the ``DW_FORM_udata`` form.
> -
> - *This allows a compilation unit to have a mixture of files with and without
> - MD5 checksums. This can happen when multiple relocatable files are linked
> - together.*
> -
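The validity rule for ``DW_LNCT_LLVM_is_MD5`` can be summarized in a small
Python sketch. The function name ``md5_is_valid`` and its parameters are
hypothetical; ``is_md5`` is ``None`` when the content kind is absent. Treating
values other than 0 and 1 as "not valid" is an assumption, since the quoted
text only defines those two values.

```python
from typing import Optional


def md5_is_valid(has_md5: bool, is_md5: Optional[int]) -> bool:
    """Decide whether a file entry's MD5 checksum should be trusted.

    has_md5: whether a DW_LNCT_MD5 content kind is present.
    is_md5:  the DW_LNCT_LLVM_is_MD5 value, or None if it is absent.
    """
    if not has_md5:
        return False  # nothing to validate
    if is_md5 is None:
        return True   # flag absent: a present MD5 is valid
    return is_md5 == 1  # 1 means valid; 0 (and, by assumption, others) do not


print(md5_is_valid(True, None), md5_is_valid(True, 0), md5_is_valid(True, 1))
```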
> -.. _amdgpu-dwarf-call-frame-information:
> -
> -Call Frame Information
> -~~~~~~~~~~~~~~~~~~~~~~
> -
> -.. note::
> -
> - This section provides changes to existing Call Frame Information and defines
> - instructions added by the proposal. Additional support is added for address
> - spaces. Register unwind DWARF expressions are generalized to allow any
> - location description, including those with composite and implicit location
> - descriptions.
> -
> - These changes would be incorporated into the DWARF Version 5 section 6.4.
> -
> -Structure of Call Frame Information
> -+++++++++++++++++++++++++++++++++++
> -
> -The register rules are:
> -
> -*undefined*
> - A register that has this rule has no recoverable value in the previous frame.
> - (By convention, it is not preserved by a callee.)
> -
> -*same value*
> - This register has not been modified from the previous frame. (By convention,
> - it is preserved by the callee, but the callee has not modified it.)
> -
> -*offset(N)*
> - N is a signed byte offset. The previous value of this register is saved at the
> - location description computed as if the DWARF operation expression
> - ``DW_OP_LLVM_offset N`` is evaluated as a location description with an initial
> - stack comprising the location description of the current CFA (see
> - :ref:`amdgpu-dwarf-operation-expressions`).
> -
> -*val_offset(N)*
> - N is a signed byte offset. The previous value of this register is the memory
> - byte address of the location description computed as if the DWARF operation
> - expression ``DW_OP_LLVM_offset N`` is evaluated as a location description with
> - an initial stack comprising the location description of the current CFA (see
> - :ref:`amdgpu-dwarf-operation-expressions`).
> -
> - The DWARF is ill-formed if the CFA location description is not a memory byte
> - address location description, or if the register size does not match the size
> - of an address in the address space of the current CFA location description.
> -
> - *Since the CFA location description is required to be a memory byte address
> - location description, the value of val_offset(N) will also be a memory byte
> - address location description since it is offsetting the CFA location
> - description by N bytes. Furthermore, the value of val_offset(N) will be a
> - memory byte address in the same address space as the CFA location
> - description.*
> -
> - .. note::
> -
> - Should DWARF allow the address size to be a different size to the size of
> - the register? Requiring them to be the same bit size avoids any issue of
> - conversion as the bit contents of the register is simply interpreted as a
> - value of the address.
> -
> - Gdb has a per register hook that allows a target specific conversion on a
> - register by register basis. It defaults to truncation of bigger registers,
> - and to actually reading bytes from the next register (or reads out of bounds
> - for the last register) for smaller registers. There are no gdb tests that
> - read a register out of bounds (except an illegal hand written assembly
> - test).
> -
> -*register(R)*
> - The previous value of this register is stored in another register numbered R.
> -
> - The DWARF is ill-formed if the register sizes do not match.
> -
> -*expression(E)*
> - The previous value of this register is located at the location description
> - produced by evaluating the DWARF operation expression E (see
> - :ref:`amdgpu-dwarf-operation-expressions`).
> -
> - E is evaluated as a location description in the context of the current
> - subprogram, current program location, and with an initial stack comprising the
> - location description of the current CFA.
> -
> -*val_expression(E)*
> - The previous value of this register is the value produced by evaluating the
> - DWARF operation expression E (see :ref:`amdgpu-dwarf-operation-expressions`).
> -
> - E is evaluated as a value in the context of the current subprogram, current
> - program location, and with an initial stack comprising the location
> - description of the current CFA.
> -
> - The DWARF is ill-formed if the resulting value type size does not match the
> - register size.
> -
> - .. note::
> -
> - This has limited usefulness as the DWARF expression E can only produce
> - values up to the size of the generic type. This is due to not allowing any
> - operations that specify a type in a CFI operation expression. This makes it
> - unusable for registers that are larger than the generic type. However,
> - *expression(E)* can be used to create an implicit location description of
> - any size.
> -
> -*architectural*
> - The rule is defined externally to this specification by the augmenter.
> -
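The simpler register rules above can be sketched in Python as follows. This is
a deliberately reduced model, not the proposal's semantics: the CFA is treated
as a plain integer byte address in a single address space rather than a
location description, and the *expression*, *val_expression* and
*architectural* rules are left out. All names (``recover_register``, the rule
tuples, ``read_memory``) are hypothetical.

```python
from typing import Callable, Optional


def recover_register(rule: tuple, reg: int, cfa: int,
                     regs: dict[int, int],
                     read_memory: Callable[[int], int]) -> Optional[int]:
    """Return the previous-frame value of register ``reg``, or None."""
    kind = rule[0]
    if kind == "undefined":
        return None                       # no recoverable value
    if kind == "same_value":
        return regs[reg]                  # unchanged from the previous frame
    if kind == "offset":
        return read_memory(cfa + rule[1])  # value saved at CFA + N
    if kind == "val_offset":
        return cfa + rule[1]              # value *is* the address CFA + N
    if kind == "register":
        return regs[rule[1]]              # value lives in another register
    raise NotImplementedError(kind)


memory = {0x0FF8: 0x4242}
regs = {1: 7, 2: 99}
print(hex(recover_register(("offset", -8), 1, 0x1000, regs, memory.__getitem__)))
print(hex(recover_register(("val_offset", -8), 1, 0x1000, regs, memory.__getitem__)))
```

The contrast between the two calls shows the difference between *offset(N)*,
which reads memory at the offset address, and *val_offset(N)*, which yields
the address itself.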
> -A Common Information Entry holds information that is shared among many Frame
> -Description Entries. There is at least one CIE in every non-empty
> -``.debug_frame`` section. A CIE contains the following fields, in order:
> -
> -1. ``length`` (initial length)
> -
> - A constant that gives the number of bytes of the CIE structure, not
> - including the length field itself. The size of the length field plus the
> - value of length must be an integral multiple of the address size specified
> - in the ``address_size`` field.
> -
> -2. ``CIE_id`` (4 or 8 bytes, see
> - :ref:`amdgpu-dwarf-32-bit-and-64-bit-dwarf-formats`)
> -
> - A constant that is used to distinguish CIEs from FDEs.
> -
> - In the 32-bit DWARF format, the value of the CIE id in the CIE header is
> - 0xffffffff; in the 64-bit DWARF format, the value is 0xffffffffffffffff.
> -
> -3. ``version`` (ubyte)
> -
> - A version number. This number is specific to the call frame information and
> - is independent of the DWARF version number.
> -
> - The value of the CIE version number is 4.
> -
> - .. note::
> -
> - Would this be increased to 5 to reflect the changes in the proposal?
> -
> -4. ``augmentation`` (sequence of UTF-8 characters)
> -
> - A null-terminated UTF-8 string that identifies the augmentation to this CIE
> - or to the FDEs that use it. If a reader encounters an augmentation string
> - that is unexpected, then only the following fields can be read:
> -
> - * CIE: length, CIE_id, version, augmentation
> - * FDE: length, CIE_pointer, initial_location, address_range
> -
> - If there is no augmentation, this value is a zero byte.
> -
> - *The augmentation string allows users to indicate that there is additional
> - vendor and target architecture specific information in the CIE or FDE which
> - is needed to virtually unwind a stack frame. For example, this might be
> - information about dynamically allocated data which needs to be freed on exit
> - from the routine.*
> -
> - *Because the* ``.debug_frame`` *section is useful independently of any*
> - ``.debug_info`` *section, the augmentation string always uses UTF-8
> - encoding.*
> -
> - The recommended format for the augmentation string is:
> -
> - | ``[``\ *vendor*\ ``:v``\ *X*\ ``.``\ *Y*\ [\ ``:``\ *options*\ ]\ ``]``\ *
> -
> - Where *vendor* is the producer, ``vX.Y`` specifies the major X and minor Y
> - version number of the extensions used, and *options* is an optional string
> - providing additional information about the extensions. The version number
> - must conform to [SEMVER]_. The *options* string must not contain the "\
> - ``]``\ " character.
> -
> - For example:
> -
> - ::
> -
> - [abc:v0.0][def:v1.2:feature-a=on,feature-b=3]
> -
> -5. ``address_size`` (ubyte)
> -
> - The size of a target address in this CIE and any FDEs that use it, in bytes.
> - If a compilation unit exists for this frame, its address size must match the
> - address size here.
> -
> -6. ``segment_selector_size`` (ubyte)
> -
> - The size of a segment selector in this CIE and any FDEs that use it, in
> - bytes.
> -
> -7. ``code_alignment_factor`` (unsigned LEB128)
> -
> - A constant that is factored out of all advance location instructions (see
> - :ref:`amdgpu-dwarf-row-creation-instructions`). The resulting value is
> - ``(operand * code_alignment_factor)``.
> -
> -8. ``data_alignment_factor`` (signed LEB128)
> -
> - A constant that is factored out of certain offset instructions (see
> - :ref:`amdgpu-dwarf-cfa-definition-instructions` and
> - :ref:`amdgpu-dwarf-register-rule-instructions`). The resulting value is
> - ``(operand * data_alignment_factor)``.
> -
> -9. ``return_address_register`` (unsigned LEB128)
> -
> - An unsigned LEB128 constant that indicates which column in the rule table
> - represents the return address of the subprogram. Note that this column might
> - not correspond to an actual machine register.
> -
> -10. ``initial_instructions`` (array of ubyte)
> -
> - A sequence of rules that are interpreted to create the initial setting of
> - each column in the table.
> -
> - The default rule for all columns before interpretation of the initial
> - instructions is the undefined rule. However, an ABI authoring body or a
> - compilation system authoring body may specify an alternate default value for
> - any or all columns.
> -
> -11. ``padding`` (array of ubyte)
> -
> - Enough ``DW_CFA_nop`` instructions to make the size of this entry match the
> - length value above.
> -
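The relationship between the ``length``, ``address_size`` and ``padding``
fields can be worked through with a short Python sketch. The function name
``cie_padding`` and the ``body_size`` parameter (the number of bytes that
follow the length field, before any padding) are hypothetical. The sketch
relies on the initial length field occupying 4 bytes in the 32-bit DWARF
format and 12 bytes in the 64-bit format, and on ``DW_CFA_nop`` being a single
byte.

```python
def cie_padding(body_size: int, address_size: int, dwarf64: bool = False) -> int:
    """Number of DW_CFA_nop bytes needed so that the size of the length
    field plus the value of length is a multiple of address_size."""
    length_field_size = 12 if dwarf64 else 4
    remainder = (length_field_size + body_size) % address_size
    return (address_size - remainder) % address_size


# 4 + 17 = 21 bytes, so 3 nops bring the entry to 24, a multiple of 8.
print(cie_padding(17, 8))
```

The ``length`` value written into the entry would then be ``body_size`` plus
the returned padding.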
> -An FDE contains the following fields, in order:
> -
> -1. ``length`` (initial length)
> -
> - A constant that gives the number of bytes of the header and instruction
> - stream for this subprogram, not including the length field itself. The size
> - of the length field plus the value of length must be an integral multiple of
> - the address size.
> -
> -2. ``CIE_pointer`` (4 or 8 bytes, see
> - :ref:`amdgpu-dwarf-32-bit-and-64-bit-dwarf-formats`)
> -
> - A constant offset into the ``.debug_frame`` section that denotes the CIE
> - that is associated with this FDE.
> -
> -3. ``initial_location`` (segment selector and target address)
> -
> - The address of the first location associated with this table entry. If the
> - segment_selector_size field of this FDE’s CIE is non-zero, the initial
> - location is preceded by a segment selector of the given length.
> -
> -4. ``address_range`` (target address)
> -
> - The number of bytes of program instructions described by this entry.
> -
> -5. ``instructions`` (array of ubyte)
> -
> - A sequence of table defining instructions that are described in
> - :ref:`amdgpu-dwarf-call-frame-instructions`.
> -
> -6. ``padding`` (array of ubyte)
> -
> - Enough ``DW_CFA_nop`` instructions to make the size of this entry match the
> - length value above.
> -
> -.. _amdgpu-dwarf-call-frame-instructions:
> -
> -Call Frame Instructions
> -+++++++++++++++++++++++
> -
> -Some call frame instructions have operands that are encoded as DWARF operation
> -expressions E (see :ref:`amdgpu-dwarf-operation-expressions`). The DWARF
> -operations that can be used in E have the following restrictions:
> -
> -* ``DW_OP_addrx``, ``DW_OP_call2``, ``DW_OP_call4``, ``DW_OP_call_ref``,
> - ``DW_OP_const_type``, ``DW_OP_constx``, ``DW_OP_convert``,
> - ``DW_OP_deref_type``, ``DW_OP_fbreg``, ``DW_OP_implicit_pointer``,
> - ``DW_OP_regval_type``, ``DW_OP_reinterpret``, and ``DW_OP_xderef_type``
> - operations are not allowed because the call frame information must not depend
> - on other debug sections.
> -
> -* ``DW_OP_push_object_address`` is not allowed because there is no object
> - context to provide a value to push.
> -
> -* ``DW_OP_LLVM_push_lane`` is not allowed because the call frame instructions
> - describe the actions for the whole thread, not the lanes independently.
> -
> -* ``DW_OP_call_frame_cfa`` and ``DW_OP_entry_value`` are not allowed because
> - their use would be circular.
> -
> -* ``DW_OP_LLVM_call_frame_entry_reg`` is not allowed if evaluating E causes a
> - circular dependency between ``DW_OP_LLVM_call_frame_entry_reg`` operations.
> -
> - *For example, if a register R1 has a* ``DW_CFA_def_cfa_expression``
> - *instruction that evaluates a* ``DW_OP_LLVM_call_frame_entry_reg`` *operation
> - that specifies register R2, and register R2 has a*
> - ``DW_CFA_def_cfa_expression`` *instruction that evaluates a*
> - ``DW_OP_LLVM_call_frame_entry_reg`` *operation that specifies register R1.*
> -
> -*Call frame instructions to which these restrictions apply include*
> -``DW_CFA_def_cfa_expression``\ *,* ``DW_CFA_expression``\ *, and*
> -``DW_CFA_val_expression``\ *.*
> -
> -.. _amdgpu-dwarf-row-creation-instructions:
> -
> -Row Creation Instructions
> -#########################
> -
> -.. note::
> -
> - These instructions are the same as in DWARF Version 5 section 6.4.2.1.
> -
> -.. _amdgpu-dwarf-cfa-definition-instructions:
> -
> -CFA Definition Instructions
> -###########################
> -
> -1. ``DW_CFA_def_cfa``
> -
> - The ``DW_CFA_def_cfa`` instruction takes two unsigned LEB128 operands
> - representing a register number R and a (non-factored) byte displacement B.
> - AS is set to the target architecture default address space identifier. The
> - required action is to define the current CFA rule to be the result of
> - evaluating the DWARF operation expression ``DW_OP_constu AS;
> - DW_OP_aspace_bregx R, B`` as a location description.
> -
> -2. ``DW_CFA_def_cfa_sf``
> -
> - The ``DW_CFA_def_cfa_sf`` instruction takes two operands: an unsigned LEB128
> - value representing a register number R and a signed LEB128 factored byte
> - displacement B. AS is set to the target architecture default address space
> - identifier. The required action is to define the current CFA rule to be the
> - result of evaluating the DWARF operation expression ``DW_OP_constu AS;
> - DW_OP_aspace_bregx R, B*data_alignment_factor`` as a location description.
> -
> - *The action is the same as* ``DW_CFA_def_cfa`` *except that the second
> - operand is signed and factored.*
> -
> -3. ``DW_CFA_def_aspace_cfa`` *New*
> -
> - The ``DW_CFA_def_aspace_cfa`` instruction takes three unsigned LEB128
> - operands representing a register number R, a (non-factored) byte
> - displacement B, and a target architecture specific address space identifier
> - AS. The required action is to define the current CFA rule to be the result
> - of evaluating the DWARF operation expression ``DW_OP_constu AS;
> - DW_OP_aspace_bregx R, B`` as a location description.
> -
> - If AS is not one of the values defined by the target architecture specific
> - ``DW_ASPACE_*`` values then the DWARF expression is ill-formed.
> -
> -4. ``DW_CFA_def_aspace_cfa_sf`` *New*
> -
> - The ``DW_CFA_def_cfa_sf`` instruction takes three operands: an unsigned
> - LEB128 value representing a register number R, a signed LEB128 factored byte
> - displacement B, and an unsigned LEB128 value representing a target
> - architecture specific address space identifier AS. The required action is to
> - define the current CFA rule to be the result of evaluating the DWARF
> - operation expression ``DW_OP_constu AS; DW_OP_aspace_bregx R,
> - B*data_alignment_factor`` as a location description.
> -
> - If AS is not one of the values defined by the target architecture specific
> - ``DW_ASPACE_*`` values, then the DWARF expression is ill-formed.
> -
> - *The action is the same as* ``DW_CFA_aspace_def_cfa`` *except that the
> - second operand is signed and factored.*
> -
> -5. ``DW_CFA_def_cfa_register``
> -
> - The ``DW_CFA_def_cfa_register`` instruction takes a single unsigned LEB128
> - operand representing a register number R. The required action is to define
> - the current CFA rule to be the result of evaluating the DWARF operation
> - expression ``DW_OP_constu AS; DW_OP_aspace_bregx R, B`` as a location
> - description. B and AS are the old CFA byte displacement and address space
> - respectively.
> -
> - If the subprogram has no current CFA rule, or the rule was defined by a
> - ``DW_CFA_def_cfa_expression`` instruction, then the DWARF is ill-formed.
> -
> -6. ``DW_CFA_def_cfa_offset``
> -
> - The ``DW_CFA_def_cfa_offset`` instruction takes a single unsigned LEB128
> - operand representing a (non-factored) byte displacement B. The required
> - action is to define the current CFA rule to be the result of evaluating the
> - DWARF operation expression ``DW_OP_constu AS; DW_OP_aspace_bregx R, B`` as a
> - location description. R and AS are the old CFA register number and address
> - space respectively.
> -
> - If the subprogram has no current CFA rule, or the rule was defined by a
> - ``DW_CFA_def_cfa_expression`` instruction, then the DWARF is ill-formed.
> -
> -7. ``DW_CFA_def_cfa_offset_sf``
> -
> - The ``DW_CFA_def_cfa_offset_sf`` instruction takes a signed LEB128 operand
> - representing a factored byte displacement B. The required action is to
> - define the current CFA rule to be the result of evaluating the DWARF
> - operation expression ``DW_OP_constu AS; DW_OP_aspace_bregx R,
> - B*data_alignment_factor`` as a location description. R and AS are the old
> - CFA register number and address space respectively.
> -
> - If the subprogram has no current CFA rule, or the rule was defined by a
> - ``DW_CFA_def_cfa_expression`` instruction, then the DWARF is ill-formed.
> -
> - *The action is the same as* ``DW_CFA_def_cfa_offset`` *except that the
> - operand is signed and factored.*
> -
> -8. ``DW_CFA_def_cfa_expression``
> -
> - The ``DW_CFA_def_cfa_expression`` instruction takes a single operand encoded
> - as a ``DW_FORM_exprloc`` value representing a DWARF operation expression E.
> - The required action is to define the current CFA rule to be the result of
> - evaluating E as a location description in the context of the current
> - subprogram, current program location, and an empty initial stack.
> -
> - *See* :ref:`amdgpu-dwarf-call-frame-instructions` *regarding restrictions on
> - the DWARF expression operations that can be used in E.*
> -
> - The DWARF is ill-formed if the result of evaluating E is not a memory byte
> - address location description.
> -
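For anyone reading the CFA definition instructions quoted above, here is a rough Python sketch of how a consumer might decode ``DW_CFA_def_aspace_cfa_sf``. It is only an illustration, not code from LLVM: the function names are made up, the opcode value 0x30 and the operand order (register, offset, address space) are taken from the call frame instruction encodings table later in this patch, and the LEB128 decoders follow the standard DWARF encoding.

```python
def decode_uleb128(data, pos=0):
    """Decode an unsigned LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):  # high bit clear marks the last byte
            return result, pos


def decode_sleb128(data, pos=0):
    """Decode a signed LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):
            if byte & 0x40:  # sign bit of the final byte is set
                result -= 1 << shift
            return result, pos


# Encoding from the proposal's table (high 2 bits 0, low 6 bits 0x30).
DW_CFA_DEF_ASPACE_CFA_SF = 0x30


def decode_def_aspace_cfa_sf(data, data_alignment_factor):
    """Hypothetical helper: return (R, byte displacement, AS)."""
    if data[0] != DW_CFA_DEF_ASPACE_CFA_SF:
        raise ValueError("not a DW_CFA_def_aspace_cfa_sf instruction")
    reg, pos = decode_uleb128(data, 1)
    factored, pos = decode_sleb128(data, pos)
    aspace, pos = decode_uleb128(data, pos)
    # The signed operand is factored, so scale it to a byte displacement.
    return reg, factored * data_alignment_factor, aspace
```

The returned triple corresponds to the R, B*data_alignment_factor and AS used in the ``DW_OP_constu AS; DW_OP_aspace_bregx R, B*data_alignment_factor`` expression described in the quoted text.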
> -.. _amdgpu-dwarf-register-rule-instructions:
> -
> -Register Rule Instructions
> -##########################
> -
> -1. ``DW_CFA_undefined``
> -
> - The ``DW_CFA_undefined`` instruction takes a single unsigned LEB128 operand
> - that represents a register number R. The required action is to set the rule
> - for the register specified by R to ``undefined``.
> -
> -2. ``DW_CFA_same_value``
> -
> - The ``DW_CFA_same_value`` instruction takes a single unsigned LEB128 operand
> - that represents a register number R. The required action is to set the rule
> - for the register specified by R to ``same value``.
> -
> -3. ``DW_CFA_offset``
> -
> - The ``DW_CFA_offset`` instruction takes two operands: a register number R
> - (encoded with the opcode) and an unsigned LEB128 constant representing a
> - factored displacement B. The required action is to change the rule for the
> - register specified by R to be an *offset(B\*data_alignment_factor)* rule.
> -
> - .. note::
> -
> - Seems this should be named ``DW_CFA_offset_uf`` since the offset is
> - unsigned factored.
> -
> -4. ``DW_CFA_offset_extended``
> -
> - The ``DW_CFA_offset_extended`` instruction takes two unsigned LEB128
> - operands representing a register number R and a factored displacement B.
> - This instruction is identical to ``DW_CFA_offset`` except for the encoding
> - and size of the register operand.
> -
> - .. note::
> -
> - Seems this should be named ``DW_CFA_offset_extended_uf`` since the
> - displacement is unsigned factored.
> -
> -5. ``DW_CFA_offset_extended_sf``
> -
> - The ``DW_CFA_offset_extended_sf`` instruction takes two operands: an
> - unsigned LEB128 value representing a register number R and a signed LEB128
> - factored displacement B. This instruction is identical to
> - ``DW_CFA_offset_extended`` except that B is signed.
> -
> -6. ``DW_CFA_val_offset``
> -
> - The ``DW_CFA_val_offset`` instruction takes two unsigned LEB128 operands
> - representing a register number R and a factored displacement B. The required
> - action is to change the rule for the register indicated by R to be a
> - *val_offset(B\*data_alignment_factor)* rule.
> -
> - .. note::
> -
> - Seems this should be named ``DW_CFA_val_offset_uf`` since the displacement
> - is unsigned factored.
> -
> - .. note::
> -
> - An alternative is to define ``DW_CFA_val_offset`` to implicitly use the
> - target architecture default address space, and add another operation that
> - specifies the address space.
> -
> -7. ``DW_CFA_val_offset_sf``
> -
> - The ``DW_CFA_val_offset_sf`` instruction takes two operands: an unsigned
> - LEB128 value representing a register number R and a signed LEB128 factored
> - displacement B. This instruction is identical to ``DW_CFA_val_offset``
> - except that B is signed.
> -
> -8. ``DW_CFA_register``
> -
> - The ``DW_CFA_register`` instruction takes two unsigned LEB128 operands
> - representing register numbers R1 and R2 respectively. The required action is
> - to set the rule for the register specified by R1 to be a *register(R2)* rule.
> -
> -9. ``DW_CFA_expression``
> -
> - The ``DW_CFA_expression`` instruction takes two operands: an unsigned LEB128
> - value representing a register number R, and a ``DW_FORM_block`` value
> - representing a DWARF operation expression E. The required action is to
> - change the rule for the register specified by R to be an *expression(E)*
> - rule.
> -
> - *That is, E computes the location description where the register value can
> - be retrieved.*
> -
> - *See* :ref:`amdgpu-dwarf-call-frame-instructions` *regarding restrictions on
> - the DWARF expression operations that can be used in E.*
> -
> -10. ``DW_CFA_val_expression``
> -
> - The ``DW_CFA_val_expression`` instruction takes two operands: an unsigned
> - LEB128 value representing a register number R, and a ``DW_FORM_block`` value
> - representing a DWARF operation expression E. The required action is to
> - change the rule for the register specified by R to be a *val_expression(E)*
> - rule.
> -
> - *That is, E computes the value of register R.*
> -
> - *See* :ref:`amdgpu-dwarf-call-frame-instructions` *regarding restrictions on
> - the DWARF expression operations that can be used in E.*
> -
> - If the result of evaluating E is not a value with a base type size that
> - matches the register size, then the DWARF is ill-formed.
> -
> -11. ``DW_CFA_restore``
> -
> - The ``DW_CFA_restore`` instruction takes a single operand (encoded with the
> - opcode) that represents a register number R. The required action is to
> - change the rule for the register specified by R to the rule assigned it by
> - the ``initial_instructions`` in the CIE.
> -
> -12. ``DW_CFA_restore_extended``
> -
> - The ``DW_CFA_restore_extended`` instruction takes a single unsigned LEB128
> - operand that represents a register number R. This instruction is identical
> - to ``DW_CFA_restore`` except for the encoding and size of the register
> - operand.
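A small sketch of what "encoded with the opcode" means for ``DW_CFA_offset`` in the register rule list above. The high 2 bits values (0x2 for ``DW_CFA_offset``, 0x3 for ``DW_CFA_restore``) are the standard DWARF Version 5 encodings; the function names are hypothetical and only the single-byte operand cases are handled fully.

```python
DW_CFA_OFFSET_HIGH2 = 0x2   # DW_CFA_offset: register in the low 6 bits
DW_CFA_RESTORE_HIGH2 = 0x3  # DW_CFA_restore: register in the low 6 bits


def decode_uleb128(data, pos=0):
    """Decode an unsigned LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):
            return result, pos


def decode_offset_rule(data, data_alignment_factor):
    """Hypothetical helper: return (R, N) for the offset(N) rule."""
    high2 = data[0] >> 6
    reg = data[0] & 0x3F  # register number shares the opcode byte
    if high2 != DW_CFA_OFFSET_HIGH2:
        raise ValueError("not a DW_CFA_offset instruction")
    factored, _ = decode_uleb128(data, 1)
    # The unsigned operand is factored by data_alignment_factor.
    return reg, factored * data_alignment_factor
```

For example, the two bytes ``0x86 0x02`` with a ``data_alignment_factor`` of -8 would describe register 6 saved at CFA-16. ``DW_CFA_offset_extended`` differs only in carrying R as a separate ULEB128 operand.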
> + information on the DWARF produced by the AMDGPU backend.
>
> -Row State Instructions
> -######################
> +``.dynamic``, ``.dynstr``, ``.dynsym``, ``.hash``
> + The standard sections used by a dynamic loader.
>
> -.. note::
> +``.note``
> + See :ref:`amdgpu-note-records` for the note records supported by the AMDGPU
> + backend.
>
> - These instructions are the same as in DWARF Version 5 section 6.4.2.4.
> +``.rela``\ *name*, ``.rela.dyn``
> + For relocatable code objects, *name* is the name of the section that the
> + relocation records apply. For example, ``.rela.text`` is the section name for
> + relocation records associated with the ``.text`` section.
>
> -Padding Instruction
> -###################
> + For linked shared code objects, ``.rela.dyn`` contains all the relocation
> + records from each of the relocatable code object's ``.rela``\ *name* sections.
>
> -.. note::
> + See :ref:`amdgpu-relocation-records` for the relocation records supported by
> + the AMDGPU backend.
>
> - These instructions are the same as in DWARF Version 5 section 6.4.2.5.
> +``.text``
> + The executable machine code for the kernels and functions they call. Generated
> + as position independent code. See :ref:`amdgpu-code-conventions` for
> + information on conventions used in the isa generation.
>
> -Call Frame Instruction Usage
> -++++++++++++++++++++++++++++
> +.. _amdgpu-note-records:
>
> -.. note::
> +Note Records
> +------------
>
> - The same as in DWARF Version 5 section 6.4.3.
> +The AMDGPU backend code object contains ELF note records in the ``.note``
> +section. The set of generated notes and their semantics depend on the code
> +object version; see :ref:`amdgpu-note-records-v2` and
> +:ref:`amdgpu-note-records-v3`.
>
> -.. _amdgpu-dwarf-call-frame-calling-address:
> +As required by ``ELFCLASS32`` and ``ELFCLASS64``, minimal zero-byte padding
> +must be generated after the ``name`` field to ensure the ``desc`` field is 4
> +byte aligned. In addition, minimal zero-byte padding must be generated to
> +ensure the ``desc`` field size is a multiple of 4 bytes. The ``sh_addralign``
> +field of the ``.note`` section must be at least 4 to indicate at least 8 byte
> +alignment.
>
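To make the padding rule in the quoted paragraph concrete, here is a minimal Python sketch of packing one ELF note record. It assumes the usual ELF note layout (``namesz``, ``descsz`` and ``type`` as 4-byte words, with ``namesz`` counting the terminating NUL) and little-endian byte order; ``build_note`` and ``pad4`` are made-up names, not anything from the AMDGPU backend.

```python
import struct


def pad4(data):
    """Append the minimal zero-byte padding to reach a multiple of 4."""
    return data + b"\0" * (-len(data) % 4)


def build_note(name, note_type, desc):
    """Pack one note record: header, padded name, padded desc."""
    name_bytes = name.encode("ascii") + b"\0"  # namesz includes the NUL
    # The sizes stored in the header are the unpadded sizes.
    header = struct.pack("<III", len(name_bytes), len(desc), note_type)
    return header + pad4(name_bytes) + pad4(desc)


# Example: a Code Object V3 style note (name "AMDGPU", type 32) with a
# one-byte placeholder payload.
note = build_note("AMDGPU", 32, b"\x80")
```

With the 7-byte name padded to 8 and the 1-byte payload padded to 4, the record is 24 bytes long and the ``desc`` field starts on a 4-byte boundary.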
> -Call Frame Calling Address
> -++++++++++++++++++++++++++
> +.. _amdgpu-note-records-v2:
>
> -.. note::
> +Code Object V2 Note Records (-mattr=-code-object-v3)
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> - The same as in DWARF Version 5 section 6.4.4.
> +.. warning:: Code Object V2 is not the default code object version emitted by
> + this version of LLVM. For a description of the notes generated with the
> + default configuration (Code Object V3) see :ref:`amdgpu-note-records-v3`.
>
> -Data Representation
> --------------------
> +The AMDGPU backend code object uses the following ELF note record in the
> +``.note`` section when compiling for Code Object V2 (-mattr=-code-object-v3).
>
> -.. _amdgpu-dwarf-32-bit-and-64-bit-dwarf-formats:
> +Additional note records may be present, but any which are not documented here
> +are deprecated and should not be used.
>
> -32-Bit and 64-Bit DWARF Formats
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> + .. table:: AMDGPU Code Object V2 ELF Note Records
> + :name: amdgpu-elf-note-records-table-v2
>
> -.. note::
> + ===== ============================== ======================================
> + Name Type Description
> + ===== ============================== ======================================
> + "AMD" ``NT_AMD_AMDGPU_HSA_METADATA`` <metadata null terminated string>
> + ===== ============================== ======================================
>
> - This augments DWARF Version 5 section 7.4.
> -
> -1. Within the body of the ``.debug_info`` section, certain forms of attribute
> - value depend on the choice of DWARF format as follows. For the 32-bit DWARF
> - format, the value is a 4-byte unsigned integer; for the 64-bit DWARF format,
> - the value is an 8-byte unsigned integer.
> -
> - .. table:: ``.debug_info`` section attribute form roles
> - :name: amdgpu-dwarf-debug-info-section-attribute-form-roles-table
> -
> - ================================== ===================================
> - Form Role
> - ================================== ===================================
> - DW_FORM_line_strp offset in ``.debug_line_str``
> - DW_FORM_ref_addr offset in ``.debug_info``
> - DW_FORM_sec_offset offset in a section other than
> - ``.debug_info`` or ``.debug_str``
> - DW_FORM_strp offset in ``.debug_str``
> - DW_FORM_strp_sup offset in ``.debug_str`` section of
> - supplementary object file
> - DW_OP_call_ref offset in ``.debug_info``
> - DW_OP_implicit_pointer offset in ``.debug_info``
> - DW_OP_LLVM_aspace_implicit_pointer offset in ``.debug_info``
> - ================================== ===================================
> -
> -Format of Debugging Information
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> -
> -Attribute Encodings
> -+++++++++++++++++++
> +..
>
> -.. note::
> + .. table:: AMDGPU Code Object V2 ELF Note Record Enumeration Values
> + :name: amdgpu-elf-note-record-enumeration-values-table-v2
>
> - This augments DWARF Version 5 section 7.5.4 and Table 7.5.
> + ============================== =====
> + Name Value
> + ============================== =====
> + *reserved* 0-9
> + ``NT_AMD_AMDGPU_HSA_METADATA`` 10
> + *reserved* 11
> + ============================== =====
>
> -The following table gives the encoding of the additional debugging information
> -entry attributes.
> +``NT_AMD_AMDGPU_HSA_METADATA``
> + Specifies extensible metadata associated with the code objects executed on HSA
> + [HSA]_ compatible runtimes such as AMD's ROCm [AMD-ROCm]_. It is required when
> + the target triple OS is ``amdhsa`` (see :ref:`amdgpu-target-triples`). See
> + :ref:`amdgpu-amdhsa-code-object-metadata-v2` for the syntax of the code
> + object metadata string.
>
> -.. table:: Attribute encodings
> - :name: amdgpu-dwarf-attribute-encodings-table
> +.. _amdgpu-note-records-v3:
>
> - ================================== ===== ====================================
> - Attribute Name Value Classes
> - ================================== ===== ====================================
> - DW_AT_LLVM_active_lane *TBD* exprloc, loclist
> - DW_AT_LLVM_augmentation *TBD* string
> - DW_AT_LLVM_lanes *TBD* constant
> - DW_AT_LLVM_lane_pc *TBD* exprloc, loclist
> - DW_AT_LLVM_vector_size *TBD* constant
> - ================================== ===== ====================================
> +Code Object V3 Note Records (-mattr=+code-object-v3)
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> -DWARF Expressions
> -~~~~~~~~~~~~~~~~~
> +The AMDGPU backend code object uses the following ELF note record in the
> +``.note`` section when compiling for Code Object V3 (-mattr=+code-object-v3).
>
> -.. note::
> +Additional note records may be present, but any which are not documented here
> +are deprecated and should not be used.
>
> - Rename DWARF Version 5 section 7.7 to reflect the unification of location
> - descriptions into DWARF expressions.
> + .. table:: AMDGPU Code Object V3 ELF Note Records
> + :name: amdgpu-elf-note-records-table-v3
>
> -Operation Expressions
> -+++++++++++++++++++++
> + ======== ============================== ======================================
> + Name Type Description
> + ======== ============================== ======================================
> + "AMDGPU" ``NT_AMDGPU_METADATA`` Metadata in Message Pack [MsgPack]_
> + binary format.
> + ======== ============================== ======================================
>
> -.. note::
> +..
>
> - Rename DWARF Version 5 section 7.7.1 and delete section 7.7.2 to reflect the
> - unification of location descriptions into DWARF expressions.
> + .. table:: AMDGPU Code Object V3 ELF Note Record Enumeration Values
> + :name: amdgpu-elf-note-record-enumeration-values-table-v3
>
> - This augments DWARF Version 5 section 7.7.1 and Table 7.9.
> + ============================== =====
> + Name Value
> + ============================== =====
> + *reserved* 0-31
> + ``NT_AMDGPU_METADATA`` 32
> + ============================== =====
>
> -The following table gives the encoding of the additional DWARF expression
> -operations.
> +``NT_AMDGPU_METADATA``
> + Specifies extensible metadata associated with an AMDGPU code
> + object. It is encoded as a map in the Message Pack [MsgPack]_ binary
> + data format. See :ref:`amdgpu-amdhsa-code-object-metadata-v3` for the
> + map keys defined for the ``amdhsa`` OS.
>
> -.. table:: DWARF Operation Encodings
> - :name: amdgpu-dwarf-operation-encodings-table
> -
> - ================================== ===== ======== ===============================
> - Operation Code Number Notes
> - of
> - Operands
> - ================================== ===== ======== ===============================
> - DW_OP_LLVM_form_aspace_address 0xe1 0
> - DW_OP_LLVM_push_lane 0xe2 0
> - DW_OP_LLVM_offset 0xe3 0
> - DW_OP_LLVM_offset_constu 0xe4 1 ULEB128 byte displacement
> - DW_OP_LLVM_bit_offset 0xe5 0
> - DW_OP_LLVM_call_frame_entry_reg 0xe6 1 ULEB128 register number
> - DW_OP_LLVM_undefined 0xe7 0
> - DW_OP_LLVM_aspace_bregx 0xe8 2 ULEB128 register number,
> - ULEB128 byte displacement
> - DW_OP_LLVM_aspace_implicit_pointer 0xe9 2 4- or 8-byte offset of DIE,
> - SLEB128 byte displacement
> - DW_OP_LLVM_piece_end 0xea 0
> - DW_OP_LLVM_extend 0xeb 2 ULEB128 bit size,
> - ULEB128 count
> - DW_OP_LLVM_select_bit_piece 0xec 2 ULEB128 bit size,
> - ULEB128 count
> - ================================== ===== ======== ===============================
> -
> -Location List Expressions
> -+++++++++++++++++++++++++
> +.. _amdgpu-symbols:
>
> -.. note::
> +Symbols
> +-------
>
> - Rename DWARF Version 5 section 7.7.3 to reflect that location lists are a kind
> - of DWARF expression.
> +Symbols include the following:
>
> -Source Languages
> -~~~~~~~~~~~~~~~~
> + .. table:: AMDGPU ELF Symbols
> + :name: amdgpu-elf-symbols-table
>
> -.. note::
> + ===================== ================== ================ ==================
> + Name Type Section Description
> + ===================== ================== ================ ==================
> + *link-name* ``STT_OBJECT`` - ``.data`` Global variable
> + - ``.rodata``
> + - ``.bss``
> + *link-name*\ ``.kd`` ``STT_OBJECT`` - ``.rodata`` Kernel descriptor
> + *link-name* ``STT_FUNC`` - ``.text`` Kernel entry point
> + *link-name* ``STT_OBJECT`` - SHN_AMDGPU_LDS Global variable in LDS
> + ===================== ================== ================ ==================
>
> - This augments DWARF Version 5 section 7.12 and Table 7.17.
> +Global variable
> + Global variables both used and defined by the compilation unit.
>
> -The following table gives the encoding of the additional DWARF languages.
> + If the symbol is defined in the compilation unit then it is allocated in the
> + appropriate section according to if it has initialized data or is readonly.
>
> -.. table:: Language encodings
> - :name: amdgpu-dwarf-language-encodings-table
> + If the symbol is external then its section is ``STN_UNDEF`` and the loader
> + will resolve relocations using the definition provided by another code object
> + or explicitly defined by the runtime.
>
> - ==================== ====== ===================
> - Language Name Value Default Lower Bound
> - ==================== ====== ===================
> - ``DW_LANG_LLVM_HIP`` 0x8100 0
> - ==================== ====== ===================
> + If the symbol resides in local/group memory (LDS) then its section is the
> + special processor specific section name ``SHN_AMDGPU_LDS``, and the
> + ``st_value`` field describes alignment requirements as it does for common
> + symbols.
>
> -Address Class and Address Space Encodings
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> + .. TODO::
>
> -.. note::
> + Add description of linked shared object symbols. Seems undefined symbols
> + are marked as STT_NOTYPE.
>
> - This replaces DWARF Version 5 section 7.13.
> +Kernel descriptor
> + Every HSA kernel has an associated kernel descriptor. It is the address of the
> + kernel descriptor that is used in the AQL dispatch packet used to invoke the
> + kernel, not the kernel entry point. The layout of the HSA kernel descriptor is
> + defined in :ref:`amdgpu-amdhsa-kernel-descriptor`.
>
> -The encodings of the constants used for the currently defined address classes
> -are given in :ref:`amdgpu-dwarf-address-class-encodings-table`.
> +Kernel entry point
> + Every HSA kernel also has a symbol for its machine code entry point.
>
> -.. table:: Address class encodings
> - :name: amdgpu-dwarf-address-class-encodings-table
> +.. _amdgpu-relocation-records:
>
> - ========================== ======
> - Address Class Name Value
> - ========================== ======
> - ``DW_ADDR_none`` 0x0000
> - ``DW_ADDR_LLVM_global`` 0x0001
> - ``DW_ADDR_LLVM_constant`` 0x0002
> - ``DW_ADDR_LLVM_group`` 0x0003
> - ``DW_ADDR_LLVM_private`` 0x0004
> - ``DW_ADDR_LLVM_lo_user`` 0x8000
> - ``DW_ADDR_LLVM_hi_user`` 0xffff
> - ========================== ======
> +Relocation Records
> +------------------
>
> -Line Number Information
> -~~~~~~~~~~~~~~~~~~~~~~~
> +AMDGPU backend generates ``Elf64_Rela`` relocation records. Supported
> +relocatable fields are:
>
> -.. note::
> +``word32``
> + This specifies a 32-bit field occupying 4 bytes with arbitrary byte
> + alignment. These values use the same byte order as other word values in the
> + AMDGPU architecture.
>
> - This augments DWARF Version 5 section 7.22 and Table 7.27.
> +``word64``
> + This specifies a 64-bit field occupying 8 bytes with arbitrary byte
> + alignment. These values use the same byte order as other word values in the
> + AMDGPU architecture.
>
> -The following table gives the encoding of the additional line number header
> -entry formats.
> +Following notations are used for specifying relocation calculations:
>
> -.. table:: Line number header entry format encodings
> - :name: amdgpu-dwarf-line-number-header-entry-format-encodings-table
> +**A**
> + Represents the addend used to compute the value of the relocatable field.
>
> - ==================================== ====================
> - Line number header entry format name Value
> - ==================================== ====================
> - ``DW_LNCT_LLVM_source`` 0x2001
> - ``DW_LNCT_LLVM_is_MD5`` 0x2002
> - ==================================== ====================
> +**G**
> + Represents the offset into the global offset table at which the relocation
> + entry's symbol will reside during execution.
>
> -Call Frame Information
> -~~~~~~~~~~~~~~~~~~~~~~
> +**GOT**
> + Represents the address of the global offset table.
>
> -.. note::
> +**P**
> + Represents the place (section offset for ``et_rel`` or address for ``et_dyn``)
> + of the storage unit being relocated (computed using ``r_offset``).
>
> - This augments DWARF Version 5 section 7.24 and Table 7.29.
> +**S**
> + Represents the value of the symbol whose index resides in the relocation
> + entry. Relocations not using this must specify a symbol index of
> + ``STN_UNDEF``.
>
> -The following table gives the encoding of the additional call frame information
> -instructions.
> +**B**
> + Represents the base address of a loaded executable or shared object which is
> + the difference between the ELF address and the actual load address.
> + Relocations using this are only valid in executable or shared objects.
>
> -.. table:: Call frame instruction encodings
> - :name: amdgpu-dwarf-call-frame-instruction-encodings-table
> +The following relocation types are supported:
>
> - ======================== ====== ====== ================ ================ ================
> - Instruction High 2 Low 6 Operand 1 Operand 2 Operand 3
> - Bits Bits
> - ======================== ====== ====== ================ ================ ================
> - DW_CFA_def_aspace_cfa 0 0x2f ULEB128 register ULEB128 offset ULEB128 address space
> - DW_CFA_def_aspace_cfa_sf 0 0x30 ULEB128 register SLEB128 offset ULEB128 address space
> - ======================== ====== ====== ================ ================ ================
> + .. table:: AMDGPU ELF Relocation Records
> + :name: amdgpu-elf-relocation-records-table
>
> -Attributes by Tag Value (Informative)
> --------------------------------------
> + ========================== ======= ===== ========== ==============================
> + Relocation Type Kind Value Field Calculation
> + ========================== ======= ===== ========== ==============================
> + ``R_AMDGPU_NONE`` 0 *none* *none*
> + ``R_AMDGPU_ABS32_LO`` Static, 1 ``word32`` (S + A) & 0xFFFFFFFF
> + Dynamic
> + ``R_AMDGPU_ABS32_HI`` Static, 2 ``word32`` (S + A) >> 32
> + Dynamic
> + ``R_AMDGPU_ABS64`` Static, 3 ``word64`` S + A
> + Dynamic
> + ``R_AMDGPU_REL32`` Static 4 ``word32`` S + A - P
> + ``R_AMDGPU_REL64`` Static 5 ``word64`` S + A - P
> + ``R_AMDGPU_ABS32`` Static, 6 ``word32`` S + A
> + Dynamic
> + ``R_AMDGPU_GOTPCREL`` Static 7 ``word32`` G + GOT + A - P
> + ``R_AMDGPU_GOTPCREL32_LO`` Static 8 ``word32`` (G + GOT + A - P) & 0xFFFFFFFF
> + ``R_AMDGPU_GOTPCREL32_HI`` Static 9 ``word32`` (G + GOT + A - P) >> 32
> + ``R_AMDGPU_REL32_LO`` Static 10 ``word32`` (S + A - P) & 0xFFFFFFFF
> + ``R_AMDGPU_REL32_HI`` Static 11 ``word32`` (S + A - P) >> 32
> + *reserved* 12
> + ``R_AMDGPU_RELATIVE64`` Dynamic 13 ``word64`` B + A
> + ========================== ======= ===== ========== ==============================
>
> -.. note::
> +``R_AMDGPU_ABS32_LO`` and ``R_AMDGPU_ABS32_HI`` are only supported by
> +the ``mesa3d`` OS, which does not support ``R_AMDGPU_ABS64``.
>
> - This augments DWARF Version 5 Appendix A and Table A.1.
> -
> -The following table provides the additional attributes that are applicable to
> -debugger information entries.
> -
> -.. table:: Attributes by tag value
> - :name: amdgpu-dwarf-attributes-by-tag-value-table
> -
> - ============================= =============================
> - Tag Name Applicable Attributes
> - ============================= =============================
> - ``DW_TAG_base_type`` * ``DW_AT_LLVM_vector_size``
> - ``DW_TAG_compile_unit`` * ``DW_AT_LLVM_augmentation``
> - ``DW_TAG_entry_point`` * ``DW_AT_LLVM_active_lane``
> - * ``DW_AT_LLVM_lane_pc``
> - * ``DW_AT_LLVM_lanes``
> - ``DW_TAG_inlined_subroutine`` * ``DW_AT_LLVM_active_lane``
> - * ``DW_AT_LLVM_lane_pc``
> - * ``DW_AT_LLVM_lanes``
> - ``DW_TAG_subprogram`` * ``DW_AT_LLVM_active_lane``
> - * ``DW_AT_LLVM_lane_pc``
> - * ``DW_AT_LLVM_lanes``
> - ============================= =============================
> +There is no current OS loader support for 32-bit programs and so
> +``R_AMDGPU_ABS32`` is not used.
>
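The calculation column of the relocation table quoted above can be written out as a short Python sketch. This is an illustration only: ``relocation_value`` is a hypothetical name, the arithmetic is assumed to wrap at 64 bits, and truncating the result to the field width (with no overflow checking) is my assumption rather than something the text specifies.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def relocation_value(rtype, S=0, A=0, P=0, G=0, GOT=0, B=0):
    """Return the value stored in the relocatable field for rtype."""
    # Wrap to 64 bits first so the HI variants work for negative results.
    abs_ = (S + A) & MASK64
    rel = (S + A - P) & MASK64
    got = (G + GOT + A - P) & MASK64
    table = {
        "R_AMDGPU_ABS32_LO": abs_ & MASK32,
        "R_AMDGPU_ABS32_HI": abs_ >> 32,
        "R_AMDGPU_ABS64": abs_,
        "R_AMDGPU_REL32": rel & MASK32,
        "R_AMDGPU_REL64": rel,
        "R_AMDGPU_ABS32": abs_ & MASK32,
        "R_AMDGPU_GOTPCREL": got & MASK32,
        "R_AMDGPU_GOTPCREL32_LO": got & MASK32,
        "R_AMDGPU_GOTPCREL32_HI": got >> 32,
        "R_AMDGPU_REL32_LO": rel & MASK32,
        "R_AMDGPU_REL32_HI": rel >> 32,
        "R_AMDGPU_RELATIVE64": (B + A) & MASK64,
    }
    return table[rtype]
```

The ``_LO``/``_HI`` pairs split one 64-bit result across two ``word32`` fields, so a backward PC-relative reference yields an all-ones high word.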
> .. _amdgpu-dwarf-debug-information:
>
> @@ -4791,9 +1108,9 @@ DWARF Debug Information
> AMDGPU generates DWARF [DWARF]_ debugging information ELF sections (see
> :ref:`amdgpu-elf-code-object`) which contain information that maps the code
> object executable code and data to the source language constructs. It can be
> -used by tools such as debuggers and profilers. It uses features defined in the
> -:ref:`amdgpu-dwarf-6-proposal-for-heterogeneous-debugging` that are made
> -available in DWARF Version 4 and DWARF Version 5 as an LLVM vendor extension.
> +used by tools such as debuggers and profilers. It uses features defined in
> +:doc:`AMDGPUDwarfProposalForHeterogeneousDebugging` that are made available in
> +DWARF Version 4 and DWARF Version 5 as an LLVM vendor extension.
>
> This section defines the AMDGPU target architecture specific DWARF mappings.
>
> @@ -10658,23 +6975,6 @@ This section describes general syntax for instructions and operands.
> Instructions
> ~~~~~~~~~~~~
>
> -.. toctree::
> - :hidden:
> -
> - AMDGPU/AMDGPUAsmGFX7
> - AMDGPU/AMDGPUAsmGFX8
> - AMDGPU/AMDGPUAsmGFX9
> - AMDGPU/AMDGPUAsmGFX900
> - AMDGPU/AMDGPUAsmGFX904
> - AMDGPU/AMDGPUAsmGFX906
> - AMDGPU/AMDGPUAsmGFX908
> - AMDGPU/AMDGPUAsmGFX10
> - AMDGPU/AMDGPUAsmGFX1011
> - AMDGPUModifierSyntax
> - AMDGPUOperandSyntax
> - AMDGPUInstructionSyntax
> - AMDGPUInstructionNotation
> -
> An instruction has the following :doc:`syntax<AMDGPUInstructionSyntax>`:
>
> | ``<``\ *opcode*\ ``> <``\ *operand0*\ ``>, <``\ *operand1*\ ``>,...
> @@ -11442,24 +7742,23 @@ effort required to accurately calculate GPR usage.
> Additional Documentation
> ========================
>
> -.. [AMD-RADEON-HD-2000-3000] `AMD R6xx shader ISA <http://developer.amd.com/wordpress/media/2012/10/R600_Instruction_Set_Architecture.pdf>`__
> -.. [AMD-RADEON-HD-4000] `AMD R7xx shader ISA <http://developer.amd.com/wordpress/media/2012/10/R700-Family_Instruction_Set_Architecture.pdf>`__
> -.. [AMD-RADEON-HD-5000] `AMD Evergreen shader ISA <http://developer.amd.com/wordpress/media/2012/10/AMD_Evergreen-Family_Instruction_Set_Architecture.pdf>`__
> -.. [AMD-RADEON-HD-6000] `AMD Cayman/Trinity shader ISA <http://developer.amd.com/wordpress/media/2012/10/AMD_HD_6900_Series_Instruction_Set_Architecture.pdf>`__
> .. [AMD-GCN-GFX6] `AMD Southern Islands Series ISA <http://developer.amd.com/wordpress/media/2012/12/AMD_Southern_Islands_Instruction_Set_Architecture.pdf>`__
> .. [AMD-GCN-GFX7] `AMD Sea Islands Series ISA <http://developer.amd.com/wordpress/media/2013/07/AMD_Sea_Islands_Instruction_Set_Architecture.pdf>`_
> .. [AMD-GCN-GFX8] `AMD GCN3 Instruction Set Architecture <http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf>`__
> .. [AMD-GCN-GFX9] `AMD "Vega" Instruction Set Architecture <http://developer.amd.com/wordpress/media/2013/12/Vega_Shader_ISA_28July2017.pdf>`__
> .. [AMD-GCN-GFX10] `AMD "RDNA 1.0" Instruction Set Architecture <https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf>`__
> -.. [AMD-ROCm] `ROCm: Open Platform for Development, Discovery and Education Around GPU Computing <http://gpuopen.com/compute-product/rocm/>`__
> +.. [AMD-RADEON-HD-2000-3000] `AMD R6xx shader ISA <http://developer.amd.com/wordpress/media/2012/10/R600_Instruction_Set_Architecture.pdf>`__
> +.. [AMD-RADEON-HD-4000] `AMD R7xx shader ISA <http://developer.amd.com/wordpress/media/2012/10/R700-Family_Instruction_Set_Architecture.pdf>`__
> +.. [AMD-RADEON-HD-5000] `AMD Evergreen shader ISA <http://developer.amd.com/wordpress/media/2012/10/AMD_Evergreen-Family_Instruction_Set_Architecture.pdf>`__
> +.. [AMD-RADEON-HD-6000] `AMD Cayman/Trinity shader ISA <http://developer.amd.com/wordpress/media/2012/10/AMD_HD_6900_Series_Instruction_Set_Architecture.pdf>`__
> +.. [AMD-ROCm] `AMD ROCm Platform <https://rocm-documentation.readthedocs.io>`__
> .. [AMD-ROCm-github] `ROCm github <http://github.com/RadeonOpenCompute>`__
> -.. [HSA] `Heterogeneous System Architecture (HSA) Foundation <http://www.hsafoundation.com/>`__
> -.. [HIP] `HIP Programming Guide <https://rocm-documentation.readthedocs.io/en/latest/Programming_Guides/Programming-Guides.html#hip-programing-guide>`__
> -.. [ELF] `Executable and Linkable Format (ELF) <http://www.sco.com/developers/gabi/>`__
> +.. [CLANG-ATTR] `Attributes in Clang <https://clang.llvm.org/docs/AttributeReference.html>`__
> .. [DWARF] `DWARF Debugging Information Format <http://dwarfstd.org/>`__
> -.. [YAML] `YAML Ain't Markup Language (YAML™) Version 1.2 <http://www.yaml.org/spec/1.2/spec.html>`__
> +.. [ELF] `Executable and Linkable Format (ELF) <http://www.sco.com/developers/gabi/>`__
> +.. [HRF] `Heterogeneous-race-free Memory Models <http://benedictgaster.org/wp-content/uploads/2014/01/asplos269-FINAL.pdf>`__
> +.. [HSA] `Heterogeneous System Architecture (HSA) Foundation <http://www.hsafoundation.com/>`__
> .. [MsgPack] `Message Pack <http://www.msgpack.org/>`__
> -.. [SEMVER] `Semantic Versioning <https://semver.org/>`__
> .. [OpenCL] `The OpenCL Specification Version 2.0 <http://www.khronos.org/registry/cl/specs/opencl-2.0.pdf>`__
> -.. [HRF] `Heterogeneous-race-free Memory Models <http://benedictgaster.org/wp-content/uploads/2014/01/asplos269-FINAL.pdf>`__
> -.. [CLANG-ATTR] `Attributes in Clang <https://clang.llvm.org/docs/AttributeReference.html>`__
> +.. [SEMVER] `Semantic Versioning <https://semver.org/>`__
> +.. [YAML] `YAML Ain't Markup Language (YAML™) Version 1.2 <http://www.yaml.org/spec/1.2/spec.html>`__
>
> diff --git a/llvm/docs/UserGuides.rst b/llvm/docs/UserGuides.rst
> index 5673ae65cce9..af0d5ade66bf 100644
> --- a/llvm/docs/UserGuides.rst
> +++ b/llvm/docs/UserGuides.rst
> @@ -192,4 +192,8 @@ Additional Topics
> This document describes using the NVPTX backend to compile GPU kernels.
>
> :doc:`AMDGPUUsage`
> - This document describes using the AMDGPU backend to compile GPU kernels.
> \ No newline at end of file
> + This document describes using the AMDGPU backend to compile GPU kernels.
> +
> +:doc:`AMDGPUDwarfProposalForHeterogeneousDebugging`
> + This document describes a DWARF proposal to support heterogeneous debugging
> + for targets such as the AMDGPU backend.
> \ No newline at end of file
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits