[PATCH] D138869: [Docs][RFC] Add AMDGPU LLVM Extensions for Heterogeneous Debugging

Tue Nov 29 12:31:28 PST 2022

scott.linder added a comment.

In D138869#3956641 <https://reviews.llvm.org/D138869#3956641>, @probinson wrote:

> This is a very large document and I admit I've only briefly skimmed it. But I wonder whether the redesign of DIExpr in terms of new DIOpXyz (instead of DW_OP_Xyz) is introducing too many trees to let the forest be visible.  Finding a way to split this up into more digestible and reviewable chunks would be really wonderful.  The DIExpr redesign in particular (I'd expect) would be incredibly intrusive to implement, and I think you'd need to demonstrate that this redesign is absolutely necessary (cannot be implemented with the existing DWARF-like expressions perhaps augmented with some new DW_OP_LLVM_yyy operators).

The biggest reason for the switch is to enable a more type-safe, extensible, and ergonomic interface to the expressions. As it stands, there is no clean abstraction for the expression: it is transparently just a `std::vector<uint64_t>`. There are some methods on `DIExpression` and some free functions in `DebugInfoMetadata.h` that can more-or-less spare the developer from having to deal directly in this representation, but it is leaky abstraction.

Even if we discount the benefits in ergonomics, there is still an issue with composability: because the fundamental unit of abstraction is a `DIExpression *` and we can only operate on it via methods and free functions, those must return a fully `unique`d `DIExpression *`, even if the user intends to compose many operations to arrive at the final expression they want. For example, in `PrologEpilogInserter.cpp`:

  unsigned PrependFlags = DIExpression::ApplyOffset;
  if (!MI.isIndirectDebugValue() && !DIExpr->isComplex())
    PrependFlags |= DIExpression::StackValue;
  if (MI.isIndirectDebugValue() && DIExpr->isImplicit()) {
    SmallVector<uint64_t, 2> Ops = {dwarf::DW_OP_deref_size, Size};
    bool WithStackValue = true;
    DIExpr = DIExpression::prependOpcodes(DIExpr, Ops, WithStackValue);
    // Make the DBG_VALUE direct.
    MI.getDebugOffset().ChangeToRegister(0, false);
  }
  DIExpr = TRI.prependOffsetExpression(DIExpr, PrependFlags, Offset);

The intermediate `DIExpression *` returned from the call to `prependOpcodes` goes through the whole `unique`ing mechanism and is then immediately discarded once used as input to `prependOffsetExpression`. The caller (who just wants to write their pass, not fiddle with debug info) either needs to go factor out the implementation of these two methods into something that operates on `std::vector<uint64_t>`, and then combine them into a new `prependOpcodesAndThenPrependOffsetExpression`, or else expose the clunky `std::vector<uint64_t>` representation directly in their pass code. I don't think either solution is acceptable, which means we just have objectively worse code every time. For example, this code in `TargetInstrInfo.cpp` deals with `std::vector<uint64_t>` to avoid the performance cost:

  SmallVector<uint64_t, 8> Ops;

Our solution was to introduce an abstraction for both:

- An "at rest", `unique`d expression (`DIExpr`)
- A "mutable" expression builder (`DIExprBuilder`)

Currently we have the same underlying representation for both, so going from `DIExpr` to `DIExprBuilder` is a `memcpy` and going from `DIExprBuilder` to `DIExpr` is a pointer copy, but we believe this leaves us the room to improve e.g. the compactness of the "at rest" representation at the expense of some extra work to interconvert, and we will be able to do this without changing any client code.

Once we arrived at this design, we realized that even if we stuck with `DIExpression` and added a `DIExpressionBuilder` we needed to disrupt nearly all code which deals with expressions to use the new interfaces. At that point, we decided it was worth also striving for the ergonomic benefits of doing away with the `std::vector<uint64_t>` completely.

  if 
  if (MI.isIndirectDebugValue() && DIExpr->isImplicit()) {
    SmallVector<uint64_t, 2> Ops = {dwarf::DW_OP_deref_size, Size};
    bool WithStackValue = true;
    DIExpr = DIExpression::prependOpcodes(DIExpr, Ops, WithStackValue);
    // Make the DBG_VALUE direct.
    MI.getDebugOffset().ChangeToRegister(0, false);
  }
  DIExpr = TRI.prependOffsetExpression(DIExpr, PrependFlags, Offset);

> Based on the Dev Meeting talk, the explicit notions of fragment and lifetime sounded appealing, and worth looking at independent of any "heterogeneous" aspect.  AMDGPU's needs might have been the motivation, but the problems addressed by these notions are generic. I'd expect these solutions should be broadly applicable.  It's especially exciting to see ways forward to support multiple concurrent locations, something that already comes up as a problem and would greatly benefit the debugging experience even in a scalar or homogeneous environment.

================
Comment at: llvm/docs/AMDGPULLVMExtensionsForHeterogeneousDebugging.rst:44
+information expressions to reflect code transformations. The LLVM extensions’
+changes are possible as LLVM has no requirement for backwards compatibility, nor
+any requirement that the intermediate representation of debug information
----------------
probinson wrote:
> I wouldn't go that far.  LLVM does in fact promise to be able to read older versions of bitcode, necessarily including older versions of debug-info metadata.  so at the very least you'd need to be able to auto-upgrade older metadata into the new format.
It may be too strong, but the idea was that it is currently always "correct" to drop debug information. This is probably something that warrants a broader discussion, as that is admittedly a pretty unhelpful approach for anyone actually relying on these upgrades for anything with debug information, but I don't see the use case for upgrading debug info in bitcode at rest.

To be more precise, the situations I see are:

* Source is available, in which case the next run of the front-end "upgrades" you to the new debug info metadata; no compatibility concerns.
* Source is not available, in which case I don't immediately see what "debug info" would even be describing? Are there examples of bitcode maintained directly (i.e. "by hand") that also includes debug information? Is that debug information describing some hypothetical source language? Is it describing LLVM bitcode itself, as that is the source language?

Again, I don't doubt that there are cases where the upgrade is needed, I just can't identify them myself.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138869/new/

https://reviews.llvm.org/D138869