[PATCH] D83495: [DebugInfo] Add DWARF emission for DBG_VALUE_LIST

Scott Linder via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Jan 17 12:18:24 PST 2021


scott.linder added inline comments.


================
Comment at: llvm/test/DebugInfo/X86/dbg_value_list_emission.mir:51-55
+    ; (1) Check a single reg arg works.
+    DBG_VALUE_LIST !12, !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_stack_value), $eax, debug-location !15
+    ; CHECK:      DW_TAG_variable
+    ;  CHECK-NEXT:   (DW_OP_breg0 RAX+0, DW_OP_stack_value)
+    ;  CHECK-NEXT:   DW_AT_name ("locala")
----------------
StephenTozer wrote:
> scott.linder wrote:
> > Why does this behave differently than (what I understand to be) the equivalent `DBG_VALUE` ?
> > 
> > ```
> > DBG_VALUE $eax, $noreg, !12, !DIExpression(DW_OP_stack_value), debug-location !15
> > ; becomes: DW_AT_location (DW_OP_breg0 RAX+0, DW_OP_constu 0xffffffff, DW_OP_and, DW_OP_stack_value)
> > ```
> > 
> > It seems like the `DW_OP_and` is there to select a subregister (I assume EAX), but oddly it comes after the value of the register is already read (i.e. after the DW_OP_breg). I'm lost on what the intended behavior is, and why it differs between `DBG_VALUE` and `DBG_VALUE_LIST`.
> > 
> > There is also the existing confusion around the "isIndirect" flag in `DBG_VALUE` which makes these two equivalent (and both seemingly wrong):
> > 
> > ```
> > DBG_VALUE $eax, $noreg, !25, !DIExpression(DW_OP_stack_value), debug-location !15
> > ; becomes: DW_AT_location    (DW_OP_breg0 RAX+0, DW_OP_constu 0xffffffff, DW_OP_and, DW_OP_stack_value)
> >   
> > DBG_VALUE $eax, 0, !26, !DIExpression(DW_OP_stack_value), debug-location !15
> > ; becomes: DW_AT_location    (DW_OP_breg0 RAX+0, DW_OP_constu 0xffffffff, DW_OP_and, DW_OP_stack_value)
> > ```
> > 
> > which makes it harder still to compare.
> > 
> > 
> > Would it be more straightforward to always be explicit about indirection in the new form? Why does `DW_OP_stack_value` imply a `DW_OP_deref` at all? I.e. why do we not get:
> > 
> > ```
> > DBG_VALUE_LIST !12, !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_stack_value), $eax, debug-location !15
> >     ; CHECK:      DW_TAG_variable
> >     ;  CHECK-NEXT:   (DW_OP_reg RAX, DW_OP_stack_value)
> >     ;  CHECK-NEXT:   DW_AT_name ("locala")
> > ```
> > 
> > which in this case I imagine would just be an error. I would expect the correct expression to generate the `DW_OP_breg` would be something like:
> > 
> > ```
> > DBG_VALUE_LIST !12, !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_deref, DW_OP_stack_value), $eax, debug-location !15
> >     ; CHECK:      DW_TAG_variable
> >     ;  CHECK-NEXT:   (DW_OP_breg0 RAX+0, DW_OP_stack_value)
> >     ;  CHECK-NEXT:   DW_AT_name ("locala")
> > ```
> > 
> > If we don't do this, we seem to retain some of the same ambiguity that makes the old "isIndirect" field so confusing.
> To the first point: I'm looking into it now; I noticed the `DW_OP_and` before, but I'm not sure where it's coming from myself yet - the `DBG_VALUE_LIST` handling should be following essentially the same code path as `DBG_VALUE`, so this is a bug one way or another.
> 
> To the second point, I think your examples are slightly incorrect:
> 
> The `isIndirect` flag in `DBG_VALUE` is confusing and inconsistent, what it actually does is dependent on the DIExpression and not well explained. The `DBG_VALUE_LIST` implementation has no such inconsistencies however (I hope). The problem with your examples is that I think you're using `DW_OP_reg` to mean a register's literal value, and `DW_OP_breg` to mean the address pointed to by a register. This isn't quite correct, although they do act like that for most variable locations.
> 
> The actual meanings are slightly more complicated; the short answer is that `DW_OP_reg` is a Register location description: it refers to the register itself, not to the value of that register. `DW_OP_breg` on the other hand //does// refer to the literal value of a register; it's generally used with an offset as part of a Memory location expression, but if combined with `DW_OP_stack_value` then it gives the value in the register as the variable's value (albeit as an Implicit location rather than a Register location).
> 
> So with all of that said, the meaning of `(DW_OP_breg0 RAX+0, DW_OP_stack_value)` is that the variable's value can be found in `$rax`, but the variable should be read-only, which matches the meaning of the `DBG_VALUE_LIST`.
> The isIndirect flag in DBG_VALUE is confusing and inconsistent, what it actually does is dependent on the DIExpression and not well explained.

Agreed, and my point is that a similar issue applies to the interpretation of `DW_OP_LLVM_arg` in your patch, even with `isIndirect` gone (modulo the `assert` mentioned below, which only side-steps the issue).

> The problem with your examples is that I think you're using DW_OP_reg to mean a register's literal value, and DW_OP_breg to mean the address pointed to by a register.

That wasn't my intention in the examples I gave; in fact in the unambiguous model we tried to extract from the DWARF spec (https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html) we define the `DW_OP_reg` as pushing a register location description onto the stack, and `DW_OP_breg` as pushing a memory location description onto the stack. The interaction which makes the location description pushed onto the stack by `DW_OP_breg` behave like a value in some contexts (i.e. behave like the offset contents of the register) we begrudgingly capture with an implicit conversion to make our description backwards-compatible with DWARF 4 and 5, but *even if* you want to just define it as pushing a value onto the stack, at the very least the `breg` must represent reading the value of a `reg` register location. My question is then: why does adding `DW_OP_stack_value` implicitly cause the *value* of the register to end up on the stack, with no intervening operation? I.e. why is this the case:

```
    ; Sure, seems reasonable: `DW_OP_LLVM_arg` when referring to a register describes the register itself, not the value of the register.
    DBG_VALUE_LIST !12, !DIExpression(DW_OP_LLVM_arg, 0), $eax, debug-location !11
    ; CHECK: DW_AT_location    (DW_OP_reg0 RAX)

    ; This follows too: if you do want the value of the register, you can read it explicitly with e.g. `DW_OP_deref`. DWARF actually *requires* this be collapsed into something like `DW_OP_breg` or `DW_OP_regval_type`,
    ; as `DW_OP_reg RAX, DW_OP_deref` is not a valid location description of any kind. This is an artificial constraint of the standard, and in any consistent view of the spec the two forms would have to otherwise be equivalent.
    DBG_VALUE_LIST !13, !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_deref), $eax, debug-location !11
    ; CHECK: DW_AT_location    (DW_OP_breg0 RAX+0)

    ; Hmm, this doesn't seem right though: why are there now two indirections? Does `DW_OP_stack_value` imply one for some reason?
    DBG_VALUE_LIST !14, !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_deref, DW_OP_stack_value), $eax, debug-location !11
    ; CHECK: DW_AT_location    (DW_OP_breg0 RAX+0, DW_OP_deref, DW_OP_stack_value)

    ; If you actually want the singly-indirect output you *have* to omit the semantically consistent `DW_OP_deref`. I would have expected this to just be an invalid DIExpression:
    DBG_VALUE_LIST !15, !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_stack_value), $eax, debug-location !11
    ; CHECK: DW_AT_location    (DW_OP_breg0 RAX+0, DW_OP_stack_value)
```

Note that I removed the asserts in your patch which seem to artificially require `DW_OP_stack_value` for `DBG_VALUE_LIST`. I didn't understand the purpose of them before, but perhaps this issue is one reason for them to be present?

My fundamental argument is that this context-dependent interpretation of `DW_OP_LLVM_arg` is another source of confusion, just like `isIndirect`. I think this stems from the fact that DWARF as it is defined today is not general/composable enough to avoid this, but I don't think that should bleed into the internal representation used by LLVM: we can make a sensible choice up until we get into the DWARF backend, where certain expressions will have to be converted into a different form to be legal. Instead, what we have now is a situation where adding operations to the expression changes fundamentally how you are supposed to interpret the `DW_OP_LLVM_arg`. This is a pre-existing shortcoming, just like `isIndirect`, so saddling you with the burden of correcting it doesn't seem reasonable, but I think it is important to discuss. This is also important in the context of replacing `DBG_VALUE` entirely, as the `assert` will obviously need to go away.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83495/new/

https://reviews.llvm.org/D83495



More information about the llvm-commits mailing list