[PATCH] D67492: [DebugInfo] Add a DW_OP_LLVM_entry_value operation

Wed Sep 25 09:44:52 PDT 2019

dstenb marked an inline comment as done.
dstenb added inline comments.

================
Comment at: llvm/test/Verifier/diexpression-valid-entry-value.ll:5
 ; CHECK-NOT: invalid expression
-!0 = !DIExpression(DW_OP_entry_value, 1)
+!0 = !DIExpression(DW_OP_LLVM_entry_value, 1)
----------------
dstenb wrote:
> djtodoro wrote:
> > djtodoro wrote:
> > > aprantl wrote:
> > > > aprantl wrote:
> > > > > dstenb wrote:
> > > > > > djtodoro wrote:
> > > > > > > I am just curious. Having the D67768, do we really still need the size of following expression at this point?
> > > > > > I guess that the size operand is currently redundant, and we can omit it. However, I have made the assumption that specifying the size will simplify things when we get to describing more complex variables like `local` in the example below using entry values:
> > > > > > 
> > > > > > ```
> > > > > > int foo(int param) {
> > > > > >   int local = param + 1;
> > > > > >   use_of_param = param;
> > > > > >   clobber();
> > > > > >   return 0;
> > > > > > }
> > > > > > ```
> > > > > > 
> > > > > > Here `local` will have the entry value of $edi + 1. GDB will not understand a `DW_OP_entry_value` whose block is that complex, so I imagine that we want to emit the same expression as GCC:
> > > > > > 
> > > > > > ```
> > > > > > (DW_OP_entry_value: (DW_OP_reg5 (rdi)); DW_OP_plus_uconst: 1; DW_OP_stack_value)
> > > > > > ```
> > > > > > 
> > > > > > I guess that the `DW_OP_LLVM_entry_value` operation could cover the whole expression in the DIExpression world, and we then wrap only the parts that actually needs to be covered by entry values when we lower that to `DW_OP_entry_value` in DwarfExpression. However, I'm a bit afraid that that would further complicate the DwarfExpression state machine, which is already quite complex.
> > > > > > 
> > > > > > Looking further onwards, I guess that we want to transform the current `dbg.value`/`DBG_VALUE`/`DIExpression` framework to something that allows us to describe a variable using multiple register / memory values. For example, like `local` in the example below:
> > > > > > 
> > > > > > ```
> > > > > > int bar(int a, int b) {
> > > > > >   int local = a + b;
> > > > > >   [...]
> > > > > > }
> > > > > > ```
> > > > > > 
> > > > > > I assume that we want to be able to describe such variables using entry values for only a subset of the registers. GCC is able to emit such expressions. I guess that having a `DW_OP_LLVM_entry_value` operation that only applies for a part of an `DIExpression` will be helpful when we get there, but that's pure speculation as I don't know how the debug framework will look.
> > > > > IMO, we should make the IR support as generic as possible and accept more complex uses of DW_OP_LLVM_entry_value that LLVM currently cannot produce itself and generate the correct DWARF for it.
> > > > > 
> > > > > In order to generate that particular example though, it might be useful to add another operand (let's call it DW_OP_LLVM_dbg_value) to make the implicit first operand explicit, so this could be
> > > > > 
> > > > > ```
> > > > > !DIExpression(DW_OP_LLVM_entry_value, 1, DW_OP_LLVM_dbg_value, DW_OP_plus, DW_OP_uconst, 1, DW_OP_stack_value).
> > > > > ```
> > > > > 
> > > > > and the backend substitutes `DW_OP_breg5` for `DW_OP_dbg_value`. In the future, we could further generalize llvm.dbg.value to take extra arguments that would show up as `DW_OP_LLVM_dbg_value_[0-9]+` to allow binding more than one SSA value to a DIExpression.
> > > > (and of course to confuse everybody I accidentally flipped the order of DW_OP_plus and DW_OP_uconst in my example...)
> > > >Here local will have the entry value of $edi + 1. GDB will not understand a DW_OP_entry_value whose block is that complex, so I imagine that we want to emit the same expression as GCC:
> > > Sure. Not only local variables, I am currently extending the `LiveDebugValues` to support such scenario when parameters are part of such expression and I will share that in the following days. Let's keep it for now.
> > > 
> > > Let me say something about extending the `llvm.dbg.value`, I already got similar idea to extend the `salvageDebugInfo()` functionality when we are removing an instruction operating on two registers (rather then reg and constant), so I think that having that type of instruction will give us a lot of benefits, not only in situations like that. :)
> > >I guess that the DW_OP_LLVM_entry_value operation could cover the whole expression in the DIExpression world, and we then wrap only the parts that actually needs to be covered by entry values when we lower that to DW_OP_entry_value in DwarfExpression. However, I'm a bit afraid that that would further complicate the DwarfExpression state machine, which is already quite complex.
> > 
> > Having this pre-calculation interface at this point for entry values expressions won't be desirable for situations where entry value  is just part of complex expression like you described, since that the `DwarfExpression` won't be able to distinguish what is entry value part and what is the rest of the complex expression (e.g. `(DW_OP_entry_value: (DW_OP_reg5 (rdi)); DW_OP_plus_uconst: 1; DW_OP_stack_value)` where the `DIExpression` would look as `DIExpression(DW_OP_LLVM_entry_value, 1, DW_OP_plus_uconst, 1, DW_OP_stack_value)`).
> > I think it would be ideally if we were to somehow create the right `size` operand from the entry value `DIExpression` when the entry value gets created during the `LiveDebugValues` (we agree that it is very hard, or even impossible), but we would know what part of a complex expression is the "entry value" part and what part is not. Therefore, in the current situation I don't think that the size operand (from entry value `DIExpresion`) has any impact on simplifying the support for the complex expressions like we described. In that situation we should just see if the `DIExpression` contains any additional `DWARF` operands and mark it as "non-entry value" part and somehow make another buffer for that until the entry value part gets printed via current interface.
> > won't be able to distinguish what is entry value part and what is the rest of the complex expression
> 
> I think that is possible with this semantics. In this patch the operand specifies that the value operand and the (N - 1) number of operations are covered by the entry value, so for N > 1 shouldn't we be able to finish the entry value after emitting those remaining (N - 1) operations in `DwarfExpression::addExpression()`?
> 
> In your example N is 1, so we should only emit the entry value around the register. I think we need to modify the current implementation of `DwarfExpression::addMachineRegExpression()` a bit so that it emits a `DW_OP_regx` rather than taking the code path where the expression is optimized into a `DW_OP_breg $reg, 1` [0]. We also need to make sure that the `DW_OP_stack_value` is emitted at the end of the whole expression, rather than after the entry value, which is currently done [1]. I think that with those changes it should be possible to emit your example using the semantics that this patch introduces, at the risk of overlooking some other issue.
> 
> I still haven't been able to think of an example where the DW_OP_LLVM_entry_value's size operand being larger than 1 would actually be useful though.
> 
> [0] https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp#L249
> [1] https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp#L258
Perhaps something along the lines of this: D68034. The location kind handling likely needs more thought though.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67492/new/

https://reviews.llvm.org/D67492