[llvm-dev] DW_OP_implicit_pointer design/implementation in general

Tue Nov 19 09:41:25 PST 2019

> On Nov 18, 2019, at 8:33 AM, Jeremy Morse <jeremy.morse.llvm at gmail.com> wrote:
> 
> Hi llvm-dev@,
> 
> Switching focus to the LLVM implementation, the significant change is
> using dbg.value's first operand to refer to a DILocalVariable, rather
> than a Value. There's some impedance mismatch here, because all the
> documentation (for example in the DbgVariableIntrinsic class)
> expresses everything in terms of the variables location, whereas
> implicit pointers don't have a location as they represent an extra
> level of indirection. This is best demonstrated by the change to
> IntrinsicInst.cpp in this patch [0] -- calling getVariableLocation on
> any normal dbg.value will return the locations Value, but if it's an
> implicit pointer then you'll get the meaningless MetadataAsValue
> wrapper back instead. This isn't the variable location, might surprise
> existing handlers of dbg.values, and just seems a little off.
> 
> I can see why this route has been taken, but by putting a non-Value in
> dbg.value's, it really changes what dbg.values represent, a variable
> location in the IR. Is there any appetite out there for using a
> different intrinsic, something like 'dbg.loc.implicit', instead of
> using dbg.value? IMO it would be worthwhile to separate:
> * Debug intrinsics where their position in the IR is important, from
> * Debug intrinsics where both their position in the IR, _and_ a Value
> in the IR, are important.
> Of which (I think) implicit pointers are the former, and current [2]
> dbg.values are the latter. This would also avoid putting
> DW_OP_implicit_pointer into expressions in the IR, pre-isel at least.
> 

On that particular point, I would like to see is a generalization of dbg.value: Currently llvm.dbg.value binds an SSA value (including constants and undef) and a DIExpression to a DILocalVariable at a position in the instruction stream. That first SSA value argument is an implicit first element in the DIExpression.

A more general form would be a more printf-like signature:

llvm.dbg.value(DILocalVariable, DIExpression, ...)

for example

llvm.dbg.value_new(DILocalVariable("x"), DIExpression(DW_OP_LLVM_arg0), %x)
llvm.dbg.value_new(DILocalVariable("y"), DIExpression(DW_OP_LLVM_arg0, DW_OP_LLVM_arg1, DW_OP_plus),
                   %ptr, %ofs)
llvm.dbg.value_new(DILocalVariable("z"), DIExpression(DW_OP_implicit_pointer, DW_OP_LLVM_arg0, 32),
                   DILocalVariable("base"))
llvm.dbg.value_new(DILocalVariable("c"), DIExpression(DW_OP_constu, 1))

The mandatory arguments would be the variable and the expression, and an arbitrary number of SSA values and potentially other variables.

As far as DW_OP_LLVM_implicit_pointer in particular is concerned, we could also treat the peculiarities of DW_OP_implicit_pointer as a DWARF implementation detail, introduce DW_OP_LLVM_implicit_pointer which transforms the top-of-stack into an implicit pointer (similar to DW_OP_stack_value) and have the DWARF backend insert an artificial variable on the fly.

LLVM IR:

llvm.dbg.value(%base, DILocalVariable("z"), DIExpression(DW_OP_LLVM_implicit_pointer))

AsmPrinter would expand this into two DW_TAG_variable tags with one location (list) entry each.

-- adrian

> There's also Vedants suggestion [1] for linking implicit pointer
> locations with the dbg.values of the underlying DILocalVariable. I
> suspect the presence of control flow might make it difficult (there's
> no dbg.phi instruction), but I like the idea of having more explicit
> links in the IR, it would be much clearer to interpret what's going
> on.
> 
> [0] https://reviews.llvm.org/D69999?id=229790
> [1] https://reviews.llvm.org/D69886#1736182
> [2] Technically dbg.value(undef,...) is the former too, I guess.
> 
> --
> Thanks,
> Jeremy