[llvm-dev] DW_OP_implicit_pointer design/implementation in general

Thu Jul 30 04:59:58 PDT 2020

Hi,

I've taken a look at the patches (thanks Alok) and will submit
comments in a bit.

David wrote:
> I haven't had a chance to page in all the old context, nor look at the
> new ones in detail yet. But probably worth keeping high level design
> review here, I think? Once the general direction seems good, we can go
> into the separate review threads for the implementation/mechanical
> details.

I think the latest patch series matches what came out of the
discussion above, as you described it:

> I would expect this to be handled with a general OP saying "hey, I'm
> skipping one level of indirection in the resulting value,
> because that indirection is missing/not in the final program" and that this
> would be encoded in a llvm.dbg.value/DIExpression as usual, without the
> need for new IR intrinsics, though possibly with the need for an LLVM
> extension DWARF OP (DW_OP_LLVM_explicit_pointer?)

That's what's been implemented: whenever an alloca is promoted,
variable locations that used the alloca's address are transformed into
promoted-value variable locations in the usual way, but with a
DW_OP_LLVM_explicit_pointer at the front of the expression to indicate
"the pointer is absent, but this is what it would have pointed at".
Simple case:

  %foo = alloca i32
  dbg.declare(%foo, !123, !DIExpression())
  dbg.value(%foo, !456, !DIExpression())
  store i32 0, i32 *%foo

Where !123 is a plain i32 source variable, and !456 a pointer-to-i32
source variable. When %foo is promoted, these would become:

  dbg.value(i32 0, !123, !DIExpression())
  dbg.value(i32 0, !456, !DIExpression(DW_OP_LLVM_explicit_pointer))

When it comes to the IR way of modelling these things, I think that
this matches the discussion, and is a lightweight way of representing
what's going on.
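
To make the rewrite concrete, here is a minimal C++ sketch of the rule as
I read it. DbgUse and rewriteOnPromotion are hypothetical names made up
for illustration; they are not LLVM classes and not code from the patch
series, and the expression is modelled as a plain list of strings:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a debug intrinsic that uses an alloca.
// Illustration only; this is not an LLVM class.
struct DbgUse {
    bool isDeclare;                 // true: dbg.declare, false: dbg.value
    std::string variable;           // metadata name, e.g. "!123"
    std::vector<std::string> expr;  // DIExpression operators, in order
};

// Model of what happens to one debug user when the alloca is promoted.
// A dbg.declare described the variable's storage, so it becomes a plain
// dbg.value of the promoted value. A dbg.value described a pointer to that
// storage, so the promoted value is what the pointer would have pointed at,
// and DW_OP_LLVM_explicit_pointer goes at the front of the expression.
DbgUse rewriteOnPromotion(const DbgUse &use) {
    assert(!use.variable.empty() && "a debug use must name a variable");
    DbgUse out = use;
    out.isDeclare = false;  // both forms end up as dbg.value
    if (!use.isDeclare)
        out.expr.insert(out.expr.begin(), "DW_OP_LLVM_explicit_pointer");
    return out;
}
```

Applied to the example above, the dbg.declare of !123 keeps an empty
expression, while the dbg.value of !456 gains the new operator.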

I have some reservations about what happens further down the compiler,
though: artificial variables get created at isel time, which seems early
to me, and duplicates the work for each instruction selector. Is there a
reason why it can't be done in the DWARF emitter? The artificial
variables are also tracked with additional DBG_VALUE instructions. If
we could push artificial variable creation back to emission time, then
we wouldn't have to answer questions such as "what is the lifetime of
a DBG_VALUE of an artificial variable?"

At promotion time: some of the handling of variable promotion appears
to happen within Instruction::eraseFromParent, which seems out of
place. I reckon you've missed the calls in PromoteMemoryToRegister.cpp
to the ConvertDebugDeclareToDebugValue helpers -- shifting the
promotion handling there would be better, and not dependent on the
order that things are erased in. I think those ConvertDebug... helper
functions and the two other functions you've instrumented in the same
file should be sufficient to catch all promotions.

Additionally, I believe that a promoted alloca gets
DW_OP_LLVM_explicit_pointer dbg.values generated for any pointer that
_ever_ points at it. You'll need to consider circumstances where
pointer variables have multiple values, e.g.:

  int foo, bar, baz;
  int *qux = &foo;
  qux = &bar;
  qux = &baz;
  foo = 1;
  bar = 2;
  baz = 3;

If I understood the code correctly, 'qux' will have implicit-pointer
values for each of the assignments to foo / bar / baz, where it should
only have a dbg.value for the assignment to 'baz'. (It might be
alright to limit handling to scenarios where a pointer variable only
ever has one value, and then expand what can be handled later).
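
One way to picture the behaviour I'd expect, as a rough sketch only: track
which local each pointer variable currently points at, and emit an
implicit-pointer value for a store only when the pointer's current pointee
is the local being stored to. Event, ImplicitValue and
implicitPointerValues are hypothetical names, this is not LLVM code, and
it assumes straight-line code with no control flow:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical event model for illustration only (not LLVM IR or LLVM API).
struct Event {
    enum Kind { PointTo, Store } kind;
    std::string pointer;  // pointer variable name; used by PointTo only
    std::string target;   // the local being pointed at or stored to
    int value;            // stored value; used by Store only
};

struct ImplicitValue {
    std::string pointer;  // pointer variable that gets the value
    std::string target;   // local it would have pointed at
    int value;            // value that local holds
};

// Walk the events in program order. Each store produces an implicit-pointer
// value only for pointer variables whose *current* pointee is the stored-to
// local, rather than for every pointer that ever pointed at it.
std::vector<ImplicitValue> implicitPointerValues(const std::vector<Event> &events) {
    std::map<std::string, std::string> pointee;  // pointer -> current pointee
    std::vector<ImplicitValue> out;
    for (const Event &e : events) {
        assert(!e.target.empty() && "every event must name a local");
        if (e.kind == Event::PointTo) {
            pointee[e.pointer] = e.target;  // overwrites any earlier pointee
            continue;
        }
        for (const auto &entry : pointee)
            if (entry.second == e.target)
                out.push_back({entry.first, e.target, e.value});
    }
    return out;
}
```

Feeding in the six statements above (three PointTo events for qux, then
three Store events) yields a single value: qux with pointee baz and
value 3.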

--
Thanks,
Jeremy