[llvm-dev] RFC: Introduce DW_OP_LLVM_memory to describe variables in memory with dbg.value

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Wed Sep 6 11:00:32 PDT 2017


On Wed, Sep 6, 2017 at 10:50 AM Robinson, Paul <paul.robinson at sony.com>
wrote:

> It's worth remembering that there are two syntactically similar but
> semantically different kinds of "expression" in DWARF.
>
> A DWARF expression computes a value; if the available value is a pointer,
> you add DW_OP_deref to express the pointed-to value.  A DWARF location
> expression computes a location, and adds various operators to express
> locations that a (value) expression cannot, such as DW_OP_regx.  You also
> have DW_OP_stack_value to say "just kidding, this location expression is a
> value expression."
>
> So, whether we want to start throwing around deref or stack_value or regx
> (implicit or explicit) really depends on whether we are going to be using
> value expressions or location expressions.  Let's not start mixing them up,
> it will just make the discussion more confusing.
>

Where do non-location DWARF expressions appear?


> --paulr
>
>
>
> *From:* llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] *On Behalf Of *David
> Blaikie via llvm-dev
> *Sent:* Wednesday, September 06, 2017 10:02 AM
> *To:* Reid Kleckner; llvm-dev
>
>
> *Subject:* Re: [llvm-dev] RFC: Introduce DW_OP_LLVM_memory to describe
> variables in memory with dbg.value
>
>
>
> On Tue, Sep 5, 2017 at 1:00 PM Reid Kleckner via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Debug info today handles two cases reasonably well:
>
> 1. At -O0, dbg.declare does a good job describing variables that live at
> some known stack offset
>
> 2. With optimizations, variables promoted to SSA can be described with
> dbg.value
>
>
>
> This leaves behind a large hole in our optimized debug info: variables
> that cannot be promoted, typically because they are address-taken. This is
> https://llvm.org/pr34136, and this RFC is mostly about addressing that.
>
>
>
> The status today is that instcombine removes all dbg.declares and
> heuristically inserts dbg.values where it can identify the value of the
> variable in question. This prevents us from having misleading debug info,
> but it throws away information about the variable’s location in memory.
>
>
>
> Part of the reason that instcombine discards dbg.declares is that we can’t
> mix and match dbg.value with dbg.declare. If the backend sees a
> dbg.declare, it accepts that information as more reliable and discards all
> DBG_VALUE instructions associated with that variable. So, we need something
> we can mix. We need a way to say, the variable lives in memory *at this
> program point*, and it might live somewhere else later on. I propose that
> we introduce DW_OP_LLVM_memory for this purpose, and then we transition
> from dbg.declare to dbg.value+DW_OP_LLVM_memory.
>
>
>
> Initially I believed that DW_OP_deref was the way to say this with
> existing DWARF expression opcodes, but I implemented that in
> https://reviews.llvm.org/D37311 and learned more about how DWARF
> expressions work. When a debugger begins evaluating a DWARF expression, it
> assumes that the resulting value will be a pointer to the variable in
> memory. For a debugger, this makes sense, because debug builds put things
> in memory and even after optimization many variables must be spilled. Only
> the special DW_OP_regN and DW_OP_stack_value expression opcodes change the
> location of the value from memory to register or stack value.
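
A minimal Python sketch of the evaluation rule described above, for anyone who wants to poke at it. It is a toy model, not a real DWARF consumer: the tuple encoding is invented here, DW_OP_bregN and DW_OP_regN are collapsed into ("DW_OP_breg", n, offset) and ("DW_OP_reg", n), and the register and memory contents are made up.

```python
# Toy model of how a debugger interprets a DWARF location expression.
# Only a few real DWARF opcodes are modelled; the encoding is hypothetical.

def evaluate(expr, regs, memory):
    """Return (kind, payload); kind is 'memory', 'register', or 'value'."""
    stack = []
    kind = "memory"  # default: the top of stack is the variable's address
    for op, *args in expr:
        if op == "DW_OP_breg":            # push register contents + offset
            reg, offset = args
            stack.append(regs[reg] + offset)
        elif op == "DW_OP_constu":        # push an unsigned constant
            stack.append(args[0])
        elif op == "DW_OP_plus_uconst":   # add a constant to the top of stack
            stack.append(stack.pop() + args[0])
        elif op == "DW_OP_deref":         # replace an address by its contents
            stack.append(memory[stack.pop()])
        elif op == "DW_OP_reg":           # the variable lives in a register
            return ("register", args[0])
        elif op == "DW_OP_stack_value":   # top of stack is the value itself
            kind = "value"
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return (kind, stack.pop())


regs = {6: 0x2000}        # made-up register file
memory = {0x2008: 7}      # made-up memory contents

# No special opcode: the result is an address the debugger will read from.
print(evaluate([("DW_OP_breg", 6, 8)], regs, memory))    # ('memory', 8200)
# DW_OP_stack_value: the result is the variable's value, not its address.
print(evaluate([("DW_OP_constu", 42), ("DW_OP_stack_value",)], regs, memory))  # ('value', 42)
```

The point of the model is that "memory" is what you get when nothing says otherwise, which is why a dbg.value on an SSA value does not map onto it directly.
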
>
>
>
> LLVM SSA values obviously do not have an address that we can take and they
> don’t live in registers, so neither the default memory location model nor
> DW_OP_regN make sense for LLVM’s dbg.value. We could hypothetically
> repurpose DW_OP_stack_value to indicate that the SSA value passed to
> llvm.dbg.value *is* the variable’s value, and if the expression lacks
> DW_OP_stack_value, it must be the address of the value. However, that is
> backwards incompatible and it seems like quite a stretch.
>
>
>
> Seems like a stretch in what sense? The backwards incompatibility is
> certainly something to consider (though we went through that with
> DW_OP_bit_piece too), but this seems like the design I'd go to first so I'd
> like to better understand why it's not the path forward if there's some
> more detail about that aspect of the design choice here.
>
> I guess you described this already, but talking it through for
> myself/maybe others will find this useful:
>
> So since we don't have DW_OP_regN for LLVM registers, we could sort of
> assume the implicit first value on the stack is a pseudo-OP_regN of the
> LLVM SSA register.
>
> To support that, all existing uses would need no changes to match the
> DWARF model of registers being implicitly direct values.
>
> Code that wanted to describe the register as containing the memory address
> of the interesting thing would use DW_OP_stack_value to say "this location
> description that is a register is really an address you should follow to
> find the value, not a direct value itself"?
>
> But code that wanted to describe a variable as being 3 bytes ahead of a
> pointer in an LLVM SSA register would only have "plus 3" in the expression
> stack, since then it's no longer a direct value but is treated as a pointer
> to the value. I guess this is where the ambiguity would come in: how does
> "plus 3" currently get interpreted when seen in LLVM IR? I guess it's
> meant to describe reg value + 3 as being the immediate value of the
> variable (so it's implicitly OP_stack_value, and OP_stack_value is added
> somewhere in the DWARF backend)?
>
> Thanks,
> - Dave
>
>
>
>
>
> DW_OP_LLVM_memory would be very similar to DW_OP_stack_value, though. It
> would only be valid at the end of a DIExpression. The backend will always
> remove it because the debugger will assume the variable lives in memory
> unless it is told otherwise.
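
As a rough sketch of the two rules in that paragraph (the opcode is only valid at the end, and the backend drops it), assuming a DIExpression is modelled as a flat Python list of opcode names and operands. The helper name strip_llvm_memory is hypothetical and is not LLVM code; DW_OP_LLVM_memory is the opcode proposed in this thread.

```python
# Hypothetical helper illustrating the proposed DW_OP_LLVM_memory rules.
LLVM_MEMORY = "DW_OP_LLVM_memory"


def strip_llvm_memory(ops):
    """Return (dwarf_ops, in_memory) for a DIExpression given as a list."""
    if LLVM_MEMORY in ops[:-1]:
        # The proposal only allows the opcode as the final element.
        raise ValueError(f"{LLVM_MEMORY} is only valid at the end")
    if ops and ops[-1] == LLVM_MEMORY:
        # Memory is the debugger's default, so nothing is emitted in its place.
        return ops[:-1], True
    return list(ops), False


print(strip_llvm_memory(["DW_OP_plus_uconst", 3, LLVM_MEMORY]))
# (['DW_OP_plus_uconst', 3], True)
```

What the backend emits when the opcode is absent (for example DW_OP_stack_value or a register location) is deliberately left out of the sketch, since the paragraph above does not spell it out.
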
>
>
>
> For the original problem of improving optimized debug info while avoiding
> inaccurate information in the presence of dead store elimination, consider
> this C example:
>
>   int x = 42;  // Can DSE
>
>   dostuff(x); // Can propagate 42
>
>   x = computation();  // Post-dominates `x = 42` store
>
>   escape(&x);
>
>
>
> We should be able to do this:
>
>   int x; // eliminate `x = 42` store
>
>   dbg.value(!x, 42, !DIExpression()) // mark x as the constant 42 in debug info
>
>   dostuff(42); // propagate 42
>
>   dbg.value(!x, &x, !DIExpression(DW_OP_LLVM_memory)) // x is in memory again
>
>   x = computation();
>
>   escape(&x);
>
>
>
> Passes that delete stores would be responsible for checking if the store
> destination is part of an alloca with associated dbg.value instructions.
> They would emit a new dbg.value instruction for that variable with the
> stored value, and clone the dbg.value instruction that puts the variable
> back in memory before the killing store. If the store is dead because
> variable lifetime is ending, the second dbg.value is unnecessary.
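
The store-deletion rule can be sketched on a toy instruction list. Everything here is invented for illustration: instructions are plain tuples, delete_dead_store is a hypothetical name, and the caller is assumed to have already checked that the store targets an alloca with associated dbg.value instructions.

```python
# Toy model of the rule for passes that delete stores; this is not LLVM's API.
#   ("store", slot, value)            store value to an alloca slot
#   ("dbg.value", var, operand, ops)  debug intrinsic with a DIExpression


def delete_dead_store(insts, dead_idx, killing_idx, var):
    """Delete insts[dead_idx] while keeping the debug info for `var` accurate.

    killing_idx is the index of the later store that overwrites the slot,
    or None if the store is dead because the variable's lifetime is ending.
    """
    _, slot, value = insts[dead_idx]
    out = []
    for i, inst in enumerate(insts):
        if i == dead_idx:
            # The variable now has the stored value, but not in memory.
            out.append(("dbg.value", var, value, []))
        elif i == killing_idx:
            # The variable is back in memory just before the killing store.
            out.append(("dbg.value", var, slot, ["DW_OP_LLVM_memory"]))
            out.append(inst)
        else:
            out.append(inst)
    return out


before = [
    ("store", "%x.addr", 42),           # x = 42 (dead)
    ("call", "dostuff", 42),
    ("store", "%x.addr", "%computed"),  # x = computation() (killing store)
    ("call", "escape", "%x.addr"),
]
for inst in delete_dead_store(before, dead_idx=0, killing_idx=2, var="x"):
    print(inst)
```

Passing killing_idx=None models the lifetime-ending case, where only the first dbg.value is emitted.
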
>
>
>
> This will also allow us to fix debug info for px in this example:
>
>   void __attribute__((optnone, noinline)) usevar(int *x) {}
>
>   int main(int argc, char **argv) {
>
>     int x = 42;
>
>     int *px = &x;
>
>     usevar(&x);
>
>     if (argc) usevar(px);
>
>   }
>
>
>
> Today, we emit a location for px like `DW_OP_breg7 RSP+12`, which gives it
> the incorrect value 42. This is because our DBG_VALUE instruction for px’s
> location uses a frame index, which we assume is in memory. This is not the
> case: px is not in memory; its value is a stack object pointer.
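
To see why the memory interpretation yields 42 for px, here is a small Python sketch. The stack pointer value and memory contents are made up, and expressing the corrected description with DW_OP_stack_value is one possible encoding assumed for illustration, not something the proposal specifies.

```python
# Hypothetical stack layout for main(): x is stored at RSP+12.
RSP = 0x7FFC0000
memory = {RSP + 12: 42}   # x == 42


def read_as_memory_location(address):
    """Default DWARF rule: the expression result is the variable's address."""
    return memory[address]


def read_as_stack_value(address):
    """With DW_OP_stack_value: the expression result is the variable's value."""
    return address


result = RSP + 12         # what `DW_OP_breg7 RSP+12` computes

px_today = read_as_memory_location(result)  # 42: the debugger reads x instead
px_fixed = read_as_stack_value(result)      # the address of x, as intended

print(px_today, hex(px_fixed))
```
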
>
>
>
> Please reply if you have any thoughts on this proposal. Adrian and I
> hashed this out over Bugzilla, IRC, and in person, so it shouldn’t be too
> surprising. Let me know if you want to be CC’d on the patches.
>