[llvm-dev] RFC: Unify debug and optimized variable locations with llvm.dbg.addr [was: DW_OP_LLVM_memory]

Fri Sep 8 09:44:12 PDT 2017

On Thu, Sep 7, 2017 at 10:46 AM Reid Kleckner <rnk at google.com> wrote:

> I chatted with Chandler in person and came up with what I think is a much
> simpler design than my previous design in the thread titled "RFC: Introduce
> DW_OP_LLVM_memory to describe variables in memory with dbg.value".
>
> The crux of the problem in that thread is that we need a representation,
> particularly in the middle-end, to describe a variables address, at a
> particular program point.
>
> The reason we can't just use dbg.declare (at least as specified today) is
> that it doesn't have this "particular program point" property:
> 1. Transforms assume that allocas will have at most one dbg.declare use.
> 2. dbg.declares of static allocas are used to fill in a variable frame
> index side table. They are not translated into DBG_VALUE machine
> instructions that have real program points, so we can't mix them with
> dbg.value. A value cannot be `i32 42` at one point, and then be back in
> memory later.
> 3. The name suggests that it's a declaration, and we probably shouldn't
> try to re-declare a variable twice.
>
> The backend actually already has this in
> MachineInstr::isIndirectDebugValue(). If that returns true, it means that
> the result of evaluating the DIExpression on the MachineOperand will be the
> address of the local variable. The middle-end lacks this ability.
>
> Rather than futzing around with the DIExpression on dbg.value to encode
> this bit of information (address or value), we can encode it in the
> intrinsic name.
>
> This should handle both of the examples from the original RFC (
> http://lists.llvm.org/pipermail/llvm-dev/2017-September/117141.html).
>
> For this case:
>   int x = 42;  // Can DSE
>   dostuff(x); // Can propagate 42
>   x = computation();  // Post-dominates `x = 42` store
>   escape(&x);
>
> After DSE, we should get:
>   int x; // dbg.value(!"x", 42)
>   dostuff(42);
>   x = computation();
>   // dbg.addr(!"x", &x)
>   escape(&x);
>
> The new capability illustrated here is that dbg.addr can move and when it
> moves its position matters and affects the final location list.
>
> Another example:
>   int x = 1; // Clang -O0 emits dbg.addr(!"x", %x.addr)
>   int y = 1;
>   escape(&x, &y);
>   x = 2; // DSE will delete, inserting dbg.value(!"x", i32 2)
>   y = 2; // Cannot delete, user may break here to observe x
>   x = 3; // DSE will insert dbg.addr(!"x", %x.addr) before the killing
> store
>   escape(&x, &y);
>
> This shows that there may be multiple llvm.dbg.addr intrinsic calls.
>
> One complication is that we *don't* want to allow dbg.addr calls to *move*
> the location of a variable from one memory location to another.
>

That seems pretty restrictive...

> So, for the same concrete local variable, all dbg.addr calls should agree
> on the memory location. This simplifies mem2reg SSA promotion, so that it
> doesn't have to care about the program ordering of dbg.value and dbg.addr
> calls.
>

I feel like maybe that's worth supporting - but I guess it can be done
later if it becomes useful/isn't needed right now?

> If we allowed this, mem2reg would have to consider inputs that look like
> this:
>   %x.addr = alloca i32
>   %y.addr = alloca i32
>   call void llvm.dbg.addr(i32* %x.addr, !"x")
>   store i32 0, i32* %x.addr
>   %v = load i32, i32* %x.addr
>   store i32 %v, i32* %y.addr ; copy the variable
>   call void llvm.dbg.addr(i32* %y.addr, !"x")  ; x now lives in y
>   store i32 42, i32* %x.addr ; mem2reg would translate this to
> dbg.value(i32 42, !"x"), which is incorrect
>

> mem2reg should only have to worry about the program order of alloca loads
> and stores to form SSA. It shouldn't also have to reason about concrete
> variables.
>
> ----
>
> In the short term, we can add this intrinsic, make instcombine use it, and
> lower it to indirect DBG_VALUE machine instrs. That should be a bug spotfix.
>
> Once we do that, we can make more optimization passes dbg.addr aware.
> Obviously, mem2reg needs to handle it in mostly the same way that it
> handles dbg.declare today. It just needs to remove possibly multiple
> dbg.addrs while today it removes only one dbg.declare.
>
> Then, we can add a flag to transition clang from dbg.declare to dbg.addr.
> Now -O0 code will emit indirect DBG_VALUE instructions. We'll have to fix
> the backend to cope better with these, so that we emit location lists that
> are equivalent to what we get today from the frame index side table.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170908/7ed1614d/attachment.html>