<html><head><meta http-equiv="Content-Type" content="text/html charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Sep 7, 2017, at 10:46 AM, Reid Kleckner <<a href="mailto:rnk@google.com" class="">rnk@google.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">I chatted with Chandler in person and came up with what I think is a much simpler design than my previous design in the thread titled "RFC: Introduce DW_OP_LLVM_memory to describe variables in memory with dbg.value".<div class=""><br class=""></div><div class="">The crux of the problem in that thread is that we need a representation, particularly in the middle-end, to describe a variables address, at a particular program point.</div><div class=""><br class=""></div><div class="">The reason we can't just use dbg.declare (at least as specified today) is that it doesn't have this "particular program point" property:</div><div class="">1. Transforms assume that allocas will have at most one dbg.declare use.</div><div class="">2. dbg.declares of static allocas are used to fill in a variable frame index side table. They are not translated into DBG_VALUE machine instructions that have real program points, so we can't mix them with dbg.value. A value cannot be `i32 42` at one point, and then be back in memory later.</div><div class="">3. The name suggests that it's a declaration, and we probably shouldn't try to re-declare a variable twice.</div></div></div></blockquote><div><br class=""></div>All agreed.</div><div><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><br class=""></div><div class="">The backend actually already has this in MachineInstr::isIndirectDebugValue(). If that returns true, it means that the result of evaluating the DIExpression on the MachineOperand will be the address of the local variable. The middle-end lacks this ability.</div><div class=""><br class=""></div><div class="">Rather than futzing around with the DIExpression on dbg.value to encode this bit of information (address or value), we can encode it in the intrinsic name.</div><div class=""><br class=""></div><div class="">This should handle both of the examples from the original RFC (<a href="http://lists.llvm.org/pipermail/llvm-dev/2017-September/117141.html" class="">http://lists.llvm.org/pipermail/llvm-dev/2017-September/117141.html</a>).</div><div class=""><br class=""></div><div class="">For this case:</div><div class=""><div class=""> int x = 42; // Can DSE</div><div class=""> dostuff(x); // Can propagate 42</div><div class=""> x = computation(); // Post-dominates `x = 42` store</div><div class=""> escape(&x);</div></div><div class=""><br class=""></div><div class="">After DSE, we should get:</div><div class=""> int x; // dbg.value(!"x", 42)</div><div class=""> dostuff(42);</div><div class=""> x = computation();</div><div class=""> // dbg.addr(!"x", &x)</div><div class=""> escape(&x);</div><div class=""><br class=""></div><div class="">The new capability illustrated here is that dbg.addr can move and when it moves its position matters and affects the final location list.</div><div class=""><br class=""></div><div class="">Another example:</div><div class=""> int x = 1; // Clang -O0 emits dbg.addr(!"x", %x.addr)</div><div class=""> int y = 1;</div><div class=""> escape(&x, &y);</div><div class=""> x = 2; // DSE will delete, inserting dbg.value(!"x", i32 2)</div><div class=""> y = 2; // Cannot delete, user may break here to observe x</div><div class=""> x = 3; // DSE will insert dbg.addr(!"x", %x.addr) before the killing store</div><div class=""> escape(&x, &y);<br class=""></div><div class=""><br class=""></div><div class="">This shows that there may be multiple llvm.dbg.addr intrinsic calls.</div><div class=""><br class=""></div><div class="">One complication is that we *don't* want to allow dbg.addr calls to *move* the location of a variable from one memory location to another. So, for the same concrete local variable, all dbg.addr calls should agree on the memory location. </div></div></div></blockquote><div><br class=""></div><div>Should this be enforced by the verifier?</div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="">This simplifies mem2reg SSA promotion, so that it doesn't have to care about the program ordering of dbg.value and dbg.addr calls. If we allowed this, mem2reg would have to consider inputs that look like this:</div><div class=""> %x.addr = alloca i32</div><div class=""> %y.addr = alloca i32<br class=""></div><div class=""> call void llvm.dbg.addr(i32* %x.addr, !"x")<br class=""></div><div class=""> store i32 0, i32* %x.addr</div><div class=""> %v = load i32, i32* %x.addr</div><div class=""> store i32 %v, i32* %y.addr ; copy the variable</div><div class=""> call void llvm.dbg.addr(i32* %y.addr, !"x") ; x now lives in y</div><div class=""> store i32 42, i32* %x.addr ; mem2reg would translate this to dbg.value(i32 42, !"x"), which is incorrect</div><div class=""><br class=""></div><div class="">mem2reg should only have to worry about the program order of alloca loads and stores to form SSA. It shouldn't also have to reason about concrete variables.</div></div></div></blockquote><div><br class=""></div>Is the restriction precisely that all dbg.addr for the same variable/!dbg combination must describe the same alloca?<br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><br class=""></div><div class="">----</div><div class=""><br class=""></div><div class="">In the short term, we can add this intrinsic, make instcombine use it, and lower it to indirect DBG_VALUE machine instrs. That should be a bug spotfix.</div><div class=""><br class=""></div><div class="">Once we do that, we can make more optimization passes dbg.addr aware. Obviously, mem2reg needs to handle it in mostly the same way that it handles dbg.declare today. It just needs to remove possibly multiple dbg.addrs while today it removes only one dbg.declare.<br class=""></div><div class=""><br class=""></div><div class="">Then, we can add a flag to transition clang from dbg.declare to dbg.addr. Now -O0 code will emit indirect DBG_VALUE instructions. We'll have to fix the backend to cope better with these, so that we emit location lists that are equivalent to what we get today from the frame index side table.</div></div>
</div></blockquote><br class=""></div><div>I think I like this proposal for all the reasons that were discussed in the previous thread.</div><div><br class=""></div><div>A couple of questions for my understanding:</div><div>Your examples didn't show it, but I assume that dbg.addr will still allow a DIExpression (for things similarly complicated as block captures)?</div><div>What exactly is allowed as the first parameter of a dbg.addr? Only allocas? Anything else?</div><div>Should an sret parameter or similar be described by a dbg.addr?</div><br class=""><div class="">thanks!</div><div class="">adrian</div></body></html>