[llvm-dev] RFC: Introduce DW_OP_LLVM_memory to describe variables in memory with dbg.value

Thu Sep 7 09:46:23 PDT 2017

On Wed, Sep 6, 2017 at 5:37 PM Reid Kleckner <rnk at google.com> wrote:

> On Wed, Sep 6, 2017 at 5:01 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>> On Wed, Sep 6, 2017 at 2:01 PM Reid Kleckner <rnk at google.com> wrote:
>>
> On Wed, Sep 6, 2017 at 10:01 AM, David Blaikie <dblaikie at gmail.com> wrote:
>>>
>>>> I guess you described this already, but talking it through for
>>>> myself/maybe others will find this useful:
>>>>
>>>> So since we don't have DW_OP_regN for LLVM registers, we could sort of
>>>> assume the implicit first value on the stack is a pseudo-OP_regN of the
>>>> LLVM SSA register.
>>>>
>>>
>>> Yep, that's how we use DIExpressions in both IR and MIR: The LHS of the
>>> dbg.value and DBG_VALUE instructions are a register-like value that gets
>>> pushed onto the expression stack. The DWARF asmprinter does some expression
>>> folding to clean things up, but that's the model.
>>>
>>>
>>>> To support that, all existing uses would need no changes to match the
>>>> DWARF model of registers being implicitly direct values.
>>>>
>>>> Code that wanted to describe the register as containing the memory
>>>> address of the interesting thing would use DW_OP_stack_value to say "this
>>>> location description that is a register is really an address you should
>>>> follow to find the value, not a direct value itself"?
>>>>
>>>> But code that wanted to describe a variable as being 3 bytes ahead of a
>>>> pointer in an LLVM SSA register would only have "plus 3" in the expression
>>>> stack, since then it's no longer a direct value but is treated as a pointer
>>>> to the value. I guess this is where the ambiguity would come in - currently
>>>> how does "plus 3" get interpreted when seen in LLVM IR, I guess that's
>>>> meant to describe reg value + 3 as being the immediate value of the
>>>> variable? (so it's implicitly OP_stack_value? & OP_stack_value is added
>>>> somewhere in the DWARF backend?)
>>>>
>>>
>>> Our model today is inconsistent.
>>>
>>
>> Inconsistent between what and what? LLVM and DWARF? Yeah, I guess there's
>> some mismatch between the semantics, though I'm still having trouble
>> wrapping my head around it.
>>
>
> I mean LLVM's model is internally inconsistent. We have bugs like the RVO
> one that you filed (https://llvm.org/pr34513), where we forget if the
> debug value is an address or a value.
>

Feels to me like bugs rather than inconsistencies (I'd think of an
inconsistency as "we do X over here intentionally and Y over here
intentionally but they're in contradiction to one another")

>
> In LLVM IR today, the SSA value of the dbg.value *is* the interesting
>>> value, it is not the address, and we typically use empty DIExpressions. If
>>> the value is ultimately register allocated and the DIExpression is empty,
>>> we will emit a DW_OP_regN location expression. If the value is spilled, we
>>> usually don't need to append DW_OP_stack_value because the location is now
>>> a memory location, which can be described by DW_OP_[f]breg.
>>>
>>> Today, passes that want to add "plus 3" to a DIExpression go out of
>>> their way to add DW_OP_stack_value to the DIExpression because the backend
>>> won't do it for us, even though dbg.value normally describes the value, not
>>> an address.
>>>
>>> To explore the alternative DW_OP_stack_value model, here's how I'd go
>>> about it:
>>> 1. Replace llvm.dbg.value with new intrinsic, llvm.dbg.loc, to make the
>>> semantic change clear. It can express both an address or a value, depending
>>> on the DIExpression.
>>> 2. Auto-upgrade llvm.dbg.value to llvm.dbg.loc. Append DW_OP_stack_value
>>> to the DIExpression argument of the intrinsic.
>>> 3. Auto-upgrade llvm.dbg.declare to llvm.dbg.loc, leave the DIExpression
>>> alone. The LHS of llvm.dbg.declare is already the address of the variable.
>>> 4. Eliminate the second operand of DBG_VALUE MachineInstrs. Indirect
>>> DBG_VALUES are now expressed with a DIExpression that lacks
>>> DW_OP_stack_value at the end.
>>> 5. Teach our DWARF expression emitter to combine the new expressions as
>>> necessary. In particular, we can elide DW_OP_stack_value for DBG_VALUEs
>>> with physical register operands. They just use DW_OP_regN, which is
>>> implicitly a value location.
>>> 6. Teach all passes that spill virtual registers used by DBG_VALUE to
>>> remove DW_OP_stack_value from the DIExpression, or add DW_OP_deref as
>>> appropriate.
>>>
>>> This should be equivalent to DW_OP_LLVM_memory, and more inline with
>>> DWARF location expression semantics, but it has a large migration cost.
>>>
>>> ---
>>>
>>> I think part of the reason I wanted to move in the DW_OP_LLVM_memory
>>> direction is that I originally wanted to add a memory offset operand to it.
>>> Our actual use cases for complex DWARF expressions typically come from
>>> things like safestack, ASan, and blocks. What these all have in common is
>>> that they gather up a number of variables and move them off into a struct
>>> in heap memory. This is very similar to what happens when we spill a
>>> virtual register: instead of describing a register, we modify the
>>> expression to load the value from some FP register with an offset. I think
>>> the right representation for these transforms is basically a "chain of
>>> loads".
>>>
>>
>> Don't think I've got any mental model of what you mean by this phrase
>> ('chain of loads') - could you provide an example or the like?
>>
>
> Suppose you have a captured variable with __block shared storage, and then
> suppose you compile it with ASan and safestack, and then the safestack
> pointer is spilled. To compute the value, the debugger starts from a
> register, goes to an offset, and loads a pointer, repeating the process
> until it finds the value. As we proceed through codegen, we effectively
> build up the chain.
>
> 1. To implement __block in Clang, we use dbg.declare(%block_descriptor,
> DW_OP_deref, DW_OP_constu_plus $offset)
>

Guessing it's the other way (constu_plus(offset), deref)? But in any case...

> 2. Assuming the block descriptor lived in an alloca (which it doesn't, but
> assume it does for argument's sake), asan will move that alloca onto the
> heap to implement use-after-return detection. It will prepend "DW_OP_deref,
> DW_OP_constu, $offset" to the DIExpression.
> 3. If ASan put its value in an alloca and safestack wanted to move that
> alloca to the safe stack (again bear with me), it would do the same:
> prepend deref+offset.
> 4. Finally, spilling the safe stack pointer to the control stack would
> mean prepending deref+offset.
>
> This seems like a really common pattern. Right now this offsetting and
> loading has to be expressed as separate location expression opcodes. The
> DW_OP_deref opcode functions like a load sequencing operation that can only
> appear between two offsets, although an offset could be zero, in which case
> there would be no DW_OP_constu_plus opcode. I'm suggesting we move to a
> representation where the offset and the deref are one.
>

Seems like a size optimization more than anything? Which I could get behind
if there is data to support the optimization as being significantly
valuable (& ideally I'd probably favor the general representation first,
etc - but I understand if the special-cased representation has other side
benefits (like reducing the cost to add support, etc) it might be valuable,
but trades off with the "more special cases/non-DWARFian things in our
pseudo-DWARF IR")

> Think "semicolon as sequencing operator" vs. "semicolon as statement
> terminator". When we use DW_OP_LLVM_memory or DW_OP_deref, the variable
> must always live in memory, we're always doing address calculation. We
> should try to make our representation more closely match the set of things
> we actually want to do and support.
>
> That's kind of the gist of what I had in mind. I didn't think it was worth
> it, which is why I pared the proposal down to just "the opposite of
> DW_OP_stack_value".
>
>
>> I was imagining that DW_OP_LLVM_memory with an offset would be that load
>>> chain link.
>>>
>>> The idea behind this representation is that it should make it easy for
>>> spilling transforms to prepend a load chain onto the expression, rather
>>> than them having to individually discover if DW_OP_deref is needed, or call
>>> some common helper like DIExpression::prepend. It should always be valid to
>>> push on a new load with an offset.
>>>
>>
>> When would that not be valid today/without LLVM_memory? Sorry, again -
>> it's all a bit fuzzy in my head.
>>
>> There'd be some canonicalization opportunities, but not seeing the
>> correctness issues with being able to prepend onto the location list -
>>  seems like that might be true with LLVM_memory too... maybe?
>>
>
> The correctness issue with today's prepending of offsets and deref is that
> it's hard to know when to insert deref, because we don't know if an
> expression describes an address or a value. We have code like this in
> buildDbgValueForSpill:
>   // If the DBG_VALUE already was a memory location, add an extra
>   // DW_OP_deref. Otherwise just turning this from a register into a
>   // memory/indirect location is sufficient.
>   if (IsIndirect)
>     Expr = DIExpression::prepend(Expr, DIExpression::WithDeref);
>

Sure - this goes back to my adding the "indirect" flag to support C++
non-trivial pass by value in C++, before we had all the more general
expression support, etc. (hilariously, what was there before was even more
awesome: we encoded the type of the parameter in the DWARF as T& instead of
T... literally changing the signature of the function... )

Either LLVM_memory or stack_value approaches would remove the prepending
issues & I agree getting rid of the ad-hoc/separate 'indirect' flag is
good, just haggling over how best to do that.

> We modify DBG_VALUEs for spills in several other places in codegen and
> they don't all correctly insert DW_OP_deref. The load chain representation
> should make it easy to just modify the offset on the front or add a new
> load depending on what's being done.
>

But I /think/ a stack_value based approach would allow that too, right? I'm
probably missing something.

>
>
>> It also has the advantage that it will be easier to translate to CodeView
>>> than arbitrary DWARF expressions, which we are currently canonicalizing
>>> into a load chain and then attempting to emit.
>>>
>>
>> *nod* my worry is ending up with 3 different representations - DWARF,
>> CodeView, and the increasingly divergent IRDWARF (especially since it's
>> "sort of like DWARF" which makes the few divergences more costly/difficult).
>>
>>
>>> Does that make sense? I'm starting to feel like I should either pursue
>>> the more ambitious load chain design,
>>>
>>
>> What would that look like?
>>
>
> Just `DW_OP_LLVM_memory, 8, DW_OP_LLVM_memory, 20, ...` in DIExpression
> through IR. It's OK to insert more DWARF opcodes between the links, it's
> just non-canonical if they are pointer offsetting opcodes that could be
> folded into the memory opcode. The DWARF expression backend would fold it
> into the same location expressions we have today.
>
>
>> or consistently apply DW_OP_stack_value to llvm.dbg.loc (alternative
>>> names welcome).
>>>
>>
>> Would have to think some more - maybe there's a way to avoid the rename?
>> But yeah, don't have a problem with llvm.dbg.loc - as you say/implied, it'd
>> match the new semantics better.
>>
>
> I don't think so. =/ I think the rename is the only safe way to maintain
> bitcode compatibility.
>
>
>> But really, your original proposal's probably OK/close enough to go
>> ahead. I don't feel that strongly.
>>
>
> Makes sense. That's basically where I ended up, but now I'm reconsidering
> the dbg.loc+DW_OP_stack_value thing, to bring DIExpressions closer to DWARF.
>

*nod* I'm not sure either way,  maybe need to come back to a summary of
this again after going through these discussions.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170907/6dae4e2a/attachment.html>