[PATCH] D49572: [docs] Clarify role of DIExpressions within debug intrinsics

Mon Jul 23 16:22:03 PDT 2018

vsk added inline comments.

================
Comment at: docs/LangRef.rst:4605
+A DIExpression attached to a ``llvm.dbg.addr`` or ``llvm.dbg.declare``
+intrinsic describes the concrete location of a source variable. A debugger must
+be able to modify the variable via this location. Consequently, this
----------------
bjope wrote:
> Is it true that a debugger *must* be able to modify the variable for an `llvm.dbg.addr`? Any specific reason, or are we just trying to put limitations on the DIExpression in a `llvm.dbg.addr` intrinsic?
> 
> 
No, I'll walk this back. It's valid to describe a read-only memory location. After thinking about it some more, I don't think there's really an issue with DW_OP_stack_value inside of a llvm.dbg.addr either.

================
Comment at: docs/LangRef.rst:4613
+
+- If the value operand of the intrinsic is an implicit location, the
+  DIExpression is interpreted as if it contained ``DW_OP_stack_value``,
----------------
bjope wrote:
> vsk wrote:
> > rnk wrote:
> > > What makes value operands to dbg.value implicit or concrete in LLVM IR? Are SSA values from local instructions concrete, and constants implicit? We could describe that here.
> > Sure. The way I read it, it depends on the DIType of the described variable. The value operand is concrete iff it is a pointer to an instance of that DIType. So, the value operand in dbg.value(const-ptr-null, "int *p") is implicit, but concrete in dbg.value(const-ptr-null, "int").
> > 
> > At least, that's the only consistent explanation I've thought of. I don't know how the backend actually determines this. IIUC D49454/D49520 is an example of the backend getting this wrong: it treats a pointer to a std::deque as the implicit location of the std::deque.
> My interpretation (with very little experience of `llvm.dbg.addr`) has been that `llvm.dbg.addr` is the IR version of an *indirect* DBG_VALUE, and `llvm.dbg.value` is the IR version of a *non-indirect* DBG_VALUE. At least that seems to be the difference in SelectionDAG.
> Afaict the first argument in a `dbg.value`, together with the DIExpression, describes the value of the variable. The first argument in `dbg.addr`, together with the DIExpression, describes the address of the variable. And I think the first argument in `dbg.value` should be treated as a value, and the first argument in `dbg.addr` should be treated as an indirect pointer.
> 
> A DIExpression might be used in any of dbg.declare, dbg.addr, dbg.value, direct DBG_VALUE and indirect DBG_VALUE, and it could be both tricky and confusing how to interpret the DIExpression. Depending on which intrinsic is used, or whether the DBG_VALUE is direct or indirect, the DIExpression could have an implied DW_OP_stack_value or DW_OP_deref at the end (or even at the front?).
> As it might be hard to understand this, improving the documentation is a really nice initiative!
> 
> One question is whether we need to be able to indicate that there is an indirect value operand in a `dbg.value`. Or isn't it enough that if, for example, you want to describe the value of a variable !Y as (X[0] + 5), you include a DW_OP_deref, such as
> ```
>   dbg.value(X, !Y, DIExpression(DW_OP_deref, DW_OP_constu, 5, DW_OP_plus))
> ```
> The above will become a direct DBG_VALUE since `dbg.value` is used. The DW_OP_deref is needed since by default the first argument in `dbg.value` is treated as a value and not a pointer. The variable will be described using an "implicit location" (DWARF terminology). 
> 
> Are you even saying that depending on !Y it might be wrong to have the DW_OP_deref here?
> 
> Btw, I think it is confusing to use "concrete" as terminology for the value operand. Isn't the question whether the value operand is direct or indirect (whether it is a value or a pointer)?
> 
My first response to @rnk here was incorrect: implicit vs. concrete is not the same distinction as direct vs. indirect. The latter is the relevant distinction and it has nothing to do with DIType.

I consider @bjope's description here to be the "common sense" one we all *thought* was correct: interpreting a dbg.value should give a direct value, and interpreting a dbg.{addr,declare} should give an indirect value. I'll update this patch to make those definitions precise.

Basically, there should be exactly one way to interpret a DIExpression, without any implicit DW_OP_stack_value or DW_OP_deref added depending on which intrinsic is used or what type of location you have. Once we land the fix in D49454 I think we'll either *actually* have that model or be really close. Right now there is some special magic with non-empty DIExpressions, but I hope to eliminate that.
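
To illustrate the direct vs. indirect reading, here is a toy model in Python. It is only a sketch, not how LLVM or any debugger implements this: `eval_expr`, `describe`, the `memory` dict and the address 0x1000 are all hypothetical names and values made up for illustration, and only three DWARF operators are handled (DW_OP_plus is the DWARF addition operator). The one point it tries to show is that the expression is evaluated the same way in both cases, and only the intrinsic decides whether the result is read as the variable's value or as its address.

```python
# Toy model only: every name here is hypothetical, none of this is LLVM code.

def eval_expr(operand, ops, memory):
    """Evaluate a tiny subset of DWARF ops on a stack seeded with `operand`."""
    stack = [operand]
    i = 0
    while i < len(ops):
        op = ops[i]
        if op == "DW_OP_deref":
            # Pop an address and push the value stored at it.
            stack.append(memory[stack.pop()])
        elif op == "DW_OP_constu":
            # The next element is the unsigned constant to push.
            i += 1
            stack.append(ops[i])
        elif op == "DW_OP_plus":
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs)
        else:
            raise ValueError(f"unsupported op: {op}")
        i += 1
    return stack[-1]


def describe(intrinsic, operand, ops, memory):
    """Same evaluation for every intrinsic; only the reading of the result differs."""
    result = eval_expr(operand, ops, memory)
    if intrinsic == "dbg.value":
        return ("value", result)      # direct: the result is the variable's value
    if intrinsic in ("dbg.addr", "dbg.declare"):
        return ("address", result)    # indirect: the result is the variable's address
    raise ValueError(f"unknown intrinsic: {intrinsic}")


memory = {0x1000: 37}   # assumed: X points at 0x1000 and X[0] == 37
X = 0x1000

# Value of !Y is X[0] + 5, so the deref must be spelled out in the expression.
print(describe("dbg.value", X, ["DW_OP_deref", "DW_OP_constu", 5, "DW_OP_plus"], memory))
# -> ('value', 42)

# With an empty expression, dbg.addr says the variable lives at address X.
print(describe("dbg.addr", X, [], memory))
# -> ('address', 4096)
```

Note that in this sketch nothing is appended to or prepended to the expression based on context, which is the property argued for above.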

https://reviews.llvm.org/D49572