[PATCH] D49572: [docs] Clarify role of DIExpressions within debug intrinsics

Fri Jul 20 06:56:06 PDT 2018

bjope added inline comments.

================
Comment at: docs/LangRef.rst:4605
+A DIExpression attached to a ``llvm.dbg.addr`` or ``llvm.dbg.declare``
+intrinsic describes the concrete location of a source variable. A debugger must
+be able to modify the variable via this location. Consequently, this
----------------
Is it true that a debugger *must* be able to modify the variable for an `llvm.dbg.addr`? Any specific reason, or are we just trying to put limitations on the DIExpression in a `llvm.dbg.addr` intrinsic?

================
Comment at: docs/LangRef.rst:4613
+
+- If the value operand of the intrinsic is an implicit location, the
+  DIExpression is interpreted as if it contained ``DW_OP_stack_value``,
----------------
vsk wrote:
> rnk wrote:
> > What makes value operands to dbg.value implicit or concrete in LLVM IR? Are SSA values from local instructions concrete, and constants implicit? We could describe that here.
> Sure. The way I read it, it depends on the DIType of the described variable. The value operand is concrete iff it is a pointer to an instance of that DIType. So, the value operand in dbg.value(const-ptr-null, "int *p") is implicit, but concrete in dbg.value(const-ptr-null, "int").
> 
> At least, that's the only consistent explanation I've thought of. I don't know how the backend actually determines this. IIUC D49454/D49520 is an example of the backend getting this wrong: it treats a pointer to a std::deque as the implicit location of the std::deque.
My interpretation (with very little experience of `llvm.dbg.addr`) has been that `llvm.dbg.addr` is the IR version of an *indirect* DBG_VALUE, and `llvm.dbg.value` is the IR version of a *non-indirect* DBG_VALUE. At least that seems to be the difference in SelectionDAG.
Afaict the first argument in a `dbg.value`, together with the DIExpression, describes the value of the variable. The first argument in `dbg.addr`, together with the DIExpression, describes the address of the variable. And I think the first argument in `dbg.value` should be treated as a value, and the first argument in `dbg.addr` should be treated as an indirect pointer.
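
To make that reading concrete, here is a toy Python model of the two cases with an empty DIExpression. It is only a sketch of my interpretation, not LLVM code: `memory`, `read_via_dbg_value` and `read_via_dbg_addr` are made-up names, and `memory` is a plain dict standing in for target memory.

```python
# Toy model of the interpretation above; not LLVM code.
# `memory` is a hypothetical address -> value map standing in for target memory.

def read_via_dbg_value(operand, memory):
    """dbg.value with an empty expression: the operand is the variable's value."""
    return operand

def read_via_dbg_addr(operand, memory):
    """dbg.addr with an empty expression: the operand is the variable's address."""
    return memory[operand]

memory = {0x1000: 42}

# Same first argument, two different meanings.
value_seen = read_via_dbg_value(0x1000, memory)  # the operand itself
addr_seen = read_via_dbg_addr(0x1000, memory)    # what is stored at the operand
print(value_seen, addr_seen)
```

With the same first argument, the `dbg.value` reading reports the operand itself while the `dbg.addr` reading reports what is stored at that address.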

A DIExpression can be used in dbg.declare, dbg.addr, dbg.value, direct DBG_VALUE and indirect DBG_VALUE, and it can be both tricky and confusing to interpret. Depending on which intrinsic is used, or whether the DBG_VALUE is direct or indirect, the DIExpression could have an implied DW_OP_stack_value or DW_OP_deref at the end (or even at the front?).
Since this is hard to understand, improving the documentation is a really nice initiative!

One question is whether we need to be able to indicate that there is an indirect value operand in a `dbg.value`. Or isn't it enough that if you, for example, want to describe the value of a variable !Y as (X[0] + 5), you need to include a DW_OP_deref such as
```
  dbg.value(X, !Y, DIExpression(DW_OP_deref, DW_OP_constu, 5, DW_OP_plus))
```
The above will become a direct DBG_VALUE since `dbg.value` is used. The DW_OP_deref is needed since by default the first argument in `dbg.value` is treated as a value and not a pointer. The variable will be described using an "implicit location" (DWARF terminology). 
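
As a rough illustration of why the DW_OP_deref matters, here is a minimal Python sketch of a stack evaluator for just the three operations in the example. It is not how LLVM or a debugger implements this: `evaluate` and `memory` are hypothetical names, and I assume the `dbg.value` operand is the initial stack entry. The addition is spelled DW_OP_plus, which is the DWARF name for that operation.

```python
# Minimal stack evaluator for the three operations used in the example.
# `evaluate` and `memory` are illustrative names, not an LLVM or DWARF API.

def evaluate(operand, expr, memory):
    """Evaluate `expr` with the dbg.value operand as the initial stack entry."""
    stack = [operand]
    it = iter(expr)
    for op in it:
        if op == "DW_OP_deref":
            # Replace the address on top of the stack with the value stored there.
            stack.append(memory[stack.pop()])
        elif op == "DW_OP_constu":
            # Push the unsigned constant that follows the opcode.
            stack.append(next(it))
        elif op == "DW_OP_plus":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        else:
            raise ValueError(f"unsupported op: {op}")
    return stack[-1]

memory = {0x2000: 10}  # X points at 0x2000, and X[0] == 10
X = 0x2000

with_deref = evaluate(X, ["DW_OP_deref", "DW_OP_constu", 5, "DW_OP_plus"], memory)
without_deref = evaluate(X, ["DW_OP_constu", 5, "DW_OP_plus"], memory)
print(with_deref)     # X[0] + 5
print(without_deref)  # X + 5, arithmetic on the operand itself
```

Without the DW_OP_deref the expression computes X + 5, not X[0] + 5, which is why the deref has to be spelled out when the first argument is treated as a value.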

Are you even saying that depending on !Y it might be wrong to have the DW_OP_deref here?

Btw, I think it is confusing to use "concrete" as terminology for the value operand. Isn't the question whether the value operand is direct or indirect (that is, whether it is a value or a pointer)?

https://reviews.llvm.org/D49572