[PATCH] IR: Move the complex address expression field out of DIVariable and into an extra argument of the dbg.declare/dbg.value intrinsics.

Tue Aug 26 17:19:18 PDT 2014

>>! In D4919#11, @dblaikie wrote:
>>>! In D4919#10, @aprantl wrote:
>> If we also bump the metadata version number with this patch, the debug info will be stripped. Originally I thought that AsmParser would still throw an error because the intrinsics are missing an argument, but this is not the case, I just verified that. So this is good, that removes the autoupgrade requirement!
> 
> Excellent!
> 
>> My rationale for DW_TAG_user_lo is that this way we can guarantee that all DIDescriptor tags (except line table entries) are within the range [DW_TAG_array_type | LLVM_DebugVersion .. DW_TAG_user_hi|LLVM_DebugVersion].
> 
> In what way is this beneficial? I suppose it helps us reserve different namespaces for different metadata annotations.

Exactly!

> Are we annotating any other metadata? How are we identifying that metadata? If we are, I assume we'd want an overall scheme, but I doubt we are.

Currently we aren't (in fact, TBAA is the *only* other kind of metadata that I'm aware of), but I think it would be nice if we could in the future.
The -blocks testcases are a nice example of just how useful having a human-readable comment can be.

If you feel that using the DW_TAG_user range has too much of a potential for conflicts, we could also allocate an adjacent range.

> 
>> My current suggestion is to use the end of the DW_TAG_user range for this purpose:
>> ```
>>     enum ComplexAddrKind {
>>       OpFirst = dwarf::DW_TAG_hi_user - dwarf::DW_OP_hi_user,
>>       OpPlus  = OpFirst + dwarf::DW_OP_plus,
>>       OpDeref = OpFirst + dwarf::DW_OP_deref,
>>       OpPiece = OpFirst + dwarf::DW_OP_piece,
>>       OpLast = OpPiece
>>     };
> 
> The reason I shy away from this is that I think, as I mentioned, it confuses what these values are - they're not tags, they're just identifiers. (I mean I suppose the same is true of DW_TAG_arg_variable/param_variable or whatever it is we added that are non-standard 'tags' - but that's a bit more justified because they're in the same part of the schema - they have to be unique there/in the same range)
> 
>> 
>> ```

================
Comment at: include/llvm/CodeGen/MachineModuleInfo.h:394
@@ -392,3 +393,3 @@
   /// information of a variable.
-  void setVariableDbgInfo(MDNode *N, unsigned Slot, DebugLoc Loc) {
-    VariableDbgInfo Info = { N, Slot, Loc };
+  void setVariableDbgInfo(MDNode *Expr, MDNode *Var,
+                          unsigned Slot, DebugLoc Loc) {
----------------
dblaikie wrote:
> I'd probably put the Var at the start - sort of the broadest thing first. But it's not a strong preference & certainly no great argument from consistency. (& I won't repeat this feedback for all the various similar APIs that have the same ordering choice)
Whichever way we go, we _have_ to be consistent. The fact that both are MDNode pointers is just asking for bugs.
I went with this order to be consistent with the intrinsic
llvm.dbg.value(%val, i64 offset, !metadata Var)
which I extended to 
(1) llvm.dbg.value(%val, i64 offset, !metadata Expr, !metadata Var)

If we go with your suggestion, it would be better to also change the intrinsic to
(2) llvm.dbg.value(%val, i64 offset, !metadata Var, !metadata Expr)
but that is inconsistent with the position of offset, unless we change it to 
(3) llvm.dbg.value(%val, !metadata Var, i64 offset, !metadata Expr)

[That said, one day in the future we may want to roll the offset into Expr, so maybe we can skip (3) as an intermediate step.]

My preference is (1) simply because it is consistent with the previous intrinsic.

================
Comment at: include/llvm/IR/DebugInfo.h:722
@@ +721,3 @@
+/// DIExpression - A complex location expression.
+class DIExpression : public DIDescriptor {
+  friend class DIDescriptor;
----------------
dblaikie wrote:
> Does this benefit/need to be a DIDescriptor? I suppose in dumping code it might be handy (?) but generally we're never going to have a generic DIDescriptor and have to ask if it's a DIExpression. There's only one place we'll find them while navigating the debug info metadata - and that one place there won't be anything else we expect to find.
You answered that question yourself :-)
Yes, it is necessary for dump(), and dump() is essential for the testsuite.

================
Comment at: include/llvm/IR/Metadata.h:34
@@ -33,3 +33,3 @@
 enum LLVMConstants : uint32_t {
-  DEBUG_METADATA_VERSION = 1  // Current debug info version number.
+  DEBUG_METADATA_VERSION = 2  // Current debug info version number.
 };
----------------
dblaikie wrote:
> Did we ever figure out the answer to rolling the version number? I assume this is one of the reasons you have so much test fallout - is there any way we can commit this change separately? (I realize that'll leave a revision or two that don't make sense) just from a review-ability perspective? (not that I've looked over the tests - but it'd possibly be nicer to be able to at least eyeball a few that actually have schema changes versus those that just need the revision number updated)
IMO we have to bump the version number together with a change that changes the metadata layout. Otherwise, if we had
r1 - base
r2 - update metadata
r3 - bump version

users that LTO a module compiled with r1 and a module compiled with r2 will get an assertion/crash. Now, you might argue that we don't need to cater for users living off trunk, but it would also be hard to tell which commits warrant a version bump when, e.g, cherry-picking to an llvm-release branch. Whatever consensus we reach, please let's document it somewhere :-)

================
Comment at: lib/CodeGen/AsmPrinter/AsmPrinter.cpp:619
@@ -618,3 +618,3 @@
 
-  DIVariable V(MI->getOperand(2).getMetadata());
+  DIVariable V = MI->getDebugVariable();
   if (V.getContext().isSubprogram()) {
----------------
dblaikie wrote:
> This looks like cleanup (could be a separate, precursor patch), no?
Agreed!

================
Comment at: lib/IR/DIBuilder.cpp:1087
@@ +1086,3 @@
+/// createExpression - Create an empty DIExpression.
+DIExpression DIBuilder::createExpression() {
+  return DIExpression(MDNode::get(VMContext, ArrayRef<Value *>()));
----------------
dblaikie wrote:
> What's this for? (I haven't seen any callers)
> 
> If needed, I'd just implement it by having the above function default the argument to None:
> 
> createExpression(Arrayref<Value *> Addr = None)
Ah, that does look much better, yes! Thanks.

http://reviews.llvm.org/D4919