[PATCH] Rewrite calculateDbgValueHistory to make it (hopefully) more transparent.

Wed May 7 10:58:03 PDT 2014

================
Comment at: lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.cpp:32
@@ +31,3 @@
+static Optional<unsigned> isDescribedByReg(const MachineInstr &MI) {
+  if (!MI.isDebugValue())
+    return None;
----------------
Is this ever called with a non_DBG_VALUE? It doesn't look like it. Should this just be an assertion?

================
Comment at: lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.cpp:37
@@ +36,3 @@
+  // the location of variable is unknown.
+  if (!MI.getOperand(0).isReg() || MI.getOperand(0).getReg() == 0U)
+    return None;
----------------
Is the first operand ever not a register? I thought that was an invariant of these intrinsics.

So it looks like isDescribedByReg boils down to "return MI.getOperand(0).getReg()", and some assertions?

(assuming the return type changes to "unsigned" rather than "Optional<unsigned>" - correctly returning 0/false when there is no register use (ie: this is a range terminator))

================
Comment at: lib/CodeGen/AsmPrinter/DbgValueHistoryCalculator.cpp:140
@@ +139,3 @@
+    const auto LastMBBInstr = MBB.getLastNonDebugInstr();
+    bool SeenLastMBBInstr = LastMBBInstr == MBB.end();
+
----------------
Alexey Samsonov wrote:
> David Blaikie wrote:
> > This variable and its use seem confusing to me.
> > 
> > So the following loop loops over all MI in the MBB, does work if it's a debug_value, unless it's a trailing debug_value (a debug_value after all non-debug_value instructions in the block).
> > 
> > Do we have a lot of those? Is that worth a special case? I mean we should certainly prune zero-length location ranges, but they could arise from non-trailing debug_value anyway, right?
> > 
> Yes, this happens. Suppose we have the following machine basic block:
> 
>   <begin MBB>
>   %R15 = ...
>   DBG_VALUE %R15, %noreg, !"var"
>   <end MBB>
> 
> Here we don't need to add zero-length location range for "var", and it's easier to just drop the range at this point.
> 
> In the history produced by this algorithm, location range is started by DBG_VALUE instruction and is terminated by some following instruction. If we start the new range by adding 
>   DBG_VALUE %R15, %noreg, !"var"
> to the history, how do we terminate it? Nothing follows it in the same machine basic block.
> 
"easier" in terms of computational work, but not necessarily easier in terms of being able to understand this algorithm, right?

I'd like to make this as legible as possible, because it is rather confusing. The comment you opened this patch with described a fairly clear algorithm, but that's not visible to me in the code, which is really unfortunate.

That being said, I know this is just a precursor to other refactoring, so I shouldn't dwell on making this "perfect".

The issue you describe here where there's no way to terminate an instruction would come up with intermediate values too:

  DBG_VALUE %R15, %noreg, !"var"
  DBG_VALUE %noreg, 0, !"var"
  ...

While I realize that's probably not a thing we ever create, it seems like having the algorithm tolerate that general case would probably make it more legible. (I don't know for sure, I haven't written it out that way)

Plucking ideas out of the air, maybe having a "pending start of range" list that is built whenever a DBG_VALUE is seen, but isn't added to the real list until a non-DBG_VALUE instruction is seen (that instruction would then form the start of the range).

That way the above example falls out by replacing (Or removing, in the case of the end-of-range %noreg) elements in the pending list, then throwing out the whole pending list when we get to the end, since anything in there never saw an instruction to start the range at. (and if we do see an instruction we build the start-of-real-range for each element in the pending list and clear out the pending list)

Alternatively if ranges can cover DBG_VALUE intrinsicts (since they start at the intrinsic, rather than the next instruction) - what actually breaks if they /only/ cover DBG_VALUE intrinsics? (can we just build these ranges, even if they're silly - to simplify the algorithm - or is there something that breaks if we do?)

http://reviews.llvm.org/D3597