[LLVMdev] Improving the quality of debug locations / DbgValueHistoryCalculator

Tue Jun 23 16:41:43 PDT 2015

Here is a proposal for improving DbgValueHistoryCalculator and the
overall quality of debug locations.

Focus: This is about lowering the DBG_VALUE machine instructions to
DWARF location lists.

Non-focus: This is not about (typical -O0) variables that permanently
reside at a frame index and are described with dbg.declare intrinsics
in the IR. These variables are stored in the MMI side-table and
processed separately.

The semantics of DBG_VALUE
==========================

Examples:

	DBG_VALUE %EAX, %noreg, !"a", <<0x7fd4bbc17470>>; line no:3
        DBG_VALUE %RBX, 56, !"a", <<0x7ffd5ac3b3c0>>; line no:4 indirect
        DBG_VALUE 0, 0, !"a", <<0x7fd4bbc17470>>; line no:3

The DBG_VALUE machine instruction informs us that a variable (or a
piece of it) is henceforth stored in a specific location or has a
constant value. This is valid until the next DBG_VALUE instruction
describing the same variable or the location is being clobbered.

Lowering today
==============

1. DbgValueHistoryCalculator takes the MachineFunction graph and
   produces a list of live ranges for each variable (the
   DbgValueHistoryMap).

   Note: Variable live ranges are to be consumed by a debugger. They
   refer to the physical addresses and are agnostic of control flow.

   The live ranges produced by DbgValueHistoryCalculator are pairs of
   MachineInstructions, the first of which is always the DBG_VALUE
   describing the variable and location.

   The ranges are calculated as follows:
   - If the location is a register the range extends until the
     register is clobbered or until the end of the basic block.

   - If the location is a constant the range extends until the next
     DBG_VALUE instruction describing the same variable.

   - If the location is indirect and the register is not clobbered
     outside the function prologue and epilogue the the range is the
     entire function. This is a heuristic to make stack
     frame-allocated variables work better. Otherwise the range
     extends until the next DBG_VALUE or the end of the basic block.

2. The buildLocationList() function takes the list of ranges of one
   variable and builds a location list. A variable may consist of many
   pieces which have their own live ranges. A live ranges for a piece
   is split up so that no piece's live range starts or ends in the
   middle of another piece's live range.  Any two consecutive ranges
   with identical location contents are merged.  Labels are requested
   for start and end of each range.

3. The location list is finalized by lowering it into DWARF and
   emitted into a buffer.

Shortcomings with the current apporach
======================================

Problems with the current approach include inaccurate live ranges for
constant values, ranges ending too early (usually at basic block
boundaries), poor handling of frame-register-indirect variables that
were introduced by spill code, poor handling of variables that are in
more than one location at once.

A better DbgValueHistoryCalculator
==================================

Currently DbgValueHistoryCalculator does a very simple, linear pass
through all MachineInstructions, with a couple of heuristics to make
common cases such as the frame index variables work.

It would be possible to address many of the current problems by having
earlier passes emit many more DBG_VALUE instructions. There are
several problem with this approach though: It distributes the
complexity across many more passes and thus imposes a maintenance
burden on authors of prior passes, increases machine code size, and
stores a lot of redundant information.

What I'm proposing instead is to make DbgValueHistoryCalculator
smarter at creating ranges. The goal is to make hack^H^H^Heuristics such as
the frame index handling unnecessary by correctly propagating DBG_VALUE
liveness across basic block boundaries.

To illustrate what I mean I'll use a notation where
DbgValueHistoryCalculator is defined in terms of a data-flow analysis.

First a couple of data types used in the following pseudo code:
- A range is a pair of MachineInstructions (start, end) that both belong
  to the same basic block.
- range[var] is the final list of live ranges for a variable.
- open_ranges[BB][var] is the list of not-yet-terminated ranges for
  var inside the current basic block BB.
- outgoing_loc[BB][var] is the list of locations for var valid at
  the end of a basic block BB.

// This code is probably buggy because I didn't run it through a
// compiler yet, but I hope it serves to illustrate my point.

// Visit a machine instruction.
transfer(MInsn) {
  // A DBG_VALUE marks the beginning of a new range.
  if (MInsn is a DBG_VALUE(var, loc)) {

    for ((rvar, start) : open_ranges[BB])
      // A DBG_VALUE terminates a range started by a previous
      // DBG_VALUE for the same variable, if the described pieces
      // overlap.
      if (var == rvar && piece_overlaps(MInsn, start)) {
        ranges[var].push_back((start, MInsn));
        open_ranges[BB][rvar].remove(start));
      }
    open_ranges[BB][var].push_back(MInsn)
  }

  // A def of a register may mark the end of a range.
  if (MInsn is a def(reg)) {
    for ((var, start) : open_ranges[BB])
      if (start.loc == reg) {
        ranges[var].push_back((start, MInsn));
        open_ranges[BB][var].remove(start));
      }

  // End all ranges in the current basic block.
  if (MInsn.isTerminator())
    for (range : open_ranges[BB]) {
      ranges[var].push_back(range);
      open_ranges[BB].remove(range);
      outgoing_loc[BB][range.var].push_back(range.loc);
      changed = true;
   }
}

// Visit the beginning of a basic block BB with two predecessors.
join(BB, BB1, BB2) {
  for ((var1, loc1) : outgoing_loc[BB1])
    for (loc2 : outgoing_loc[BB2][var1])
      // It's only safe to propagate a range if all predecessors end
      // with the same location.
      if (loc1.kind == loc2.kind &&
          loc1.val  == loc2.val)
        // Not sure how to best implement this.
        open_ranges[BB][var].push_back(new DBG_VALUE(loc1));
}

analyze(MF) {
  for (BB : MF)
    workset.push_back(BB);

  while (!workset.empty()) {
    changed = false;
    for (MI : workset.pop_front())
      transfer(MI)
    if (changed)
      workset.append(BB.successors())
   }
}

A couple of observations about the pseudo-code above: The analysis
terminates, because ranges is write-only, making this effectively a
bit-vector problem. It is also safe, because we are only propagating
ranges into the next basic block if all predecessors have the same
outgoing location for a variable piece.

One problem that is not addressed by the approach above is how to
become better at handling variables that are in more than one location
at once: As Keno noted on llvm-commits recently, the fundamental
problem is that DbgValueHistoryCalculator cannot safely distinguish
between a DBG_VALUE that provides an additional valid location and one
describing an updated location for a variable. IMO this is best
addressed on a case-by-case basis in the pass that introduces the second
DBG_VALUE, either by marking it as alternative location or not emitting
it at all, but I’m open for suggestions.

Comments?

-- adrian