[llvm-dev] [RFC] DebugInfo: A different way of specifying variable locations post-isel

Wed May 27 04:12:25 PDT 2020

Hi llvm-dev@,

Time for an update on this work. The prototype I threw together seems
to be working well: I'm observing the sort of changes in variable
locations that I was expecting (more below). There's also new
location-maintenance required for several post-regalloc passes, which
adds new complexity to calculating variable locations, but I'm
confident that it'll be worth it.

Firstly, to illustrate the benefits, I've attached two IR files from
functions in clang-3.4. I haven't reduced them as the point is that
they're "real world" code. For sample1.ll, an output-ptr-argument in
%agg.result is GEP'd, the result of which is the operand of a
dbg.value and a store:

  entry:
    [...]
    %2 = getelementptr %agg.result, i64 0, i32 2 [...]
    [...]
    %.cast.i.i = bitcast %union.anon* %2 to i8*
    call void @llvm.dbg.value(i8* %.cast.i.i, ...)
    store i8 0, i8* %.cast.i.i, align 8
    [...]

Instruction scheduling during SelectionDAG chooses to place the
'store' immediately after the GEP. The vreg for the store address is
then dead at the point where the DBG_VALUE uses it, and the location
is dropped by LiveDebugVariables, missing about five instructions of
coverage where the value is available in a register. This is hard to
fix in today's representation, because register allocation doesn't
guarantee anything about positions where a register is dead (AFAIUI).
The instruction referencing work successfully tracks which instruction
computes the GEP's Value, and picks a register location when
LiveDebugValues runs.
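
As a rough illustration of the difference, here is a toy Python model
(not LLVM code; every name, position and register in it is made up for
this sketch). It assumes the GEP is at position 1, the store at
position 2, the DBG_VALUE at position 3, and that the physical register
holding the GEP result is not overwritten until position 8:

```python
# Toy model only: positions are instruction indices within one block.
# Live range of the vreg as liveness sees it: (def, last use).
VREG_LIVE = {"%2": (1, 2)}  # GEP at 1, store (the last use) at 2

# Assumed register contents after regalloc:
# (register, position defined, position clobbered, value held).
PHYS_VALUES = [("$rax", 1, 8, "inst1")]

def vreg_location(vreg, pos):
    """vreg-based scheme: a location exists only while the vreg is live."""
    start, end = VREG_LIVE[vreg]
    return vreg if start <= pos <= end else None

def instr_ref_location(value, pos):
    """Value-based scheme: find any register still holding the value."""
    for reg, start, clobber, held in PHYS_VALUES:
        if held == value and start <= pos < clobber:
            return reg
    return None

DBG_VALUE_POS = 3
print(vreg_location("%2", DBG_VALUE_POS))          # vreg is already dead
print(instr_ref_location("inst1", DBG_VALUE_POS))  # value is still in $rax
```

In this model the vreg lookup has nothing to report at the DBG_VALUE,
while the lookup keyed on the defining instruction still finds the
register until it is clobbered.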

sample2.ll has this "cleanup" block, where %Current.146 is GEPed twice:

  cleanup:
    %Current.146 = phi %"class.llvm::Use"* [blah]
    [...]
    %incdec.ptr10 = getelementptr %Current.146, i64 1
    call void @llvm.dbg.value(%incdec.ptr10, [...]
    [...]
    %Val = getelementptr %Current.146, i64 1, i32 2
    %4 = load i64, i64* %Val
    %5 = trunc i64 %4 to i32, !dbg !1558
    [...]

This doesn't suffer from liveness problems, but jumping to immediately
after PHI elimination, we get (on amd64):

  %9:gr64 = COPY killed %34:gr64
  %10:gr64 = nuw ADD64ri8 %9:gr64(tied-def 0), 24
  DBG_VALUE %10:gr64, $noreg, !"this",
  %26:gr32 = MOV32rm killed %9:gr64, 1, $noreg, 40

The ADD64ri8 is the first GEP in the IR above, the MOV32rm is the
second GEP and load folded together. While rewriting the tied-def
instruction, the two-address-instruction pass sinks the ADD64ri8 to
reduce register pressure, producing this:

  %9:gr64 = COPY killed %34:gr64
  DBG_VALUE %10:gr64, $noreg, !"this",
  %26:gr32 = MOV32rm %9:gr64, 1, $noreg, 40
  %10:gr64 = nuw ADD64ri8 killed %9:gr64(tied-def 0), 24

Here the DBG_VALUE refers to %10 before it's defined. During register
coalescing, %9 and %10 are merged, after which the DBG_VALUE refers to
the result of the COPY, which has the wrong value. The instruction
referencing work drops this wrong location. It could instead be
recovered as a debug use-before-def, but I haven't completely
implemented that yet.
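
The following toy Python sketch (again not LLVM code; the block
encoding, value names and the "use-before-def" result are assumptions
made for illustration, not the prototype's behaviour) models the
sunk-ADD case after coalescing. A lookup keyed on the register reports
the COPY's value, while a lookup keyed on the value can tell that the
value has not been defined yet:

```python
# Each entry: (label, register written, value produced).
# After coalescing, %9 and %10 share one register, called "r" here.
block = [
    ("COPY",      "r",   "v_copy"),
    ("DBG_VALUE", None,  None),      # wants the ADD64ri8 result, "v_add"
    ("MOV32rm",   "r26", "v_load"),
    ("ADD64ri8",  "r",   "v_add"),
]

def register_based(block, dbg_index, reg):
    """Report whatever value the register holds at the DBG_VALUE."""
    held = None
    for _label, written, value in block[:dbg_index]:
        if written == reg:
            held = value
    return held

def value_based(block, dbg_index, wanted):
    """Return ('live', reg), ('use-before-def', def index) or ('dropped', None)."""
    reg_contents = {}
    for _label, written, value in block[:dbg_index]:
        if written is not None:
            reg_contents[written] = value
    for reg, value in reg_contents.items():
        if value == wanted:
            return ("live", reg)
    # Not available yet: look for a later definition in the same block.
    for i in range(dbg_index + 1, len(block)):
        if block[i][2] == wanted:
            return ("use-before-def", i)
    return ("dropped", None)

print(register_based(block, 1, "r"))   # the COPY's value, which is wrong
print(value_based(block, 1, "v_add"))  # defined later, at index 3
```

A consumer of the second result could either drop the location or
start it at the later definition.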

That's the good news: this functionally appears to work. Some
random remarks:
 * Variable coverage (as per llvm-locstats) is currently slightly
down, likely just due to bugs I've written in,
 * LiveDebugVariables does appear to work without equivalence classes;
in this use-case it only needs to track a register at one SlotIndex,
 * I haven't looked at compile times (yet),
 * I don't currently have a feeling as to whether variable coverage
will end up better or worse: bad locations being correctly dropped
might out-number newly preserved locations.

The bad news is the increased analysis required. It turns out
basic-block placement often uses tail duplication; and, if a
duplicated block used to contain a PHI location, this destroys the
SSA-like form I was relying on as there's no single-definition point.
Happily the SSAUpdater utility can easily patch this up, but it could
be expensive, and it's uncomfortable to use so late in compilation.
More on this some other time.

The modifications to LiveDebugValues I mentioned work too, but it has
become a reaching-definition analysis rather than a simpler dataflow
one. This is because it needed to be able to identify PHI locations,
but also because this whole idea hinges on being able to track values
at the end of codegen, and coverage was not sufficient without
tracking _all_ the locations a value may be in. Independently of the
instruction referencing work, the stronger LiveDebugValues'
performance matches current performance when stage2-building clang.
I'll write about this in a different email, later.
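
To give a flavour of what "tracking all the locations a value may be
in" involves, here is a deliberately simplified Python sketch. It is
not the LiveDebugValues implementation: the instruction encoding, the
location names and the rule that a location missing in any predecessor
is simply untracked are all assumptions made for this example.

```python
def transfer(locs, instrs):
    """Apply one block's instructions to a {location: value} map."""
    locs = dict(locs)
    for op, dst, src in instrs:
        if op == "def":        # dst now holds a brand-new value named src
            locs[dst] = src
        elif op == "copy":     # dst now holds whatever src holds
            if src in locs:
                locs[dst] = locs[src]
            else:
                locs.pop(dst, None)
        elif op == "clobber":  # dst now holds something untracked
            locs.pop(dst, None)
    return locs

def join(block_name, pred_outs):
    """Merge predecessor out-states; disagreement yields a PHI value."""
    merged = {}
    all_locs = set().union(*(p.keys() for p in pred_outs))
    for loc in sorted(all_locs):
        values = {p.get(loc) for p in pred_outs}
        if None in values:
            continue           # untracked on some path: keep it untracked
        if len(values) == 1:
            merged[loc] = values.pop()
        else:
            merged[loc] = ("phi", block_name, loc)
    return merged

def locations_of(locs, value):
    """Every location currently holding the given value."""
    return sorted(loc for loc, v in locs.items() if v == value)

out = transfer({}, [
    ("def", "$rdi", "v1"),
    ("copy", "$rbx", "$rdi"),
    ("copy", "stack0", "$rbx"),
    ("clobber", "$rdi", None),
])
print(locations_of(out, "v1"))
```

Because the copies are followed, the value stays findable in $rbx and
the stack slot after $rdi is clobbered, and join() shows where a PHI
value would have to be introduced when predecessors disagree.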

In conclusion: it's looking like this will work. The first change that
could land would be the LiveDebugValues modifications, which have some
independent benefits. I'll write about that in a separate email.

--
Thanks,
Jeremy
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sample1.ll.gz
Type: application/gzip
Size: 24288 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200527/8ada52b5/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sample2.ll.gz
Type: application/gzip
Size: 22776 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200527/8ada52b5/attachment-0003.bin>