[llvm-dev] [DebugInfo] A value-tracking variable location update
David Blaikie via llvm-dev
llvm-dev at lists.llvm.org
Fri Nov 6 11:10:42 PST 2020
Awesome to read how it's coming along - I'm mostly on the sidelines of the
debug location work, but had just one or two clarifying questions.
On Fri, Nov 6, 2020 at 10:27 AM Jeremy Morse
<jeremy.morse.llvm at gmail.com> wrote:
>
> Hi debug-info folks,
>
> Time for another update on the variable location "instruction referencing"
> implementation I've been doing, see this RFC [0, 1] for background. It's now at
> the point where I'd call it "done" (as far as software ever is), and so it's a
> good time to look at what results it produces. And here are the
> scores-on-the-doors using llvm-locstats, on clang-3.4 RelWithDebInfo first in
> "normal" mode and then with -Xclang -fexperimental-debug-variable-locations.
> "normal":
>
> =================================================
>   cov%           samples    percentage(~)
> -------------------------------------------------
>   0%              765406     22%
>   (0%,10%)         45179      1%
>   [10%,20%)        51699      1%
>   [20%,30%)        52044      1%
>   [30%,40%)        46905      1%
>   [40%,50%)        48292      1%
>   [50%,60%)        61342      1%
>   [60%,70%)        58315      1%
>   [70%,80%)        69848      2%
>   [80%,90%)        81937      2%
>   [90%,100%)      101384      2%
>   100%           2032034     59%
> =================================================
> -the number of debug variables processed: 3414385
> -PC ranges covered: 61%
> -------------------------------------------------
> -total availability: 64%
> =================================================
>
> With instruction referencing:
>
> =================================================
>   cov%           samples    percentage(~)
> -------------------------------------------------
>   0%              751201     21%
>   (0%,10%)         40708      1%
>   [10%,20%)        44909      1%
>   [20%,30%)        47544      1%
>   [30%,40%)        41630      1%
>   [40%,50%)        42742      1%
>   [50%,60%)        56692      1%
>   [60%,70%)        53796      1%
>   [70%,80%)        64476      1%
>   [80%,90%)        73836      2%
>   [90%,100%)       74423      2%
>   100%           2123749     62%
> =================================================
> -the number of debug variables processed: 3415706
> -PC ranges covered: 68%
> -------------------------------------------------
> -total availability: 64%
> =================================================
>
> The first observation: a significant increase in the byte-coverage statistic,
> meaning that we're able to track variable locations for longer and across more
> code. This was one of the main aims of this work, having better tracking of
> the locations that we know. The increase of seven percentage points includes an
> additional two percentage points of entry-value locations. If we disable entry
> value production then the scope-bytes-covered statistic moves from 59% to 64%,
Was this meant to be "from 64% to 59%"?
How does that compare to the baseline no-entry-value number?
Could you give a quick summary of the distinction between "PC ranges
covered" and "total availability"?
> which is still a decent improvement.
>
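For anyone following along, the percentage(~) column in both tables can be reproduced from the sample counts. A minimal sketch, assuming each percentage is the bucket's share of all variables truncated to a whole number; that assumption is inferred from the quoted rows, not checked against llvm-locstats itself, and the names below are made up for illustration:

```python
# Sample counts per coverage bucket, copied from the two tables quoted above.
BUCKETS = ["0%", "(0%,10%)", "[10%,20%)", "[20%,30%)", "[30%,40%)", "[40%,50%)",
           "[50%,60%)", "[60%,70%)", "[70%,80%)", "[80%,90%)", "[90%,100%)", "100%"]
NORMAL = [765406, 45179, 51699, 52044, 46905, 48292,
          61342, 58315, 69848, 81937, 101384, 2032034]
INSTR_REF = [751201, 40708, 44909, 47544, 41630, 42742,
             56692, 53796, 64476, 73836, 74423, 2123749]


def bucket_percentages(samples):
    """Return each bucket's share of all variables as a truncated integer percent.

    Truncation (rather than rounding) is an assumption inferred from the tables.
    """
    total = sum(samples)
    return [count * 100 // total for count in samples]


if __name__ == "__main__":
    rows = zip(BUCKETS, bucket_percentages(NORMAL), bucket_percentages(INSTR_REF))
    for name, before, after in rows:
        print(f"{name:<12}{before:3d}%  ->{after:3d}%")
```

The counts sum to the "number of debug variables processed" in each table, and the movement is concentrated in the last row: the 100% bucket grows from 2032034 to 2123749 variables while every other bucket shrinks.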
> The next observation is that the "total availability" of variables hasn't
> changed. This isn't the full story -- if you give an absolute name to every
> variable with a location in the clang binary, there are 6949 dropped locations
> and 22564 completely new locations, meaning roughly 1% of all variables in the
> program have changed, it's just hidden by the statistics rounding. More detail
> on the nature of the changes is below. I was hoping for more false locations
> to be dropped; it's quite likely that there are many more false locations
> dropped within variables that have more than one value, which aren't readily
> reflected in these statistics.
>
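To make that comparison concrete: once every variable has an absolute name, the dropped and new locations are the two set differences between the builds, and the quoted counts show why the headline figure doesn't move. A rough sketch; the variable names in the demo are hypothetical, only the three counts come from the email, and it assumes total availability is measured against the number of variables processed:

```python
def diff_locations(baseline, experimental):
    """Return (dropped, new) from the sets of variable names that have a location in each build."""
    return baseline - experimental, experimental - baseline


# Hypothetical names, purely to show the shape of the comparison.
dropped, new = diff_locations({"f:a", "f:b", "g:x"}, {"f:a", "g:x", "g:y", "h:z"})

# Counts quoted in the email above.
DROPPED, NEW, TOTAL_VARIABLES = 6949, 22564, 3415706

changed_pct = 100 * (DROPPED + NEW) / TOTAL_VARIABLES    # variables whose availability changed
net_gain_pct = 100 * (NEW - DROPPED) / TOTAL_VARIABLES   # net effect on availability

if __name__ == "__main__":
    print(sorted(dropped), sorted(new))
    print(f"changed: {changed_pct:.2f}%  net gain: {net_gain_pct:.2f}%")
```

The net gain comes out under half a percentage point, which is small enough to vanish when availability is printed as a whole percent.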
> A natural question is: are all these new locations wrong, and the dropped
> locations only dropped because of bugs? To address that, I picked 20 new
> locations and 20 dropped locations at random and analysed why they happened.
> The input samples can be found here [2], along with an llvm-reduce'd version of
> each IR file. I confirmed the reason for the new/dropped location in the
> reduced and original file, as llvm-reducing them can alter the reason why
> something is dropped or not. Of the new locations, we previously could not
> track the location because:
> * 14 DBG_VALUEs come after the vreg operand is out of liveness and are dropped
> by LiveDebugVariables.
> * 2 DBG_VALUEs are out of liveness and dropped by RegisterCoalescing
> out of conservativeness.
> * 2 DBG_VALUEs that appear before their operand is defined. This is out of
> liveness; instruction referencing saves them by preserving debug
> use-before-defs.
> * 2 DBG_VALUEs that are out of liveness after a branch, but the value is live
> down the other branch path.
>
> All of these locations can be tracked with instruction referencing because
> liveness is not a consideration, only availability in physical registers. 19 of
> the new locations were correct, while one tracked the right value but picked
> the wrong location for it, which I've now got a patch for.
>
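My reading of the liveness-versus-availability distinction, as a toy model rather than anything resembling the LLVM code: a liveness-based rule gives up on a value once nothing reads it any more, while an availability-based rule keeps reporting it for as long as some physical register still holds it. Everything here (the Instr type, both functions, the register names) is hypothetical and written in Python only for brevity:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Instr:
    value: str                   # name of the value this instruction defines
    reg: str                     # physical register the value is written to
    uses: Tuple[str, ...] = ()   # values read by this instruction


def liveness_location(block, value, at):
    """Toy liveness rule: keep the location only if `value` is read at or after index `at`."""
    home = next((ins.reg for ins in block[:at] if ins.value == value), None)
    still_live = any(value in ins.uses for ins in block[at:])
    return home if still_live else None


def availability_location(block, value, at):
    """Toy availability rule: report a register that still holds `value` just before index `at`."""
    contents = {}
    for ins in block[:at]:
        contents[ins.reg] = ins.value   # a later write to the same register clobbers the old value
    return next((reg for reg, held in contents.items() if held == value), None)


BLOCK = [
    Instr("v1", "rax"),
    Instr("v2", "rbx", uses=("v1",)),   # last read of v1
    Instr("v3", "rcx", uses=("v2",)),
    Instr("v4", "rax", uses=("v3",)),   # overwrites rax, so v1 is gone after this
]
```

In this model a debug reference to v1 placed just before the last instruction has no location under the liveness rule, because v1 is no longer read, but is still found in rax under the availability rule. After the final write to rax both rules report nothing.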
> For the dropped locations:
> * 8 false locations are dropped; they used to refer to the wrong value because
> of a failure in register coalescing, see the body of [3].
Would these issues ^ show up/be testable with Dexter?
> * 3 locations are unnecessarily dropped when different subregisters are
> merged together in register coalescing.
> * 3 locations are unnecessarily dropped due to conservative tracking of PHI
> values (the code in D86814, can be fixed with more C++).
> * 2 of the samples didn't actually have a dropped location; instead they
> preserved an undef debug instruction in early-taildup, and my scripts picked
> this up as dropping a location.
> * 2 locations aren't tracked by InstrRefBasedLDV through a block that's
> out of scope, meaning the location never covers instructions that are in
> scope. VarLocBasedLDV is vulnerable to this too, but MachineSink can drop a
> DBG_VALUE on the far side of the scope gap, saving the location. See
> "Limitations" below.
> * 2 locations dropped during tail duplication: one in early-taildup which
> I haven't tried to address yet (see "Limitations"), one in late taildup
> where a block containing only debug instructions isn't correctly duplicated.
>
> To summarise: all the new locations found were correct and not trackable by
> DBG_VALUE variable-location tracking, although there are some bugs in picking
> locations. Roughly half of the dropped locations are actual false locations,
> the other half are due to unimplemented or limited handling of optimisations in
> the instruction referencing code so far.
>
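Tallying the two samples as I read them, to check the "roughly half" figure. The category labels are my own shorthand for the bullets above; the counts are the ones quoted:

```python
# Why each of the 20 sampled new locations could not be tracked before.
NEW_REASONS = {
    "out of liveness, dropped by LiveDebugVariables": 14,
    "out of liveness, dropped by RegisterCoalescing": 2,
    "DBG_VALUE appears before its operand is defined": 2,
    "out of liveness after a branch": 2,
}

# Why each of the 20 sampled dropped locations went away.
DROPPED_REASONS = {
    "false location from a register coalescing failure": 8,
    "subregisters merged in register coalescing": 3,
    "conservative tracking of PHI values": 3,
    "not really dropped (undef preserved in early-taildup)": 2,
    "block out of scope": 2,
    "tail duplication": 2,
}


def share(count, total):
    """Return count as a percentage of total."""
    return 100 * count / total


new_total = sum(NEW_REASONS.values())
dropped_total = sum(DROPPED_REASONS.values())
false_count = DROPPED_REASONS["false location from a register coalescing failure"]
not_dropped = DROPPED_REASONS["not really dropped (undef preserved in early-taildup)"]

correct_new_share = share(19, new_total)                               # 19 of the 20 new locations
false_share = share(false_count, dropped_total)                        # share of all 20 samples
false_share_of_real = share(false_count, dropped_total - not_dropped)  # share of the 18 real drops
```

So 8 of the 20 dropped samples (40%) are false locations, or 8 of 18 if the two samples that weren't really dropped are excluded.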
> This pretty much fulfils the objective of this work: we're able to save a lot
> more variable locations through the register allocator because we don't have to
> be so conservative about liveness. Plus, the default behaviour of all
> optimisations now is to _drop_ a variable location, as opposed to the existing
> situation where after we leave SSA form, all bets are off.
>
> Another question is how much this costs in compile time: a clang-3.4 build
> using instruction referencing on my otherwise idle machine usually tracks
> within 2% of a normal build. This is IMO expected given the larger amount of
> debugging information being produced, and I haven't closely studied the
> performance of a whole build using instruction referencing yet, so it'll
> probably get better. A more recent change to InstrRefBasedLDV has added a big
> slowdown though, so I'm going to skip reporting any performance results for
> now.
>
> Current situation
> =================
>
> Some of this work has landed; I've got some patches up for review [4] that
> implement the core parts. I also have a long tail of tweaks and
> location-salvaging in a tree here [5] which just fleshes out more optimisation
> passes and installs bugfixes. (Commits there are not written to be human
> consumable, alas). There are no fatal flaws in the design as far as I'm aware,
> although there are some annoyances (see "Limitations").
>
> The biggest problem is that this all relies on a new LiveDebugValues
> implementation that doesn't have sufficient test coverage yet, and is still
> Somewhat Experimental (TM). Given the number of times an unpleasant performance
> cliff has been found in VarLoc LiveDebugValues, it wants a long time to soak in
> before being deployed.
>
> Limitations
> ===========
>
> Here's a non-exhaustive list of known problems. None of them are fatal IMO,
> and they have only a small effect on variable availability:
> * Early tail duplication: like late tail duplication, this tears apart SSA
> information and can cause the same "Value" to be defined twice. This is
> solvable using the SSAUpdater utility, which early-taildup already uses.
> * Attaching a debug instruction number to a COPY instruction is highly
> undesirable because the COPY doesn't actually define a value, it just moves
> it between locations. At least one optimisation (X86 LEAtoMOV) transforms
> instructions into COPYs (LEA $rsp + 0 => COPY $rsp), which is unfortunate.
> This doesn't happen a lot though, and can be fixed by dropping a DBG_PHI
> of the COPYd register nearby. Plus it only happens post-regalloc, which
> makes it less of a problem.
> * Trivial def rematerialization: there's no pattern to rely on in how the
> register allocator rematerializes values, and so values can rematerialize
> in different registers dominating different parts of the CFG. It's hard to
> track the variable location after that, because it has multiple values in
> the eyes of InstrRefBasedLDV. My preference would be, seeing how these defs
> are effectively constants, to have the target describe such trivial defs
> in a DIExpression. That avoids having to track the location of a constant
> that we already know.
> * As mentioned in the "missing" variable locations list, gaps in lexical
> scopes can lead to locations not being propagated sufficiently far, a
> problem for both variable-location tracking solutions as documented in
> PR48091. However, using DBG_VALUEs to track variable locations can save a
> few of them because MachineSink can sink DBG_VALUEs over the scope gap,
> whereas instruction-referencing tries to rely on tracking debug
> use-before-defs which don't propagate across scope gaps. More on how to
> resolve this in PR48091.
>
> Next Steps
> ==========
>
> While this isn't ready for general use yet, it'd be great to get as much as
> possible into llvm-12 behind the -Xclang
> -fexperimental-debug-variable-locations flag. That eases the path to testing
> for consumers, which gives a greater chance of finding worst-case slowdowns in
> advance of instruction referencing being generally available.
>
> There's a decent amount of stuff under "Limitations" above that I can address,
> plus some performance profiling is still needed. I imagine the next best thing
> to do is add support for GlobalISel and some non-X86 backends (certain
> TargetInstrInfo hooks need to perform debug-info bookkeeping), which would make
> this all more appetising.
>
> [0] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139440.html
> [1] http://lists.llvm.org/pipermail/llvm-dev/2020-June/142368.html
> [2] https://github.com/jmorse/llvm-inst-ref-test-samples
> [3] https://reviews.llvm.org/D86813
> [4] https://reviews.llvm.org/D88898
> [5] https://github.com/jmorse/llvm-project/commit/0a702b967927d888bd222806252783359fc74d57
>
> --
> Thanks,
> Jeremy