[llvm-dev] [RFC] A value-tracking LiveDebugValues implementation

Vedant Kumar via llvm-dev llvm-dev at lists.llvm.org
Thu Jun 25 14:33:25 PDT 2020


Thanks for sharing this RFC, Jeremy.

> On Jun 18, 2020, at 8:35 AM, Jeremy Morse <jeremy.morse.llvm at gmail.com> wrote:
> 
> Hi debuginfo-cabal,
> 
> tl;dr: Let's please consider using a new implementation of LiveDebugValues
> that produces richer information, might be slightly faster, but mostly will
> support the instruction referencing and value tracking paradigm from my RFC [0]
> rather than the location tracking that LiveDebugValues does today.
> 
> In that RFC, the main motivator is to treat variable locations a bit more like
> SSA, having variable location metadata refer to the instructions that define a
> value, and track the value moving around registers from there. This
> prototype [1] is an implementation of the "track" mechanism, where we produce
> value numbers for every 'def' in a function, and track them being moved around.
> Variable locations are then resolved and extended by following where the value
> numbers go.

Your earlier RFC sketches a plan for extending LiveDebugValues to handle instruction-referencing DBG_VALUEs. From http://lists.llvm.org/pipermail/llvm-dev/2020-February/139440.html <http://lists.llvm.org/pipermail/llvm-dev/2020-February/139440.html> --

>> How then do we translate this new kind of machine location into
>> DWARF/CodeView variable locations, which need to know which register
>> to look in? The answer is: LiveDebugValues [3]. We already perform a
>> dataflow analysis in LiveDebugValues of where values "go" after
>> they're defined: we can use that to take "defining instructions" and
>> determine register / stack locations. We would need to track values
>> from the defining instruction up to the DBG_VALUE where that value
>> becomes a variable location, after which it's no different from the
>> LiveDebugValues analysis that we perform today.

I had interpreted this as meaning: introduce transfer functions to walk from the referenced instruction until we reach the referencing DBG_VALUE, then create a VarLoc. IIUC, at that point the unmodified LiveDebugValues algorithm could proceed as usual.

Reading this RFC, I get the sense that that has several drawbacks: it doesn’t handle situations where the referenced instruction is in a different basic block than its referencing DBG_VALUE (all the ‘ugly phi stuff’ is missing!), and it doesn’t track all the locations for a variable. Is that correct?

> This matches, to an extent, the "Register Data Flow" / RDF graph analysis
> that recently moved from the Hexagon backend to CodeGen. I had a look, and
> it appears RDF computes much richer information than LiveDebugValues needs
> (specifically, all uses of values) and allocates for each value number, which
> is quite a bit of overhead. I decided it'd be simpler to make a new
> LiveDebugValues implementation, as we've had performance problems with
> LiveDebugValues in the past, and we don't need to track additional metadata
> for each value, only number them uniquely.
> 
> There are few independent reasons to adopt this new implementation, and I'm
> self-conscious about dropping a code-bomb on people. My questions for this
> email are:
> * Does the new implementation (described below) make sense, and

I think your results speak for themselves :).

> * If so, how would I go about landing this?

One thing I’m not yet clear on: does your prototype essentially implement the minimal extension to LiveDebugValues needed to handle instruction-referencing DBG_VALUEs? If not, should it?, or should we take the opportunity to remove limitations from the current LiveDebugValues algorithm?

As for how to land this, I propose:

- Move LiveDebugValues.cpp to lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
- Split out common transfer functions and other helpers into lib/CodeGen/LiveDebugValues/Common.{h,cpp}
- Finally, add lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
- Add a cl::opt like -instr-ref-dbg-values to allow selecting a DBG_VALUE flavor

> We could wait until there's a complete suite of patches for the instruction
> referencing work (I've got a working prototype, but not production-quality).
> On the other hand, this new LiveDebugValues implementation is probably ready
> for review now, works independently, and is very much the hardest part of
> instruction referencing, so it'd be nice to get it under-way earlier. My
> preferred outcome would be adding it as a parallel implementation to the
> current LiveDebugValues, enabled by a runtime switch, then start to land
> the rest of the instruction referencing work around it. This reduces the
> scope for further bit-rot, and as Vedant mentioned [2] this would all need
> to sit under a runtime-switch for a transition period anyway.

Yes, I’m in favor of landing work related to instruction-referencing DBG_VALUEs in small-ish independent patches, as they become ready. I’d very much like to keep the unmodified var-loc DBG_VALUE code paths intact and separated.

> 
> # Overview
> 
> Today's LiveDebugValues (explained in the file docu-comment [3]) concerns
> itself with variable /locations/ only, not the values that they contain. The
> implementation is very good at its job: it detects when a location is
> invalidated (i.e. clobbered), control flow joins are a simple matter of "Do
> the predecessors agree on the location?", and not much more work is needed.
> 
> Switching to tracking variable values is more complicated. Following the
> original proposal, for this pseudo-MIR code sequence:
> 
>    $rax = somedef
>    DBG_INSTR_REF reference-to-somedef
>    $rbx = COPY killed $rax
>    ; Variable location is $rax, then moves to $rbx
> 
> We need to track that:
> a) $rax contains a value defined by the "somedef" instruction,
> b) The DBG_INSTR_REF needs a location for the value somedef defines.
> c) That value from "somedef" is copied to rbx,
> 
> The new LiveDebugValues implementation doesn't have DBG_INSTR_REF yet, so
> I've written it to work with DBG_VALUEs too, to prove that it works just as
> well as the current implementation. It deals with code sequences such as:
> 
>    $rax = somedef
>    DBG_VALUE $rax
>    $rbx = COPY killed $rax
>    ; Variable location is $rax, then moves to $rbx
> 
> In which the same tracking has to occur, but the DBG_VALUE "reads" a value
> number from $rax. As the variable locations are determined by the value number
> that the variable has, it doesn't matter whether that was specified by a
> DBG_VALUE or DBG_INSTR_REF.  The tracking is sufficiently good to produce an
> identical binary to the output of current LiveDebugValues (details below).
> 
> There are some unrelated benefits to consider: because the new implementation
> tracks all locations that a value may be in, we can handle the following
> problem, that the current implementation cannot:
> 
>  entry:
>    $rax = somedef
>    DBG_VALUE $rax
>    SPILL_TO_STACK %somestackslot, $rax
>  loop:
>    [... some code...]
>    $rax = RESTORE_FROM_STACK %somestackslot
>    do_things_with_rax
>    brcc loop
>  exit:
>    ret
> 
> Here, there's a variable spilt to the stack, that's temporarily restored
> in the loop. At the end of the "loop" block, the value has two locations:
> $rax and the stack slot. The current LiveDebugValues always follows the
> restore from the stack: it decides the variable is on the stack at the end
> of "entry", and in $rax at the end of "loop". Because these locations don't
> agree on entry to "loop", the location in "loop" is dropped completely.
> Having LiveDebugValues ignore or prefer the restore just changes what kind of
> circumstances locations get dropped in: the underlying problem is that the
> value can be in multiple places, but we can track only one of those places.
> 
> The new implementation has enough information to solve this, as it tracks all
> values in all machine locations. Unfortunately it doesn't make much difference
> to the location statistics for clang RelWithDebInfo builds. An additional
> benefit is that we would be better placed to produce saner locations lists [4].
> 
> In terms of performance, the new implementation takes on average just as long
> as the current implementation. A full stage2 build of clang may be 0.1% slower,
> but it's in the noise on my machine. All the algorithms are much more localised
> than the current implementation (see below), and I reckon there's still scope
> for further optimisation as it's easier to make decisions such as "Don't try to
> join these variable locations in this block, they're immediately re-stated by
> DBG_VALUEs".
> 
> # Implementation
> 
> The new implementation strictly does more work than the old one: it tracks every
> value in the function, not just those that are associated with a variable
> location. It also has to detect places where control flow merges and produces
> PHI values, where the old implementation could just check that the live-out
> variable location in predecessors agreed. To address this, I've separated the
> implementation into two smaller problems, that can be solved almost
> independently and with much greater locality.
> 
> Both of the problems below are sort-of-like SSA construction (they have a set
> of defs, and a large set of uses), but instead of calculating new PHI
> placements, we have to infer where PHIs _were_ placed.
> 
> Problem 1: Machine value numbering. Every def in the function is given a
> number, which is then condensed into a transfer function for each block,
> specifying which registers (or spill locs) contain which values. It's
> genuinely just a map of machine location to value. Using the same sort of
> dataflow model that LiveDebugValues currently uses, values are propagated to
> compute the live-ins and live-out values of each block, with one significant
> modification: we assign a new value number when control flow merges different
> values, essentially re-computing PHI nodes. This leads to some ungraceful
> code: see "Ugly PHI stuff" below. This problem is kept localised by only
> considering the registers and spill slots that are accessed, making the
> problem size proportionate to the product of the max working set of the
> function and the number of blocks. The number of variables used is irrelevant.
> 
> Problem 2: Variable value assignment. By using the solution for the first
> problem, of "what value number does each machine location contain" at each
> program point, every DBG_VALUE can be interpreted as assigning a value to a
> variable, not a location. This gets turned into a transfer function too, and
> we can then perform another dataflow analysis, propagating what _value_ a
> variable has in a block, and computing the set of live-in variable values for
> each block. A specific machine location can be picked after this problem is
> solved, possibly from several options if multiple locations have the same
> value. This isn't a completely clean abstraction: predecessors with different
> variable values that are in the same location should be propagated, but the
> new implementation has to find and validate a PHI value number that represents
> those merged values (see "More ugly PHI stuff" below).
> 
> I'm keeping problem 2 localised by propagating variable values for one lexical
> scope at a time, and only over the blocks that are in that scope. As well as
> localising by turning the main problem into a large number of small problems,
> this reduces the amount of work too. The current implementation filters out
> variable locations that are out of lexical scope every time a block is
> processed, which now no longer needs to happen. Variables in the outermost
> scopes won't be re-processed when LiveDebugValues goes around an inner-scope
> loop several times; likewise variables in inner scopes won't get re-analysed
> when an outer scope variable needs another pass through the function. This also
> means that all lexical scopes that cover a single block (or zero) require no
> dataflow analysis, which should attenuate the effects of heavy inlining a bit.
> 
> The solution to these two problems (block live-in variable values and block
> live-in machine location values) can then be combined in a final step through
> the function, producing DBG_VALUEs for live-ins and DBG_VALUEs for any
> location transfers within a block.
> 
> # Testing and validation
> 
> The new LiveDebugValues implementation produces an identical AMD64 clang-3.4
> binary to the old implementation, based on d481e59863a (Feb 2020). Well,
> almost -- it proved too difficult to completely emulate the old version,
> instead with the following changes [5] applied to d481e59863a, the two
> implementations produce an identical binary:
> * Applied D67500 [6],
> * Ignore identity copies,
> * Force any newly created DBG_VALUEs to be inserted according to the order
>   in which DebugVariables appear in the input function. So that they're
>   consistent between the two implementations.
> * Disabling all tracking of stack spills.
> 
> Ignoring the final point, I think this would demonstrate that the two
> implementations are equivalent. Stack spills are tricky: because value
> transfers within blocks are left until the final step of the new
> implementation, it's hard to work out in advance when old LiveDebugValues
> would have transferred a location. It was easier to just compare binaries
> where spills hadn't been followed.
> 
> IMO, the non-spill (the vast majority) locations being identical should be
> sufficient to show that the overall algorithm works, and dealing with spills
> is a minor extension.
> 
> The tests pass, except for the ones covered by the limitations section, and
> a few DBG_VALUE-change-order issues.
> 
> # Limitations and unimplemented features
> 
> There are a couple of things that I haven't gotten around to implementing
> yet, for which the tests fail:
> 
> Overlapping fragments: this is trivial to do, but at the back of my job queue,
> 
> Entry values: much more interesting. As Djordje has pointed out elsewhere,
> LiveDebugValues has been assuming that every DBG_VALUE assigns a new value
> to a variable [7], which means if a DBG_VALUE re-states a variable location
> when it doesn't have to, we risk dropping an entry value. This will be much
> easier in this new model (I haven't had the time to do it yet), because the
> machine value numbering means that each entry value has a specific number,
> and we can easily identify when it's no longer available. This should just
> be a matter of the in-block transfers tracking code identifying that a
> variable should have the entry value; it not being any machine location
> currently; and issuing an appropriate DBG_VALUE.
> 
> # Ugly PHI stuff
> 
> Above, I said that this problem is a lot closer to SSA construction than other
> classic compiler problems. However, it's actually even weirder: instead of
> having a set of uses and defs and computing the locations of PHIs: every
> instruction with a DebugLoc in lexical scope is a use, and the PHIs are already
> placed just not recorded. Thus the problems above are more about recovering
> the locations of PHIs after they're eliminated.
> 
> This adds a lot of complexity to the lattice structure used for each variable
> location compared to the current LiveDebugValues: there used to only ever be
> one candidate location for each variable at every control flow merge, and over
> the course of dataflow analysis this was found to be either true or false.
> However because we're iteratively discovering new PHI locations as we explore,
> the new implementation's lattice consists of:
> * A single candidate def (\top),
> * A PHI value at every control flow merge on the path between between the def
>   and the current block,
> * A PHI value at the current block (\bot).
> Each of which potentially needs one iteration of dataflow to see whether the
> join should produce that value.
> 
> This only truly needs exploring at loop heads, and the maximum number of
> iterations required is the depth of loops in the function, which is the same as
> the old LiveDebugValues implementation. However because all the PHI values
> have to be discovered, that number of iterations is required more often than
> the old implementation.
> 
> I wrote more about the lattice in the docu-comment in the new implementation
> [8]. I can produce a worked example too, but this email has gotten a bit
> wordy already.
> 
> # More ugly PHI stuff
> 
> The section above holds true for the variable values analysis too, where
> instead of defs we have variable assignments at DBG_VALUEs. Rather than
> discovering where PHIs happen and propagating that information, variable
> value analysis has to do the same to discover what machine-value PHI would be
> needed to merge a variable location from all predecessors, and then examine
> whether such a PHI is actually available, and has the right inputs. (This is
> annoyingly fiddly).
> 
> # Summary
> 
> This new implementation goes fast enough, is accurate enough, and appears
> to be sound. That isn't a reason to replace the current implementation; but
> eventually the instruction reference work would make use of this, so if
> possible I'd like to start review on it, if that's acceptable to people.
> 
> Seeing how you've just read a lot of words, a motivating reminder: this path
> can lead, slowly, to there being no debug instructions for variable locations!
> And as described in my RFC sitrep-email, it can be more accurate than
> location only tracking too.
> 
> [0] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139440.html
> [1] Branch is here:
> https://github.com/jmorse/llvm-project/commits/ldvars-separation , tip
> is 11450def5bc592321ece2a64ea2a8cabbc614c39 at time of sending. I
> could open a PR containing the changes, but github mucks them up; best
> to just look at the bare file [8].
> [2] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139469.html
> [3] https://github.com/llvm/llvm-project/blob/3626eba11f2367da88ccb0280a34731243278034/llvm/lib/CodeGen/LiveDebugValues.cpp#L9
> [4] https://bugs.llvm.org/show_bug.cgi?id=42834
> [5] https://github.com/jmorse/llvm-project/commit/720876fec2a5113d52977d00e0e1f7d526b4c1f0
> four commits from here
> [6] https://reviews.llvm.org/D67500 , I probably should push to land this
> [7] It's not a bad view, and definitely something I've been preaching,
> [8] https://github.com/jmorse/llvm-project/blob/11450def5bc592321ece2a64ea2a8cabbc614c39/llvm/lib/CodeGen/LiveDebugValues.cpp#L60
> 
> --
> Thanks for reading this essay,
> Jeremy


vedant
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200625/104a0664/attachment.html>


More information about the llvm-dev mailing list