[llvm-dev] [RFC] A value-tracking LiveDebugValues implementation

Fri Jun 26 11:37:01 PDT 2020

(… trying to add back CC’s that were unintentionally dropped in my earlier mail — apologies.)

> On Jun 26, 2020, at 5:11 AM, Jeremy Morse via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Hi Vedant,
> 
> On Thu, Jun 25, 2020 at 10:33 PM Vedant Kumar <vedant_kumar at apple.com> wrote:
>> Your earlier RFC sketches a plan for extending LiveDebugValues to handle instruction-referencing DBG_VALUEs. From http://lists.llvm.org/pipermail/llvm-dev/2020-February/139440.html --
>> 
>> > How then do we translate this new kind of machine location into
>> > DWARF/CodeView variable locations, which need to know which register
>> > to look in? The answer is: LiveDebugValues [3]. We already perform a
>> > dataflow analysis in LiveDebugValues of where values "go" after
>> > they're defined: we can use that to take "defining instructions" and
>> > determine register / stack locations. We would need to track values
>> > from the defining instruction up to the DBG_VALUE where that value
>> > becomes a variable location, after which it's no different from the
>> > LiveDebugValues analysis that we perform today.
>> 
>> I had interpreted this as meaning: introduce transfer functions to walk from the referenced instruction until we reach the referencing DBG_VALUE, then create a VarLoc. IIUC, at that point the unmodified LiveDebugValues algorithm could proceed as usual.
> 
> I think that was my meaning at the time; and it's still broadly true.
> Unfortunately, it turns out that there can be pretty much an arbitrary
> computation between the referenced instruction and the referencing
> DBG_INSTR_REF. I ran across several scenarios where a referenced
> instruction gets CSE'd / PRE'd / hoisted away from the referencing
> DBG_INSTR_REF, to the extent that it's on the other side of a loop.
> Once it's there, a dataflow algorithm is needed to work out whether
> the value is live through the loop, or whether it gets clobbered
> somewhere.
> 
> That's what the "machine value number" problem is solving. It could
> still plug into the VarLoc implementation if that makes an easier
> transition, where a DBG_INSTR_REF is decomposed into a DBG_VALUE when
> an instruction reference / value number / location are matched up. It
> would make debug use-before-defs harder to deal with, though.
> 
> (It's worth noting at this point that the machine value number work
> is, in a sense, doing LiveDebugVariables' job, by linking up a
> register def with a use. It's just that LiveDebugVariables has the
> register allocator to help it do so.)
> 
>> Reading this RFC, I get the sense that that has several drawbacks: it doesn’t handle situations where the referenced instruction is in a different basic block than its referencing DBG_VALUE (all the ‘ugly phi stuff’ is missing!), and it doesn’t track all the locations for a variable. Is that correct?
> 
> This might be my overly-defensive writing style working against me. I
> believe those problems are solved, but the solutions are more
> complicated than I wanted. For:
> 
>> it doesn’t handle situations where the referenced instruction is in a different basic block than its referencing DBG_VALUE
> 
> This is solved by the machine value number dataflow but, as mentioned,
> it requires a more complex lattice to infer where PHI values are
> generated,
> 
>> and it doesn’t track all the locations for a variable
> 
> I believe the implementation correctly picks the _value_ for every
> variable in each block, and picking a location is a final
> post-processing stage.

I think we were on the same page. I was simply contrasting my earlier understanding of how LiveDebugValues would handle instruction-referencing DBG_VALUEs (with a single, block-local forward scan) to the value numbering approach used in your current prototype.
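
To make that contrast concrete, here is a toy sketch of the cross-block case. It is not code from the prototype: ToyBlock, ValueID, computeLiveIns and makeLoopCFG are hypothetical names, the model tracks a single register, and each block either redefines that register or leaves it alone. A block-local forward scan starting at the def in block 0 never reaches a use in block 3, whereas a forward dataflow over value numbers can tell whether the def survives the loop or has to be replaced by a PHI value at the loop header.

```cpp
#include <vector>

// Toy model with hypothetical names; none of this is LLVM's API.
struct ToyBlock {
  std::vector<int> Preds;   // indices of predecessor blocks
  bool DefinesReg = false;  // true if the block writes the tracked register
};

// A value is named by the block that created it: either a real def in that
// block, or a PHI at the block's entry where differing values merge.
struct ValueID {
  int BlockIdx;
  bool IsPHI;
  ValueID() : BlockIdx(-1), IsPHI(false) {}  // "not computed yet"
  ValueID(int B, bool PHI) : BlockIdx(B), IsPHI(PHI) {}
  bool isSet() const { return BlockIdx >= 0; }
  bool operator==(const ValueID &O) const {
    return BlockIdx == O.BlockIdx && IsPHI == O.IsPHI;
  }
  bool operator!=(const ValueID &O) const { return !(*this == O); }
};

// Forward dataflow: compute the value held by the register on entry to each
// block. Blocks are assumed to be listed in reverse post-order, entry first.
std::vector<ValueID> computeLiveIns(const std::vector<ToyBlock> &CFG) {
  const int N = static_cast<int>(CFG.size());
  std::vector<ValueID> In(N), Out(N);
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (int B = 0; B < N; ++B) {
      // Join: keep a value only if every predecessor seen so far agrees.
      ValueID Joined;
      bool Disagree = false;
      for (int P : CFG[B].Preds) {
        if (!Out[P].isSet())
          continue;  // back edge not visited yet: stay optimistic
        if (!Joined.isSet())
          Joined = Out[P];
        else if (Joined != Out[P])
          Disagree = true;
      }
      // Once a PHI has been placed here it stays; that keeps the toy
      // simple and conservative.
      const ValueID PHIHere(B, true);
      if (Disagree || In[B] == PHIHere)
        Joined = PHIHere;
      const ValueID NewOut = CFG[B].DefinesReg ? ValueID(B, false) : Joined;
      if (Joined != In[B] || NewOut != Out[B])
        Changed = true;
      In[B] = Joined;
      Out[B] = NewOut;
    }
  }
  return In;
}

// Example CFG: 0 = entry (defines the register), 1 = loop header,
// 2 = loop body (optionally clobbers the register), 3 = exit (the use).
std::vector<ToyBlock> makeLoopCFG(bool BodyClobbers) {
  std::vector<ToyBlock> CFG(4);
  CFG[0].DefinesReg = true;
  CFG[1].Preds = {0, 2};
  CFG[2].Preds = {1};
  CFG[2].DefinesReg = BodyClobbers;
  CFG[3].Preds = {1};
  return CFG;
}
```

With makeLoopCFG(false), the live-in value of block 3 is the def from block 0, so a reference to that instruction can be resolved after the loop. With makeLoopCFG(true), the live-in becomes a PHI value at block 1 and the original def is no longer known to be available. The real lattice has to be richer than this, which is the extra complexity discussed above.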

>> One thing I’m not yet clear on: does your prototype essentially implement the minimal extension to LiveDebugValues needed to handle instruction-referencing DBG_VALUEs? If not, should it? Or should we take the opportunity to remove limitations from the current LiveDebugValues algorithm?
> 
> I have the code for this hanging around, but not in the tree I linked
> to. It establishes a fairly trivial mapping between:
> * Instructions that are referred to, and
> * The machine value numbers that instruction produces.
> I'd suggest that it's not useful without DBG_INSTR_REFs as we wouldn't
> be able to test it, and testing it means bringing in patches to
> generate or parse DBG_INSTR_REFs, which I'd rather split up.

I see. I think it makes sense to get the value numbering piece into a testable state first, and to prove that it’s at least as good as the current algorithm.
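
For reference, the mapping described above (referred-to instructions on one side, the machine value numbers they produce on the other) could look roughly like the sketch below. This is only an illustration under assumptions: ValueNum, InstrRefMap, recordDef and lookup are hypothetical names, and identifying a referenced instruction by a plain integer is an assumption of mine, not something the prototype defines.

```cpp
#include <map>

// Hypothetical value number: the block, instruction index and register
// where a value is defined. Not the prototype's actual representation.
struct ValueNum {
  unsigned Block;
  unsigned Inst;
  unsigned Reg;
};

// Maps an (assumed) integer instruction reference to the value that the
// referenced instruction produces.
class InstrRefMap {
  std::map<unsigned, ValueNum> Refs;

public:
  // Record that the instruction identified by InstrID defines value V.
  void recordDef(unsigned InstrID, const ValueNum &V) { Refs[InstrID] = V; }

  // Return the value for InstrID, or nullptr if no def has been recorded
  // (for example, a reference whose def has not been seen).
  const ValueNum *lookup(unsigned InstrID) const {
    std::map<unsigned, ValueNum>::const_iterator It = Refs.find(InstrID);
    return It == Refs.end() ? nullptr : &It->second;
  }
};
```

A DBG_INSTR_REF would then be resolved in two steps: look up the value number for the referenced instruction, then ask the value-number dataflow which register or stack slot, if any, currently holds that value.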

>> As for how to land this, I propose:
>> 
>> - Move LiveDebugValues.cpp to lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
>> - Split out common transfer functions and other helpers into lib/CodeGen/LiveDebugValues/Common.{h,cpp}
>> - Finally, add lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
>> - Add a cl::opt like -instr-ref-dbg-values to allow selecting a DBG_VALUE flavor
> 
> Sounds good to me,

I’m looking forward to this!

>> Yes, I’m in favor of landing work related to instruction-referencing DBG_VALUEs in small-ish independent patches, as they become ready. I’d very much like to keep the unmodified var-loc DBG_VALUE code paths intact and separated.
> 
> --
> Thanks,
> Jeremy

thanks,
vedant