[llvm-commits] [llvm] r128327 - in /llvm/trunk: include/llvm/CodeGen/MachineBasicBlock.h lib/CodeGen/AsmPrinter/DwarfDebug.cpp lib/CodeGen/AsmPrinter/DwarfDebug.h test/CodeGen/X86/dbg-merge-loc-entry.ll

Thu Jan 30 08:25:09 PST 2014

On Wed, Dec 4, 2013 at 11:40 PM, David Blaikie <dblaikie at gmail.com> wrote:

> On Wed, Dec 4, 2013 at 6:24 AM, Alexey Samsonov <samsonov at google.com>
> wrote:
> >
> > On Tue, Dec 3, 2013 at 9:21 PM, David Blaikie <dblaikie at gmail.com>
> wrote:
> >>
> >> On Tue, Dec 3, 2013 at 8:27 AM, Alexey Samsonov <samsonov at google.com>
> >> wrote:
> >> >
> >> > On Tue, Dec 3, 2013 at 11:16 AM, David Blaikie <dblaikie at gmail.com>
> >> > wrote:
> >> >>
> >> >> On Mon, Dec 2, 2013 at 8:00 AM, Alexey Samsonov <samsonov at google.com
> >
> >> >> wrote:
> >> >> > Hi David,
> >> >> >
> >> >> > Let me outline some ideas I got after reading this code as a
> >> >> > sanity-check-request - I have too little knowledge of DBG_VALUE
> >> >> > machine
> >> >> > instructions, working with machine functions, etc.
> >> >> >
> >> >> > On Fri, Nov 1, 2013 at 1:46 AM, David Blaikie <dblaikie at gmail.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> To document some of my own state on this issue (I'm happy to take
> >> >> >> this
> >> >> >> into another thread or bug to discuss if people think there's a
> >> >> >> better
> >> >> >> context for such discussion)...
> >> >> >>
> >> >> >> In DwarfDebug::beginFunction there is a bunch of code (that could
> >> >> >> probably
> >> >> >> be refactored into some named functions to make it more
> documented)
> >> >> >> but
> >> >> >> here's the basic idea:
> >> >> >>
> >> >> >> For all MBB
> >> >> >>   For all MI
> >> >> >>     if it's a debug value
> >> >> >>       record the location in the 'history' for this variable
> >> >> >>       mark the register as in-use by a certain variable (what
> >> >> >> happens
> >> >> >> if
> >> >> >> we end up with two variables in the same register? I don't think
> >> >> >> this
> >> >> >> code
> >> >> >> can handle that)
> >> >> >
> >> >> >
> >> >> > I agree - there are real-life cases when several variables refer to
> >> >> > the
> >> >> > same
> >> >> > register - e.g. in ASan mode addresses of all local variables are
> >> >> > defined as
> >> >> > offsets from $rsp via DBG_VALUE instruction.
> >> >> > So, instead of [Register]->[Debug variable] mapping we need to have
> >> >> > [Register] -> [Vector of debug variables] here.
> >> >>
> >> >> Fair point. Though are we ever going to indirect through transient
> >> >> registers? (eg: "variable is in deref(reg+offset) where 'reg' isn't
> >> >> just the frame register live across the whole function)
> >> >
> >> >
> >> > Don't know, we might. In ASan we build all code with use-after-return
> >> > detection support:
> >> >
> >> > define i32 @_Z7my_funci(i32 %a) #0 {
> >> > entry:
> >> >   %MyAlloca = alloca [224 x i8], align 32
> >> > //<--------------------------------------- actual stack frame
> >> >   %0 = ptrtoint [224 x i8]* %MyAlloca to i64
> >> >   %1 = load i32* @__asan_option_detect_stack_use_after_return
> >> >   %2 = icmp ne i32 %1, 0
> >> >   br i1 %2, label %3, label %5
> >> >
> >> > ; <label>:3                                       ; preds = %entry
> >> >   %4 = call i64 @__asan_stack_malloc_2(i64 224, i64 %0)
> <-------------
> >> > "fake stack frame" returned by ASan runtime
> >> >   br label %5
> >> >
> >> > ; <label>:5                                       ; preds = %entry, %3
> >> >   %6 = phi i64 [ %0, %entry ], [ %4, %3 ]
> >> > <------------------------------
> >> > pointer to actual or fake stack frame, where all the local variables
> are
> >> > located.
> >> > <...>
> >> >   %9 = add i64 %6, 96
> >> >   %10 = inttoptr i64 %9 to i32*
> >> > <--------------------------------------------- user variable, given as
> >> > offset into either actual or fake stack frame
> >> > <...>
> >> >   call void @llvm.dbg.declare(metadata !{i32* %10}, metadata !18)
> >> >
> >> > I can imaging the situation where "%6" won't be spilled on stack, and
> >> > we'll
> >> > have to report variable location
> >> > as "reg storing %6 + offset" (I didn't verify this happens, though).
> >>
> >> OK - could be interesting. Be good to have a test case of this.
> >>
> >> >> If the only indirection is for stack cases, then we don't need to
> >> >> worry about clobbers I think - well, not in the traditional sense the
> >> >> way this algorithm is written. We might need to worry about stack
> >> >> reuse... which could get interesting. I'm not sure how that (lifetime
> >> >> markers) is modeled.
> >> >>
> >> >>
> >> >> > The code in this branch also inserts a range-terminator instruction
> >> >> > between
> >> >> > a pair DBG_VALUE instrs for the same variable if instructions in
> this
> >> >> > pair
> >> >> > are from different basic blocks (I assume, this is done for the
> same
> >> >> > purpose as inserting range-terminators in the second loop, and it
> can
> >> >> > hurt
> >> >> > us as well).
> >> >>
> >> >> Right - they're both parts of the same general algorithm to terminate
> >> >> at the end of BBs. There's no actual DBG_VALUE to terminate here,
> it's
> >> >> just implicit in this algorithm that the live ranges terminate at the
> >> >> BB boundary.
> >> >>
> >> >> >>     else
> >> >> >>       if this instruction clobbers any register
> >> >> >>         if this instruction is in the same MBB as the current
> >> >> >> defining
> >> >> >> debug value for the variable
> >> >> >>           record this instruction in the history (it will act as a
> >> >> >> debug
> >> >> >> loc range terminator) of the variable
> >> >> >>
> >> >> >> For all the variables
> >> >> >>   If the end of the history is debug value (not a non-debug
> >> >> >> instruction
> >> >> >> which would terminate the range)
> >> >> >>     If that debug value is in a different basic block than the
> final
> >> >> >> basic
> >> >> >> block in the function
> >> >> >>       add the last instruction in the debug values basic block to
> >> >> >> the
> >> >> >> history to terminate the range
> >> >> >>
> >> >> >
> >> >> > We terminate ranges too early - some registers may not in fact be
> >> >> > clobbered
> >> >> > between two different basic blocks. However, looks like we *have*
> all
> >> >> > the
> >> >> > information we need - DBG_VALUE instructions and
> >> >> > register-clobbering instructions - in DwarfDebug::beginFunction(),
> >> >> > and
> >> >> > should just properly compute location ranges. It is nontrivial,
> >> >> > though,
> >> >> > and
> >> >> > I think the main purpose range-terminator instructions were added
> is
> >> >> > to deal with confusion when MBB has more than one predecessor. We
> >> >> > need
> >> >> > to
> >> >> > somehow implement "phi-nodes" for variable locations...
> >> >> > Suppose we have:
> >> >> >
> >> >> > MBB1:
> >> >> >   dbg_loc("x") = %rax
> >> >> >   jmp MBB3
> >> >> > MBB2:
> >> >> >   dbg_loc("x") = %rdx
> >> >> >   jmp MBB3
> >> >> > MBB3:
> >> >> >   // where is "x" now?
> >> >> >
> >> >> > currently we add dbg_loc("x") = undef at the end of MBB1 and MBB2
> for
> >> >> > safety, but in general we should do a proper analysis.
> >> >>
> >> >> My thinking is that we shouldn't need to perform an analysis here -
> we
> >> >> should rely on the register allocator to tell us what it already
> knows
> >> >> about liveness. Recomputing this from an MBB analysis seems like we'd
> >> >> be redoing work that's already been performed.
> >> >
> >> >
> >> > I don't understand what liveness information you mean here. Looking at
> >> > MBB's, we know
> >> > which instruction clobbers which (physical) registers, sets of
> live-ins
> >> > for
> >> > MBBs,
> >> > and we know registers each dbg_value instruction refers to. Looks like
> >> > this
> >> > is the granularity
> >> > we need - location ranges for each variable in general start and end
> at
> >> > some
> >> > instructions in the
> >> > middle of machine basic blocks. That is, I can't imagine what kind of
> >> > data
> >> > we
> >> > may need from register allocator that would simplify the location
> range
> >> > building.
> >>
> >> If we have live-ins in the MBB, does that include rsp?
> >
> >
> > They don't, but I think we may safely assume that $rsp won't change in
> the
> > function.
>
> Right - that's one of my points. The frame pointer is beyond analysis
> - it's a core assumption of the whole system. So we might need to bake
> such assumptions into debug info as well, though I'm not sure that'll
> be the case. It'd be nice if we could bake it into the register
> allocator and have it exposed in some generic way.
>
> >> If so, then
> >> sure - we can use that to avoid termination of variables trivially by
> >> looking at the live-ins of the following block, right?
> >
> >
> > No. Live-ins are related to execution order (predecessor/successor MBBs),
> > while
> > location ranges are related to the the order in which MBBs are written
> to an
> > object file,
> > so I doubt we may use live-ins here (I guess there are not many pieces in
> > early stages
> > of LLVM CodeGen that actually care about the ordering of MBBs, but we
> do).
> >
> > One more (theoretical example):
> > MBB1:
> >   dbg_loc("a") = %r8
> >   jmp to MBB3
> > MBB2 (live-in set contains %r8 and %r9):
> >   <...>  <-------------------------------------- where is "a"?
> > MBB3:
> >   dbg_loc("a") = %r9
> >   jmp to MBB2
> >
> > In this example we can't continue the location-range for "a" even though
> the
> > register it's stored in is in the live-set of the
> > following basic block.
>
> Ah, right - thanks for the example. Because we don't have a phi-like
> structure that says "this live in is this variable".
>
> So the only thing for it would be to put new dbg_value intrinsics at
> the beginning of every block, which seems painful (we've already had
> some people complaining about explosions of debug_value intrinsics
> that we emit/need today, adding more would be unfortunate). I'd
> certainly like to avoid doing that for anything in a stack slot if we
> could.
>
> > On the contrary, if we're clever enough and MBB2 can
> > only be reached from MBB3, we may create a location-range
> > that would specify that "a" is located in %r9 even in MBB2.
> >
> >
> >>
> >> So we wouldn't
> >> need to do any graph walking, etc, which it sounded like you were
> >> suggesting.
> >>
> >>
> >> > Alternatively, we may move calculation of these ranges and insertion
> of
> >> > location-range-terminatiors earlier, but IMO it's easier to this in
> >> > AsmPrinter - after all location
> >> > ranges rely on labels, so we don't want to do any transformation of
> >> > MachineFunction once we
> >> > calculated location ranges for variables.
> >> >>
> >> >>
> >> >> Thing is, things like the frame pointer are sort of "beyond" register
> >> >> allocation - their blessing and immutability is an underlying
> >> >> assumption of the whole system. So I'm wondering if the right thing
> to
> >> >> do here is to never model frame index and frame pointer reg+offsets
> as
> >> >> dbg_values and instead model them in the same way we model allocas
> >> >> (indeed we could generalize the alloca handling, hopefully).
> >> >
> >> >
> >> > IIUC, you don't like the following situation happening now:
> >> >
> >> > MOV64mr %RSP, 1, %noreg, 48, %noreg, %R8<kill>; mem:ST8[FixedStack6]
> >> > DBG_VALUE %RSP, 48, !"c"; line no:2
> >> >
> >> > Why build a dbg_value and location range for the address "c"
> (deref($rsp
> >> > +
> >> > 48)), if this
> >> > address is spilled into a slot with frame index 6?
> >>
> >> I don't have a strong idea about how this should or shouldn't work -
> >> honestly I know very little about this code. So don't take my
> >> suggestions as dictates, I'm just tossing around ideas.
> >>
> >> But yes, I'm suggesting it might be possible to address the issue by
> >> avoiding DBG_VALUEs in that case.
> >>
> >> > Maybe it's possible to
> >> > avoid this,
> >> > but this means we'll have to dive into CodeGen, find the place where
> it
> >> > generates
> >> > mentioned dbg_value instruction, and instead of generating tell that
> >> > location of "c" can
> >> > be found using the given stack slot.
> >>
> >> That's what I had in mind, yes - go find where we special case allocas
> >> and put their locations into some side table and use that machinery.
> >
> >
> > Yep, allocas are put in a special table in MachineModuleInfo early at
> > SelectionDAG building,
> > and are handled separately (basically, we map each variable to a stack
> > slot).
> > But if we're going to use this for (a subset of) varaibles currently
> tracked
> > with dbg_value,
> > we'll probably have to distinguish "stack slot storing the variable" and
> > "stack slot storing the address of
> > the variable". Ew.
>
> For the non-trivial-pass-by-value cases you mean? Yes, that's one
> unfortunate part of this.
>
> >> Alternatively, if we're going to support reg offsets and we have
> >> liveness/live-in information that allows us to power all this stuff
> >> well, perhaps we could swing the other way and remove the existing
> >> alloca special case and track those with MI DBG_VALUEs too once we've
> >> addressed the problem. That would be a nice cleanup.
> >
> >
> > I see, so if we're going to make dbg_value -> location-range
> transformation
> > work
> > "properly", we may later remove the side table for allocas. I'm still not
> > convinced we
> > should gather additional information from CodeGen to build location
> ranges
> > in DwarfDebug,
> > so I guess I'll try to do a proof-of-concept rewrite of the current
> > algorithm and see if
> > it doesn't break anything and fix existing ASan problems (and Clang test
> you
> > provided).
>
> I'm not sure exactly what change to the algorithm you have in mind -
> if you're just talking about the algorithm that takes the current
> debug_value MI intrinsics and produces live ranges, I'm not sure how
> you're going to improve it with only those current debug_value
> intrinsics. I hope it's not too much work for you to do, given how
> much we're kind of "in the dark" here.
>

OK, proof of concept turned out to be happy 400-lines-of-code monster:
http://llvm-reviews.chandlerc.com/D2658
Sorry.

Still, I think the analysis there can allow us to be much more precise in
generating
location ranges for variables w/o modifying the CodeGen at all, and help us
in general case (i.e.
not only for variables that are addressed using the frame pointer).

I've verified that my patch works for the trivial cases of debug info
broken with ASan.
Do you have an example which is currently broken for plain Clang?

-- 
Alexey Samsonov, MSK
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140130/a0bd29ab/attachment.html>