[llvm-commits] [llvm] r128327 - in /llvm/trunk: include/llvm/CodeGen/MachineBasicBlock.h lib/CodeGen/AsmPrinter/DwarfDebug.cpp lib/CodeGen/AsmPrinter/DwarfDebug.h test/CodeGen/X86/dbg-merge-loc-entry.ll

Fri Jan 31 15:24:00 PST 2014

On Thu, Jan 30, 2014 at 8:25 AM, Alexey Samsonov <samsonov at google.com> wrote:
>
> On Wed, Dec 4, 2013 at 11:40 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>> On Wed, Dec 4, 2013 at 6:24 AM, Alexey Samsonov <samsonov at google.com>
>> wrote:
>> >
>> > On Tue, Dec 3, 2013 at 9:21 PM, David Blaikie <dblaikie at gmail.com>
>> > wrote:
>> >>
>> >> On Tue, Dec 3, 2013 at 8:27 AM, Alexey Samsonov <samsonov at google.com>
>> >> wrote:
>> >> >
>> >> > On Tue, Dec 3, 2013 at 11:16 AM, David Blaikie <dblaikie at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> On Mon, Dec 2, 2013 at 8:00 AM, Alexey Samsonov
>> >> >> <samsonov at google.com>
>> >> >> wrote:
>> >> >> > Hi David,
>> >> >> >
>> >> >> > Let me outline some ideas I got after reading this code as a
>> >> >> > sanity-check-request - I have too little knowledge of DBG_VALUE
>> >> >> > machine
>> >> >> > instructions, working with machine functions, etc.
>> >> >> >
>> >> >> > On Fri, Nov 1, 2013 at 1:46 AM, David Blaikie <dblaikie at gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> To document some of my own state on this issue (I'm happy to take
>> >> >> >> this
>> >> >> >> into another thread or bug to discuss if people think there's a
>> >> >> >> better
>> >> >> >> context for such discussion)...
>> >> >> >>
>> >> >> >> In DwarfDebug::beginFunction there is a bunch of code (that could
>> >> >> >> probably
>> >> >> >> be refactored into some named functions to make it more
>> >> >> >> documented)
>> >> >> >> but
>> >> >> >> here's the basic idea:
>> >> >> >>
>> >> >> >> For all MBB
>> >> >> >>   For all MI
>> >> >> >>     if it's a debug value
>> >> >> >>       record the location in the 'history' for this variable
>> >> >> >>       mark the register as in-use by a certain variable (what
>> >> >> >> happens
>> >> >> >> if
>> >> >> >> we end up with two variables in the same register? I don't think
>> >> >> >> this
>> >> >> >> code
>> >> >> >> can handle that)
>> >> >> >
>> >> >> >
>> >> >> > I agree - there are real-life cases when several variables refer
>> >> >> > to
>> >> >> > the
>> >> >> > same
>> >> >> > register - e.g. in ASan mode addresses of all local variables are
>> >> >> > defined as
>> >> >> > offsets from $rsp via DBG_VALUE instruction.
>> >> >> > So, instead of [Register]->[Debug variable] mapping we need to
>> >> >> > have
>> >> >> > [Register] -> [Vector of debug variables] here.
>> >> >>
>> >> >> Fair point. Though are we ever going to indirect through transient
>> >> >> registers? (eg: "variable is in deref(reg+offset) where 'reg' isn't
>> >> >> just the frame register live across the whole function)
>> >> >
>> >> >
>> >> > Don't know, we might. In ASan we build all code with use-after-return
>> >> > detection support:
>> >> >
>> >> > define i32 @_Z7my_funci(i32 %a) #0 {
>> >> > entry:
>> >> >   %MyAlloca = alloca [224 x i8], align 32
>> >> > //<--------------------------------------- actual stack frame
>> >> >   %0 = ptrtoint [224 x i8]* %MyAlloca to i64
>> >> >   %1 = load i32* @__asan_option_detect_stack_use_after_return
>> >> >   %2 = icmp ne i32 %1, 0
>> >> >   br i1 %2, label %3, label %5
>> >> >
>> >> > ; <label>:3                                       ; preds = %entry
>> >> >   %4 = call i64 @__asan_stack_malloc_2(i64 224, i64 %0)
>> >> > <-------------
>> >> > "fake stack frame" returned by ASan runtime
>> >> >   br label %5
>> >> >
>> >> > ; <label>:5                                       ; preds = %entry,
>> >> > %3
>> >> >   %6 = phi i64 [ %0, %entry ], [ %4, %3 ]
>> >> > <------------------------------
>> >> > pointer to actual or fake stack frame, where all the local variables
>> >> > are
>> >> > located.
>> >> > <...>
>> >> >   %9 = add i64 %6, 96
>> >> >   %10 = inttoptr i64 %9 to i32*
>> >> > <--------------------------------------------- user variable, given
>> >> > as
>> >> > offset into either actual or fake stack frame
>> >> > <...>
>> >> >   call void @llvm.dbg.declare(metadata !{i32* %10}, metadata !18)
>> >> >
>> >> > I can imaging the situation where "%6" won't be spilled on stack, and
>> >> > we'll
>> >> > have to report variable location
>> >> > as "reg storing %6 + offset" (I didn't verify this happens, though).
>> >>
>> >> OK - could be interesting. Be good to have a test case of this.
>> >>
>> >> >> If the only indirection is for stack cases, then we don't need to
>> >> >> worry about clobbers I think - well, not in the traditional sense
>> >> >> the
>> >> >> way this algorithm is written. We might need to worry about stack
>> >> >> reuse... which could get interesting. I'm not sure how that
>> >> >> (lifetime
>> >> >> markers) is modeled.
>> >> >>
>> >> >>
>> >> >> > The code in this branch also inserts a range-terminator
>> >> >> > instruction
>> >> >> > between
>> >> >> > a pair DBG_VALUE instrs for the same variable if instructions in
>> >> >> > this
>> >> >> > pair
>> >> >> > are from different basic blocks (I assume, this is done for the
>> >> >> > same
>> >> >> > purpose as inserting range-terminators in the second loop, and it
>> >> >> > can
>> >> >> > hurt
>> >> >> > us as well).
>> >> >>
>> >> >> Right - they're both parts of the same general algorithm to
>> >> >> terminate
>> >> >> at the end of BBs. There's no actual DBG_VALUE to terminate here,
>> >> >> it's
>> >> >> just implicit in this algorithm that the live ranges terminate at
>> >> >> the
>> >> >> BB boundary.
>> >> >>
>> >> >> >>     else
>> >> >> >>       if this instruction clobbers any register
>> >> >> >>         if this instruction is in the same MBB as the current
>> >> >> >> defining
>> >> >> >> debug value for the variable
>> >> >> >>           record this instruction in the history (it will act as
>> >> >> >> a
>> >> >> >> debug
>> >> >> >> loc range terminator) of the variable
>> >> >> >>
>> >> >> >> For all the variables
>> >> >> >>   If the end of the history is debug value (not a non-debug
>> >> >> >> instruction
>> >> >> >> which would terminate the range)
>> >> >> >>     If that debug value is in a different basic block than the
>> >> >> >> final
>> >> >> >> basic
>> >> >> >> block in the function
>> >> >> >>       add the last instruction in the debug values basic block to
>> >> >> >> the
>> >> >> >> history to terminate the range
>> >> >> >>
>> >> >> >
>> >> >> > We terminate ranges too early - some registers may not in fact be
>> >> >> > clobbered
>> >> >> > between two different basic blocks. However, looks like we *have*
>> >> >> > all
>> >> >> > the
>> >> >> > information we need - DBG_VALUE instructions and
>> >> >> > register-clobbering instructions - in DwarfDebug::beginFunction(),
>> >> >> > and
>> >> >> > should just properly compute location ranges. It is nontrivial,
>> >> >> > though,
>> >> >> > and
>> >> >> > I think the main purpose range-terminator instructions were added
>> >> >> > is
>> >> >> > to deal with confusion when MBB has more than one predecessor. We
>> >> >> > need
>> >> >> > to
>> >> >> > somehow implement "phi-nodes" for variable locations...
>> >> >> > Suppose we have:
>> >> >> >
>> >> >> > MBB1:
>> >> >> >   dbg_loc("x") = %rax
>> >> >> >   jmp MBB3
>> >> >> > MBB2:
>> >> >> >   dbg_loc("x") = %rdx
>> >> >> >   jmp MBB3
>> >> >> > MBB3:
>> >> >> >   // where is "x" now?
>> >> >> >
>> >> >> > currently we add dbg_loc("x") = undef at the end of MBB1 and MBB2
>> >> >> > for
>> >> >> > safety, but in general we should do a proper analysis.
>> >> >>
>> >> >> My thinking is that we shouldn't need to perform an analysis here -
>> >> >> we
>> >> >> should rely on the register allocator to tell us what it already
>> >> >> knows
>> >> >> about liveness. Recomputing this from an MBB analysis seems like
>> >> >> we'd
>> >> >> be redoing work that's already been performed.
>> >> >
>> >> >
>> >> > I don't understand what liveness information you mean here. Looking
>> >> > at
>> >> > MBB's, we know
>> >> > which instruction clobbers which (physical) registers, sets of
>> >> > live-ins
>> >> > for
>> >> > MBBs,
>> >> > and we know registers each dbg_value instruction refers to. Looks
>> >> > like
>> >> > this
>> >> > is the granularity
>> >> > we need - location ranges for each variable in general start and end
>> >> > at
>> >> > some
>> >> > instructions in the
>> >> > middle of machine basic blocks. That is, I can't imagine what kind of
>> >> > data
>> >> > we
>> >> > may need from register allocator that would simplify the location
>> >> > range
>> >> > building.
>> >>
>> >> If we have live-ins in the MBB, does that include rsp?
>> >
>> >
>> > They don't, but I think we may safely assume that $rsp won't change in
>> > the
>> > function.
>>
>> Right - that's one of my points. The frame pointer is beyond analysis
>> - it's a core assumption of the whole system. So we might need to bake
>> such assumptions into debug info as well, though I'm not sure that'll
>> be the case. It'd be nice if we could bake it into the register
>> allocator and have it exposed in some generic way.
>>
>> >> If so, then
>> >> sure - we can use that to avoid termination of variables trivially by
>> >> looking at the live-ins of the following block, right?
>> >
>> >
>> > No. Live-ins are related to execution order (predecessor/successor
>> > MBBs),
>> > while
>> > location ranges are related to the the order in which MBBs are written
>> > to an
>> > object file,
>> > so I doubt we may use live-ins here (I guess there are not many pieces
>> > in
>> > early stages
>> > of LLVM CodeGen that actually care about the ordering of MBBs, but we
>> > do).
>> >
>> > One more (theoretical example):
>> > MBB1:
>> >   dbg_loc("a") = %r8
>> >   jmp to MBB3
>> > MBB2 (live-in set contains %r8 and %r9):
>> >   <...>  <-------------------------------------- where is "a"?
>> > MBB3:
>> >   dbg_loc("a") = %r9
>> >   jmp to MBB2
>> >
>> > In this example we can't continue the location-range for "a" even though
>> > the
>> > register it's stored in is in the live-set of the
>> > following basic block.
>>
>> Ah, right - thanks for the example. Because we don't have a phi-like
>> structure that says "this live in is this variable".
>>
>> So the only thing for it would be to put new dbg_value intrinsics at
>> the beginning of every block, which seems painful (we've already had
>> some people complaining about explosions of debug_value intrinsics
>> that we emit/need today, adding more would be unfortunate). I'd
>> certainly like to avoid doing that for anything in a stack slot if we
>> could.
>>
>> > On the contrary, if we're clever enough and MBB2 can
>> > only be reached from MBB3, we may create a location-range
>> > that would specify that "a" is located in %r9 even in MBB2.
>> >
>> >
>> >>
>> >> So we wouldn't
>> >> need to do any graph walking, etc, which it sounded like you were
>> >> suggesting.
>> >>
>> >>
>> >> > Alternatively, we may move calculation of these ranges and insertion
>> >> > of
>> >> > location-range-terminatiors earlier, but IMO it's easier to this in
>> >> > AsmPrinter - after all location
>> >> > ranges rely on labels, so we don't want to do any transformation of
>> >> > MachineFunction once we
>> >> > calculated location ranges for variables.
>> >> >>
>> >> >>
>> >> >> Thing is, things like the frame pointer are sort of "beyond"
>> >> >> register
>> >> >> allocation - their blessing and immutability is an underlying
>> >> >> assumption of the whole system. So I'm wondering if the right thing
>> >> >> to
>> >> >> do here is to never model frame index and frame pointer reg+offsets
>> >> >> as
>> >> >> dbg_values and instead model them in the same way we model allocas
>> >> >> (indeed we could generalize the alloca handling, hopefully).
>> >> >
>> >> >
>> >> > IIUC, you don't like the following situation happening now:
>> >> >
>> >> > MOV64mr %RSP, 1, %noreg, 48, %noreg, %R8<kill>; mem:ST8[FixedStack6]
>> >> > DBG_VALUE %RSP, 48, !"c"; line no:2
>> >> >
>> >> > Why build a dbg_value and location range for the address "c"
>> >> > (deref($rsp
>> >> > +
>> >> > 48)), if this
>> >> > address is spilled into a slot with frame index 6?
>> >>
>> >> I don't have a strong idea about how this should or shouldn't work -
>> >> honestly I know very little about this code. So don't take my
>> >> suggestions as dictates, I'm just tossing around ideas.
>> >>
>> >> But yes, I'm suggesting it might be possible to address the issue by
>> >> avoiding DBG_VALUEs in that case.
>> >>
>> >> > Maybe it's possible to
>> >> > avoid this,
>> >> > but this means we'll have to dive into CodeGen, find the place where
>> >> > it
>> >> > generates
>> >> > mentioned dbg_value instruction, and instead of generating tell that
>> >> > location of "c" can
>> >> > be found using the given stack slot.
>> >>
>> >> That's what I had in mind, yes - go find where we special case allocas
>> >> and put their locations into some side table and use that machinery.
>> >
>> >
>> > Yep, allocas are put in a special table in MachineModuleInfo early at
>> > SelectionDAG building,
>> > and are handled separately (basically, we map each variable to a stack
>> > slot).
>> > But if we're going to use this for (a subset of) varaibles currently
>> > tracked
>> > with dbg_value,
>> > we'll probably have to distinguish "stack slot storing the variable" and
>> > "stack slot storing the address of
>> > the variable". Ew.
>>
>> For the non-trivial-pass-by-value cases you mean? Yes, that's one
>> unfortunate part of this.
>>
>> >> Alternatively, if we're going to support reg offsets and we have
>> >> liveness/live-in information that allows us to power all this stuff
>> >> well, perhaps we could swing the other way and remove the existing
>> >> alloca special case and track those with MI DBG_VALUEs too once we've
>> >> addressed the problem. That would be a nice cleanup.
>> >
>> >
>> > I see, so if we're going to make dbg_value -> location-range
>> > transformation
>> > work
>> > "properly", we may later remove the side table for allocas. I'm still
>> > not
>> > convinced we
>> > should gather additional information from CodeGen to build location
>> > ranges
>> > in DwarfDebug,
>> > so I guess I'll try to do a proof-of-concept rewrite of the current
>> > algorithm and see if
>> > it doesn't break anything and fix existing ASan problems (and Clang test
>> > you
>> > provided).
>>
>> I'm not sure exactly what change to the algorithm you have in mind -
>> if you're just talking about the algorithm that takes the current
>> debug_value MI intrinsics and produces live ranges, I'm not sure how
>> you're going to improve it with only those current debug_value
>> intrinsics. I hope it's not too much work for you to do, given how
>> much we're kind of "in the dark" here.
>
>
> OK, proof of concept turned out to be happy 400-lines-of-code monster:
> http://llvm-reviews.chandlerc.com/D2658
> Sorry.
>
> Still, I think the analysis there can allow us to be much more precise in
> generating
> location ranges for variables w/o modifying the CodeGen at all, and help us
> in general case (i.e.
> not only for variables that are addressed using the frame pointer).
>
> I've verified that my patch works for the trivial cases of debug info broken
> with ASan.
> Do you have an example which is currently broken for plain Clang?
>

The one Dave posted from the beginning of the thread?

-eric

> --
> Alexey Samsonov, MSK