[llvm-commits] [llvm] r128327 - in /llvm/trunk: include/llvm/CodeGen/MachineBasicBlock.h lib/CodeGen/AsmPrinter/DwarfDebug.cpp lib/CodeGen/AsmPrinter/DwarfDebug.h test/CodeGen/X86/dbg-merge-loc-entry.ll

Wed Dec 4 11:40:47 PST 2013

On Wed, Dec 4, 2013 at 6:24 AM, Alexey Samsonov <samsonov at google.com> wrote:
>
> On Tue, Dec 3, 2013 at 9:21 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>> On Tue, Dec 3, 2013 at 8:27 AM, Alexey Samsonov <samsonov at google.com>
>> wrote:
>> >
>> > On Tue, Dec 3, 2013 at 11:16 AM, David Blaikie <dblaikie at gmail.com>
>> > wrote:
>> >>
>> >> On Mon, Dec 2, 2013 at 8:00 AM, Alexey Samsonov <samsonov at google.com>
>> >> wrote:
>> >> > Hi David,
>> >> >
>> >> > Let me outline some ideas I got after reading this code as a
>> >> > sanity-check-request - I have too little knowledge of DBG_VALUE
>> >> > machine
>> >> > instructions, working with machine functions, etc.
>> >> >
>> >> > On Fri, Nov 1, 2013 at 1:46 AM, David Blaikie <dblaikie at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> To document some of my own state on this issue (I'm happy to take
>> >> >> this
>> >> >> into another thread or bug to discuss if people think there's a
>> >> >> better
>> >> >> context for such discussion)...
>> >> >>
>> >> >> In DwarfDebug::beginFunction there is a bunch of code (that could
>> >> >> probably
>> >> >> be refactored into some named functions to make it more documented)
>> >> >> but
>> >> >> here's the basic idea:
>> >> >>
>> >> >> For all MBB
>> >> >>   For all MI
>> >> >>     if it's a debug value
>> >> >>       record the location in the 'history' for this variable
>> >> >>       mark the register as in-use by a certain variable (what
>> >> >> happens
>> >> >> if
>> >> >> we end up with two variables in the same register? I don't think
>> >> >> this
>> >> >> code
>> >> >> can handle that)
>> >> >
>> >> >
>> >> > I agree - there are real-life cases when several variables refer to
>> >> > the
>> >> > same
>> >> > register - e.g. in ASan mode addresses of all local variables are
>> >> > defined as
>> >> > offsets from $rsp via DBG_VALUE instruction.
>> >> > So, instead of [Register]->[Debug variable] mapping we need to have
>> >> > [Register] -> [Vector of debug variables] here.
>> >>
>> >> Fair point. Though are we ever going to indirect through transient
>> >> registers? (eg: "variable is in deref(reg+offset) where 'reg' isn't
>> >> just the frame register live across the whole function)
>> >
>> >
>> > Don't know, we might. In ASan we build all code with use-after-return
>> > detection support:
>> >
>> > define i32 @_Z7my_funci(i32 %a) #0 {
>> > entry:
>> >   %MyAlloca = alloca [224 x i8], align 32
>> > //<--------------------------------------- actual stack frame
>> >   %0 = ptrtoint [224 x i8]* %MyAlloca to i64
>> >   %1 = load i32* @__asan_option_detect_stack_use_after_return
>> >   %2 = icmp ne i32 %1, 0
>> >   br i1 %2, label %3, label %5
>> >
>> > ; <label>:3                                       ; preds = %entry
>> >   %4 = call i64 @__asan_stack_malloc_2(i64 224, i64 %0)   <-------------
>> > "fake stack frame" returned by ASan runtime
>> >   br label %5
>> >
>> > ; <label>:5                                       ; preds = %entry, %3
>> >   %6 = phi i64 [ %0, %entry ], [ %4, %3 ]
>> > <------------------------------
>> > pointer to actual or fake stack frame, where all the local variables are
>> > located.
>> > <...>
>> >   %9 = add i64 %6, 96
>> >   %10 = inttoptr i64 %9 to i32*
>> > <--------------------------------------------- user variable, given as
>> > offset into either actual or fake stack frame
>> > <...>
>> >   call void @llvm.dbg.declare(metadata !{i32* %10}, metadata !18)
>> >
>> > I can imaging the situation where "%6" won't be spilled on stack, and
>> > we'll
>> > have to report variable location
>> > as "reg storing %6 + offset" (I didn't verify this happens, though).
>>
>> OK - could be interesting. Be good to have a test case of this.
>>
>> >> If the only indirection is for stack cases, then we don't need to
>> >> worry about clobbers I think - well, not in the traditional sense the
>> >> way this algorithm is written. We might need to worry about stack
>> >> reuse... which could get interesting. I'm not sure how that (lifetime
>> >> markers) is modeled.
>> >>
>> >>
>> >> > The code in this branch also inserts a range-terminator instruction
>> >> > between
>> >> > a pair DBG_VALUE instrs for the same variable if instructions in this
>> >> > pair
>> >> > are from different basic blocks (I assume, this is done for the same
>> >> > purpose as inserting range-terminators in the second loop, and it can
>> >> > hurt
>> >> > us as well).
>> >>
>> >> Right - they're both parts of the same general algorithm to terminate
>> >> at the end of BBs. There's no actual DBG_VALUE to terminate here, it's
>> >> just implicit in this algorithm that the live ranges terminate at the
>> >> BB boundary.
>> >>
>> >> >>     else
>> >> >>       if this instruction clobbers any register
>> >> >>         if this instruction is in the same MBB as the current
>> >> >> defining
>> >> >> debug value for the variable
>> >> >>           record this instruction in the history (it will act as a
>> >> >> debug
>> >> >> loc range terminator) of the variable
>> >> >>
>> >> >> For all the variables
>> >> >>   If the end of the history is debug value (not a non-debug
>> >> >> instruction
>> >> >> which would terminate the range)
>> >> >>     If that debug value is in a different basic block than the final
>> >> >> basic
>> >> >> block in the function
>> >> >>       add the last instruction in the debug values basic block to
>> >> >> the
>> >> >> history to terminate the range
>> >> >>
>> >> >
>> >> > We terminate ranges too early - some registers may not in fact be
>> >> > clobbered
>> >> > between two different basic blocks. However, looks like we *have* all
>> >> > the
>> >> > information we need - DBG_VALUE instructions and
>> >> > register-clobbering instructions - in DwarfDebug::beginFunction(),
>> >> > and
>> >> > should just properly compute location ranges. It is nontrivial,
>> >> > though,
>> >> > and
>> >> > I think the main purpose range-terminator instructions were added is
>> >> > to deal with confusion when MBB has more than one predecessor. We
>> >> > need
>> >> > to
>> >> > somehow implement "phi-nodes" for variable locations...
>> >> > Suppose we have:
>> >> >
>> >> > MBB1:
>> >> >   dbg_loc("x") = %rax
>> >> >   jmp MBB3
>> >> > MBB2:
>> >> >   dbg_loc("x") = %rdx
>> >> >   jmp MBB3
>> >> > MBB3:
>> >> >   // where is "x" now?
>> >> >
>> >> > currently we add dbg_loc("x") = undef at the end of MBB1 and MBB2 for
>> >> > safety, but in general we should do a proper analysis.
>> >>
>> >> My thinking is that we shouldn't need to perform an analysis here - we
>> >> should rely on the register allocator to tell us what it already knows
>> >> about liveness. Recomputing this from an MBB analysis seems like we'd
>> >> be redoing work that's already been performed.
>> >
>> >
>> > I don't understand what liveness information you mean here. Looking at
>> > MBB's, we know
>> > which instruction clobbers which (physical) registers, sets of live-ins
>> > for
>> > MBBs,
>> > and we know registers each dbg_value instruction refers to. Looks like
>> > this
>> > is the granularity
>> > we need - location ranges for each variable in general start and end at
>> > some
>> > instructions in the
>> > middle of machine basic blocks. That is, I can't imagine what kind of
>> > data
>> > we
>> > may need from register allocator that would simplify the location range
>> > building.
>>
>> If we have live-ins in the MBB, does that include rsp?
>
>
> They don't, but I think we may safely assume that $rsp won't change in the
> function.

Right - that's one of my points. The frame pointer is beyond analysis
- it's a core assumption of the whole system. So we might need to bake
such assumptions into debug info as well, though I'm not sure that'll
be the case. It'd be nice if we could bake it into the register
allocator and have it exposed in some generic way.

>> If so, then
>> sure - we can use that to avoid termination of variables trivially by
>> looking at the live-ins of the following block, right?
>
>
> No. Live-ins are related to execution order (predecessor/successor MBBs),
> while
> location ranges are related to the the order in which MBBs are written to an
> object file,
> so I doubt we may use live-ins here (I guess there are not many pieces in
> early stages
> of LLVM CodeGen that actually care about the ordering of MBBs, but we do).
>
> One more (theoretical example):
> MBB1:
>   dbg_loc("a") = %r8
>   jmp to MBB3
> MBB2 (live-in set contains %r8 and %r9):
>   <...>  <-------------------------------------- where is "a"?
> MBB3:
>   dbg_loc("a") = %r9
>   jmp to MBB2
>
> In this example we can't continue the location-range for "a" even though the
> register it's stored in is in the live-set of the
> following basic block.

Ah, right - thanks for the example. Because we don't have a phi-like
structure that says "this live in is this variable".

So the only thing for it would be to put new dbg_value intrinsics at
the beginning of every block, which seems painful (we've already had
some people complaining about explosions of debug_value intrinsics
that we emit/need today, adding more would be unfortunate). I'd
certainly like to avoid doing that for anything in a stack slot if we
could.

> On the contrary, if we're clever enough and MBB2 can
> only be reached from MBB3, we may create a location-range
> that would specify that "a" is located in %r9 even in MBB2.
>
>
>>
>> So we wouldn't
>> need to do any graph walking, etc, which it sounded like you were
>> suggesting.
>>
>>
>> > Alternatively, we may move calculation of these ranges and insertion of
>> > location-range-terminatiors earlier, but IMO it's easier to this in
>> > AsmPrinter - after all location
>> > ranges rely on labels, so we don't want to do any transformation of
>> > MachineFunction once we
>> > calculated location ranges for variables.
>> >>
>> >>
>> >> Thing is, things like the frame pointer are sort of "beyond" register
>> >> allocation - their blessing and immutability is an underlying
>> >> assumption of the whole system. So I'm wondering if the right thing to
>> >> do here is to never model frame index and frame pointer reg+offsets as
>> >> dbg_values and instead model them in the same way we model allocas
>> >> (indeed we could generalize the alloca handling, hopefully).
>> >
>> >
>> > IIUC, you don't like the following situation happening now:
>> >
>> > MOV64mr %RSP, 1, %noreg, 48, %noreg, %R8<kill>; mem:ST8[FixedStack6]
>> > DBG_VALUE %RSP, 48, !"c"; line no:2
>> >
>> > Why build a dbg_value and location range for the address "c" (deref($rsp
>> > +
>> > 48)), if this
>> > address is spilled into a slot with frame index 6?
>>
>> I don't have a strong idea about how this should or shouldn't work -
>> honestly I know very little about this code. So don't take my
>> suggestions as dictates, I'm just tossing around ideas.
>>
>> But yes, I'm suggesting it might be possible to address the issue by
>> avoiding DBG_VALUEs in that case.
>>
>> > Maybe it's possible to
>> > avoid this,
>> > but this means we'll have to dive into CodeGen, find the place where it
>> > generates
>> > mentioned dbg_value instruction, and instead of generating tell that
>> > location of "c" can
>> > be found using the given stack slot.
>>
>> That's what I had in mind, yes - go find where we special case allocas
>> and put their locations into some side table and use that machinery.
>
>
> Yep, allocas are put in a special table in MachineModuleInfo early at
> SelectionDAG building,
> and are handled separately (basically, we map each variable to a stack
> slot).
> But if we're going to use this for (a subset of) varaibles currently tracked
> with dbg_value,
> we'll probably have to distinguish "stack slot storing the variable" and
> "stack slot storing the address of
> the variable". Ew.

For the non-trivial-pass-by-value cases you mean? Yes, that's one
unfortunate part of this.

>> Alternatively, if we're going to support reg offsets and we have
>> liveness/live-in information that allows us to power all this stuff
>> well, perhaps we could swing the other way and remove the existing
>> alloca special case and track those with MI DBG_VALUEs too once we've
>> addressed the problem. That would be a nice cleanup.
>
>
> I see, so if we're going to make dbg_value -> location-range transformation
> work
> "properly", we may later remove the side table for allocas. I'm still not
> convinced we
> should gather additional information from CodeGen to build location ranges
> in DwarfDebug,
> so I guess I'll try to do a proof-of-concept rewrite of the current
> algorithm and see if
> it doesn't break anything and fix existing ASan problems (and Clang test you
> provided).

I'm not sure exactly what change to the algorithm you have in mind -
if you're just talking about the algorithm that takes the current
debug_value MI intrinsics and produces live ranges, I'm not sure how
you're going to improve it with only those current debug_value
intrinsics. I hope it's not too much work for you to do, given how
much we're kind of "in the dark" here.