[Lldb-commits] [PATCH] Profile Assembly Until Ret Instruction

Tong Shen endlessroad at google.com
Tue Aug 19 16:47:45 PDT 2014


Yes indeed that breaks stepping.

GDB uses eh_frame for frame 0.
In gdb, if you use 'si' to step over the call instruction:

               call label
=> .label: pop ebx

GDB will fail to unwind.

But if you use 's', it will step over these 2 instructions together.

So I'm guessing GDB does similar tricks.


On Tue, Aug 19, 2014 at 4:29 PM, Greg Clayton <gclayton at apple.com> wrote:

> The EH frame doesn't track the PIC bump stuff and that can/will hose up
> stepping.
>
> > On Aug 19, 2014, at 4:22 PM, Jason Molenda <jmolenda at apple.com> wrote:
> >
> > Hi Tong, my message was a little rambling.  Let's be specific.
> >
> > We are changing lldb to trust eh_frame instructions on the
> currently-executing aka 0th frame.
> >
> > In practice, gcc and clang eh_frame both describe the prologue, so this
> is OK.
> >
> > Old gcc and clang eh_frame do not describe the epilogue.  So we need to
> add a pass for i386/x86_64 (at least) to augment the eh_frame-sourced
> unwind instructions.  I don't know if it would be best to augment eh_frame
> UnwindPlans when we create them in DWARFCallFrameInfo or if it would be
> better to do it lazily when we are actually using the unwind instructions
> in RegisterContextLLDB (probably RegisterContextLLDB like you were doing).
> We should only do it once for a given function, of course.
> >
> > I think it would be cleanest if the augmentation function lived in the
> UnwindAssembly class.  But I haven't looked how easy it is to get an
> UnwindAssembly object where we need it.
> >
> >
> > Thanks for taking this on.  It will be interesting to try living
> entirely off eh_frame and see how that works for all the
> architectures/environments lldb supports.
> >
> > I worry a little that we're depending on the generous eh_frame from
> clang/gcc and if we try to run on icc (Intel's compiler) or something like
> that, we may have no prologue instructions and stepping will work very
> poorly.  But we'll cross that bridge when we get to it.
> >
> >
> >
> >> On Aug 15, 2014, at 8:07 PM, Jason Molenda <jmolenda at apple.com> wrote:
> >>
> >> Hi Tong, sorry for the delay in replying.
> >>
> >> I have a couple thoughts about the patch.  First, the change in
> RegisterContextLLDB::GetFullUnwindPlanForFrame() forces the use of eh_frame
> unwind instructions ("UnwindPlanAtCallSite" - which normally means the
> eh_frame unwind instructions) for the currently-executing aka zeroth
> frame.  We've talked about this before, but it's worth noting that this
> patch includes that change.
> >>
> >> There's still the problem of detecting how *asynchronous* those
> eh_frame unwind instructions are.  For instance, what do you get for an
> i386 program that does
> >>
> >> #include <stdio.h>
> >> int main()
> >> {
> >> puts ("HI");
> >> }
> >>
> >> Most codegen will use a sequence like
> >>
> >> call LNextInstruction
> >> .LNextInstruction
> >> pop ebx
> >>
> >> this call & pop sequence is establishing the "pic base"; the program
> will then use that address to find the "HI" constant data.  If you compile
> this -fomit-frame-pointer, so we have to use the stack pointer to find the
> CFA, do the eh_frame instructions describe this?
> >>
> >> It's a bit of an extreme example but it's one of those tricky cases
> where asynchronous ("accurate at every instruction") unwind instructions
> and synchronous ("accurate at places where we can throw an exception, or a
> callee can throw an exception") unwind instructions are different.
> >>
> >>
> >> I would use behaves_like_zeroth_frame instead of if (IsFrameZero())
> because you can have a frame in the middle of the stack which was the
> zeroth frame when an asynchronous signal came in -- in which case, the
> "callee" stack frame will be sigtramp.
> >>
> >>
> >> You'd want to update the UnwindLogMsgVerbose() text, of course.
> >>
> >>
> >> What your DWARFCallFrameInfo::PatchUnwindPlanForX86() function is doing
> is assuming that the unwind plan fails to include an epilogue description,
> and stepping through all the instructions in the function looking for the epilogue.
> >>
> >> DWARFCallFrameInfo doesn't seem like the right place for this.  There's
> an assumption that the instructions came from eh_frame and that they are
> incomplete.  It seems like it would more naturally live in the
> UnwindAssembly plugin and it would have a name like
> AugmentIncompleteUnwindPlanWithEpilogue or something like that.
> >>
> >> What if the CFI already does describe the epilogue?  I imagine we'll
> just end up with a doubling of UnwindPlan Rows that describe the epilogue
> instructions.
> >>
> >> What if we have a mid-function epilogue?  I've never seen gcc/clang
> generate these for x86, but it's possible.  It's a common code sequence on
> arm/arm64.  You can see a messy bit of code in
> UnwindAssemblyInstEmulation::GetNonCallSiteUnwindPlanFromAssembly which
> handles these -- saving the UnwindPlan's unwind instructions when we see
> the beginning of an epilogue, and once the epilogue is complete, restoring
> the unwind instructions.
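The save-and-restore idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code in UnwindAssemblyInstEmulation: the `Op` enum, `Cfa`, `Row` and `BuildRows` are hypothetical names, the function is assumed to use rbp as its frame pointer, and "pop %rbp" is assumed to be the start of every epilogue.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction kinds; a real unwinder would decode or emulate machine code.
enum class Op { PushRbp, MovRspToRbp, PopRbp, Ret, Other };

constexpr int kRbp = 6; // DWARF register numbers on x86_64
constexpr int kRsp = 7;

struct Cfa {
    int reg;
    int offset;
    bool operator==(const Cfa &o) const { return reg == o.reg && offset == o.offset; }
};

// "Before executing instruction `index`, the CFA is cfa.reg + cfa.offset."
struct Row {
    std::size_t index;
    Cfa cfa;
};

std::vector<Row> BuildRows(const std::vector<Op> &insns) {
    Cfa cur{kRsp, 8};          // on entry only the return address is on the stack
    Cfa saved = cur;           // rule that was in force before the epilogue began
    bool in_epilogue = false;
    std::vector<Row> rows;
    rows.push_back(Row{0, cur});
    for (std::size_t i = 0; i < insns.size(); ++i) {
        Cfa next = cur;
        switch (insns[i]) {
        case Op::PushRbp:
            if (cur.reg == kRsp) next.offset += 8; // rsp moved down by 8
            break;
        case Op::MovRspToRbp:
            next.reg = kRbp;                        // same address, tracked via rbp now
            break;
        case Op::PopRbp:
            assert(cur.reg == kRbp);                // toy model: frame-pointer functions only
            saved = cur;                            // remember the mid-function rule
            in_epilogue = true;
            next = Cfa{kRsp, 8};                    // back to the entry layout
            break;
        case Op::Ret:
            if (in_epilogue) {                      // code after an early return uses
                next = saved;                       // the rule from before the epilogue
                in_epilogue = false;
            }
            break;
        case Op::Other:
            break;
        }
        // Emit a row only when the rule changes and another instruction follows.
        if (!(next == cur) && i + 1 < insns.size())
            rows.push_back(Row{i + 1, next});
        cur = next;
    }
    return rows;
}
```

For a body with one early return followed by more code and a final epilogue, the rows switch to rsp+8 on each `ret` and back to rbp+16 for the instructions after the first one.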
> >>
> >>
> >> I'm not opposed to the patch - but it does make the assumption that
> we're going to use eh_frame for the currently executing function and that
> the eh_frame instructions do not include a description of the epilogue.
> (and that there is only one epilogue in the function).  Mostly I want to
> call all of those aspects out so we're clear what we're talking about
> here.  Let's clean it up a bit, put it in and see how it goes.
> >>
> >> J
> >>
> >>
> >>> On Aug 14, 2014, at 6:31 PM, Tong Shen <endlessroad at google.com> wrote:
> >>>
> >>> Hi Jason,
> >>>
> >>> Turns out we still need CFI for frame 0 in certain situations...
> >>>
> >>> A possible approach is to disassemble machine code, and manually
> adjust CFI for frame 0. For example, if we see "pop ebp; => ret", we set
> cfa to [esp]; if we see "call next-insn; => pop %ebp", we set cfa_offset+=4.
> >>>
> >>> Patch attached, now it just implements adjustment for "pop ebp; ret".
> >>>
> >>> If you think this approach is OK, I will go ahead and add other
> tricks(i386 pc relative addressing, more styles of epilogue, etc).
> >>>
> >>> Thank you for your time!
> >>>
> >>>
> >>> On Thu, Jul 31, 2014 at 12:50 PM, Tong Shen <endlessroad at google.com>
> wrote:
> >>> I think gdb's rationale for using CFI for leaf function is:
> >>> - gcc always generates CFI for the prologue, so at function entry, we know
> the correct CFA;
> >>> - any stack pointer altering operation after that(mid-function &
> epilogue), we can recognize and handle them.
> >>> So basically, it assumes 2, hacks its way through 3 & 4, and pretends
> we are at 5.
> >>> Number of hacks we need seems to be small in x86 world, so this
> tradition is still here.
> >>>
> >>> Here's what gdb does for epilogue: normally when you run 'n', it will
> run one instruction at a time till the next line/different stack id. But when
> it sees "pop %rbp; ret", it won't step into these instructions. Instead it
> will execute past them directly.
> >>> I didn't experiment with x86 pc-relative addressing; but I guess it
> will also recognize and execute past this pattern directly.
> >>>
> >>> So for compiler generated functions, what we do now with assembly
> parser can be done with CFI + those gdb hacks.
> >>> And for hand-written assembly, I think CFI is almost always precise at
> instruction level. In this case, utilizing CFI instead of assembly parser
> will be a big help.
> >>>
> >>> So maybe we can apply those hacks, and trust CFI only for x86 & x86_64
> targets?
> >>>
> >>>
> >>> On Thu, Jul 31, 2014 at 12:02 AM, Jason Molenda <jmolenda at apple.com>
> wrote:
> >>> I think we could think of five levels of eh_frame information:
> >>>
> >>>
> >>> 1 unwind instructions at exception throw locations & locations where a
> callee may throw an exception
> >>>
> >>> 2 unwind instructions that describe the prologue
> >>>
> >>> 3 unwind instructions that describe the epilogue at the end of the
> function
> >>>
> >>> 4 unwind instructions that describe mid-function epilogues (I see
> these on arm all the time, don't see them on x86 with compiler generated
> code - but we don't use eh_frame on arm at Apple, I'm just mentioning it
> for completeness)
> >>>
> >>> 5 unwind instructions that describe any changes mid-function needed to
> unwind at all instructions ("asynchronous unwind information")
> >>>
> >>>
> >>> The eh_frame section only guarantees #1.  gcc and clang always do #1
> and #2.  Modern gcc's do #3.  I don't know if gcc would do #4 on arm but
> it's not important, I just mention it for completeness.  And no one does #5
> (as far as I know), even in the DWARF debug_frame section.
> >>>
> >>> I think it may be possible to detect if an eh_frame entry fulfills #3
> by looking if the CFA definition on the last row is the same as the initial
> CFA definition.  But I'm not sure how a debugger could use heuristics to
> determine much else.
> >>>
> >>>
> >>> In fact, detecting #3 may be the easiest thing to detect.  I'm not
> sure if the debugger could really detect #2 except maybe if the function
> had a standard prologue (push rbp, mov rsp rbp) and the eh_frame didn't
> describe the effects of these instructions, the debugger could know that
> the eh_frame does not describe the prologue.
> >>>
> >>>
> >>>
> >>>
> >>>> On Jul 30, 2014, at 6:58 PM, Tong Shen <endlessroad at google.com>
> wrote:
> >>>>
> >>>> Ah I understand now.
> >>>>
> >>>> Now the prologue seems always included in CFI for gcc & clang; and newer
> gcc includes epilogue as well.
> >>>> Maybe we can detect and use them when they are available?
> >>>>
> >>>>
> >>>> On Wed, Jul 30, 2014 at 6:44 PM, Jason Molenda <jmolenda at apple.com>
> wrote:
> >>>> Ah, it looks like gcc changed since I last looked at its eh_frame
> output.
> >>>>
> >>>> It's not a bug -- the eh_frame unwind instructions only need to be
> accurate at instructions where an exception can be thrown, or where a
> callee function can throw an exception.  There's no requirement to include
> prologue or epilogue instructions in the eh_frame.
> >>>>
> >>>> And unfortunately from lldb's perspective, when we see eh_frame we'll
> never know how descriptive it is.  If it's old-gcc or clang, it won't
> include epilogue instructions.  If it's from another compiler, it may not
> include any prologue/epilogue instructions at all.
> >>>>
> >>>> Maybe we could look over the UnwindPlan rows and see if the CFA
> definition of the last row matches the initial row's CFA definition.  That
> would show that the epilogue is described.  Unless it is a tail-call (aka
> noreturn) function - in which case the stack is never restored.
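A minimal sketch of that check, assuming a simplified row type rather than lldb's real UnwindPlan classes (`Row`, `EpilogueLooksDescribed`, `ClangRows` and `GccRows` are hypothetical names). The row data is transcribed from the dwarfdump output quoted further down in this thread, where r7 is rsp and r6 is rbp.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for one unwind row:
// "from this function offset on, CFA = value of cfa_reg + cfa_offset".
struct Row {
    unsigned offset;  // byte offset from the start of the function
    int cfa_reg;      // DWARF register number (6 = rbp, 7 = rsp on x86_64)
    int cfa_offset;   // constant added to the register value
};

// Heuristic: if the last row defines the CFA the same way as the first row,
// the epilogue is probably described.
bool EpilogueLooksDescribed(const std::vector<Row> &rows) {
    assert(!rows.empty());
    if (rows.size() < 2)
        return false;  // a single row is inconclusive, treat it as "not shown"
    const Row &first = rows.front();
    const Row &last = rows.back();
    return first.cfa_reg == last.cfa_reg && first.cfa_offset == last.cfa_offset;
}

// clang 3.5.0 FDE for f(): 0x400530, 0x400531, 0x400534
std::vector<Row> ClangRows() {
    return {{0x0, 7, 8}, {0x1, 7, 16}, {0x4, 6, 16}};
}

// gcc 4.8.2 FDE for f(): 0x40052d, 0x40052e, 0x400531, 0x400541
std::vector<Row> GccRows() {
    return {{0x0, 7, 8}, {0x1, 7, 16}, {0x4, 6, 16}, {0x14, 7, 8}};
}
```

With this data the gcc rows pass the check and the clang rows do not. As noted above, a function that never restores the stack would fail the check even when its eh_frame is complete, so a negative result only means "not proven".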
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>> On Jul 30, 2014, at 6:32 PM, Tong Shen <endlessroad at google.com>
> wrote:
> >>>>>
> >>>>> GCC seems to generate a row for epilogue.
> >>>>> Do you think this is a clang bug, or at least a discrepancy between
> clang & gcc?
> >>>>>
> >>>>> Source:
> >>>>> int f() {
> >>>>>     puts("HI\n");
> >>>>>     return 5;
> >>>>> }
> >>>>>
> >>>>> Compile option: only -g
> >>>>>
> >>>>> gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)
> >>>>> clang version 3.5.0 (213114)
> >>>>>
> >>>>> Env: Ubuntu 14.04, x86_64
> >>>>>
> >>>>> dwarfdump -F of clang binary:
> >>>>> <    2><0x00400530:0x00400559><f><fde offset 0x00000088 length:
> 0x0000001c><eh aug data len 0x0>
> >>>>>       0x00400530: <off cfa=08(r7) > <off r16=-8(cfa) >
> >>>>>       0x00400531: <off cfa=16(r7) > <off r6=-16(cfa) > <off
> r16=-8(cfa) >
> >>>>>       0x00400534: <off cfa=16(r6) > <off r6=-16(cfa) > <off
> r16=-8(cfa) >
> >>>>>
> >>>>> dwarfdump -F of gcc binary:
> >>>>> <    1><0x0040052d:0x00400542><f><fde offset 0x00000070 length:
> 0x0000001c><eh aug data len 0x0>
> >>>>>       0x0040052d: <off cfa=08(r7) > <off r16=-8(cfa) >
> >>>>>       0x0040052e: <off cfa=16(r7) > <off r6=-16(cfa) > <off
> r16=-8(cfa) >
> >>>>>       0x00400531: <off cfa=16(r6) > <off r6=-16(cfa) > <off
> r16=-8(cfa) >
> >>>>>       0x00400541: <off cfa=08(r7) > <off r6=-16(cfa) > <off
> r16=-8(cfa) >
> >>>>>
> >>>>>
> >>>>> On Wed, Jul 30, 2014 at 5:43 PM, Jason Molenda <jmolenda at apple.com>
> wrote:
> >>>>> I'm open to trying to trust eh_frame at frame 0 for x86_64.  The
> lack of epilogue descriptions in eh_frame is the biggest problem here.
> >>>>>
> >>>>> When you "step" or "next" in the debugger, the debugger instruction
> steps across the source line until it gets to the next source line.  Every
> time it stops after an instruction step, it confirms that it is (1) between
> the start and end pc values for the source line, and (2) that the "stack
> id" (start address of the function + CFA address) is the same.  If it stops
> and the stack id has changed, for a "next" command, it will backtrace one
> stack frame to see if it stepped into a function.  If so, it sets a
> breakpoint on the return address and continues.
> >>>>>
> >>>>> If you switch lldb to prefer eh_frame instructions for x86_64, e.g.
> >>>>>
> >>>>> Index: source/Plugins/Process/Utility/RegisterContextLLDB.cpp
> >>>>> ===================================================================
> >>>>> --- source/Plugins/Process/Utility/RegisterContextLLDB.cpp
> (revision 214344)
> >>>>> +++ source/Plugins/Process/Utility/RegisterContextLLDB.cpp
> (working copy)
> >>>>> @@ -791,6 +791,22 @@
> >>>>>        }
> >>>>>    }
> >>>>>
> >>>>> +    // For x86_64 debugging, let's try using the eh_frame
> instructions even if this is the currently
> >>>>> +    // executing function (frame zero).
> >>>>> +    Target *target = exe_ctx.GetTargetPtr();
> >>>>> +    if (target
> >>>>> +        && (target->GetArchitecture().GetCore() ==
> ArchSpec::eCore_x86_64_x86_64h
> >>>>> +            || target->GetArchitecture().GetCore() ==
> ArchSpec::eCore_x86_64_x86_64))
> >>>>> +    {
> >>>>> +        unwind_plan_sp = func_unwinders_sp->GetUnwindPlanAtCallSite
> (m_current_offset_backed_up_one);
> >>>>> +        int valid_offset = -1;
> >>>>> +        if (IsUnwindPlanValidForCurrentPC(unwind_plan_sp,
> valid_offset))
> >>>>> +        {
> >>>>> +            UnwindLogMsgVerbose ("frame uses %s for full
> UnwindPlan, preferred over assembly profiling on x86_64",
> unwind_plan_sp->GetSourceName().GetCString());
> >>>>> +            return unwind_plan_sp;
> >>>>> +        }
> >>>>> +    }
> >>>>> +
> >>>>>    // Typically the NonCallSite UnwindPlan is the unwind created by
> inspecting the assembly language instructions
> >>>>>    if (behaves_like_zeroth_frame)
> >>>>>    {
> >>>>>
> >>>>>
> >>>>> you'll find that you have to "next" twice to step out of a
> function.  Why?  With a simple function like:
> >>>>>
> >>>>> * thread #1: tid = 0xaf31e, 0x0000000100000eb9 a.out`foo + 25 at
> a.c:5, queue = 'com.apple.main-thread', stop reason = step over
> >>>>>   #0: 0x0000000100000eb9 a.out`foo + 25 at a.c:5
> >>>>>  2    int foo ()
> >>>>>  3    {
> >>>>>  4        puts("HI");
> >>>>> -> 5        return 5;
> >>>>>  6    }
> >>>>>  7
> >>>>>  8    int bar ()
> >>>>> (lldb) disass
> >>>>> a.out`foo at a.c:3:
> >>>>>  0x100000ea0:  pushq  %rbp
> >>>>>  0x100000ea1:  movq   %rsp, %rbp
> >>>>>  0x100000ea4:  subq   $0x10, %rsp
> >>>>>  0x100000ea8:  leaq   0x6b(%rip), %rdi          ; "HI"
> >>>>>  0x100000eaf:  callq  0x100000efa               ; symbol stub for:
> puts
> >>>>>  0x100000eb4:  movl   $0x5, %ecx
> >>>>> -> 0x100000eb9:  movl   %eax, -0x4(%rbp)
> >>>>>  0x100000ebc:  movl   %ecx, %eax
> >>>>>  0x100000ebe:  addq   $0x10, %rsp
> >>>>>  0x100000ec2:  popq   %rbp
> >>>>>  0x100000ec3:  retq
> >>>>>
> >>>>>
> >>>>> if you do "next" lldb will instruction step, comparing the stack ID
> at every stop, until it gets to 0x100000ec3 at which point the stack ID
> will change.  The CFA address (which the eh_frame tells us is rbp+16) just
> changed to the caller's CFA address because we're about to return.  The
> eh_frame instructions really need to tell us that the CFA is now rsp+8 at
> 0x100000ec3.
> >>>>>
> >>>>> The end result is that you need to "next" twice to step out of a
> function.
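The arithmetic behind this can be checked with a small model. It is a sketch only: `Regs`, `StackID` and the helper functions are hypothetical simplifications, and the register values used with it are made up. Only the function start address 0x100000ea0 and the rbp+16 / rsp+8 rules come from the example above.

```cpp
#include <cassert>
#include <cstdint>

struct Regs {
    uint64_t rsp;
    uint64_t rbp;
};

// Rule that stays in force after the prologue when the epilogue is not described.
uint64_t CfaFromRbp(const Regs &r) { return r.rbp + 16; }

// Rule that is actually correct once "popq %rbp" has executed.
uint64_t CfaFromRsp(const Regs &r) { return r.rsp + 8; }

// Simplified stack ID: start address of the function plus the CFA.
struct StackID {
    uint64_t function_start;
    uint64_t cfa;
    bool operator==(const StackID &o) const {
        return function_start == o.function_start && cfa == o.cfa;
    }
};

constexpr uint64_t kFooStart = 0x100000ea0;

// Registers at 0x100000ec2, just before "popq %rbp": locals are already
// deallocated, so rsp and rbp both point at the saved rbp.
Regs BeforePop(uint64_t entry_rsp) {
    assert(entry_rsp >= 8);  // room for the saved rbp below the return address
    return Regs{entry_rsp - 8, entry_rsp - 8};
}

// Registers at 0x100000ec3, stopped on "retq": rsp points at the return
// address again and rbp already holds the caller's frame pointer.
Regs AtRet(uint64_t entry_rsp, uint64_t caller_rbp) {
    return Regs{entry_rsp, caller_rbp};
}
```

Applying the rbp+16 rule to the state at the `retq` yields the caller's rbp plus 16, so the stack ID no longer matches the one seen earlier in foo. Applying rsp+8 gives the same CFA as before the pop.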
> >>>>>
> >>>>> AssemblyParse_x86 has a special bit where it looks for the 'ret'
> instruction sequence at the end of the function -
> >>>>>
> >>>>>  // Now look at the byte at the end of the AddressRange for a
> limited attempt at describing the
> >>>>>   // epilogue.  We're looking for the sequence
> >>>>>
> >>>>>   //  [ 0x5d ] pop %rbp
> >>>>>   //  [ 0xc3 ] ret
> >>>>>   //  [ 0xe8 xx xx xx xx ] call __stack_chk_fail  (this is sometimes
> the final insn in the function)
> >>>>>
> >>>>>   // We want to add a Row describing how to unwind when we're
> stopped on the 'ret' instruction where the
> >>>>>   // CFA is no longer defined in terms of rbp, but is now defined in
> terms of rsp like on function entry.
> >>>>>
> >>>>>
> >>>>> and adds an extra row of unwind details for that instruction.
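A rough sketch of that kind of byte-pattern check follows. It is not the AssemblyParse_x86 code: `Row`, `FindPopRbpRet` and `AppendRetRow` are hypothetical names. 0x5d is the one-byte encoding of "pop %rbp", 0xc3 is "ret", and 0xe8 starts a 5-byte "call rel32".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Row {
    std::size_t offset;  // byte offset from the start of the function
    int cfa_reg;         // DWARF register number (6 = rbp, 7 = rsp on x86_64)
    int cfa_offset;
};

// Looks only at the last bytes of the function, as a limited attempt at
// finding the epilogue. Raw byte matching can misfire, since 0x5d may also
// be part of a longer instruction.
bool FindPopRbpRet(const std::vector<uint8_t> &code, std::size_t &ret_offset) {
    const std::size_t n = code.size();
    // Case 1: the function ends with "pop %rbp; ret".
    if (n >= 2 && code[n - 2] == 0x5d && code[n - 1] == 0xc3) {
        ret_offset = n - 1;
        return true;
    }
    // Case 2: "pop %rbp; ret" followed by a trailing 5-byte call.
    if (n >= 7 && code[n - 7] == 0x5d && code[n - 6] == 0xc3 && code[n - 5] == 0xe8) {
        ret_offset = n - 6;
        return true;
    }
    return false;
}

// Appends a row for the 'ret' that defines the CFA as it was on function entry.
bool AppendRetRow(const std::vector<uint8_t> &code, std::vector<Row> &rows) {
    assert(!rows.empty());  // rows[0] must describe function entry
    std::size_t ret_offset = 0;
    if (!FindPopRbpRet(code, ret_offset))
        return false;
    Row at_ret = rows.front();
    at_ret.offset = ret_offset;
    rows.push_back(at_ret);
    return true;
}
```

For the 36-byte foo shown above, the `retq` sits at offset 0x23 (0x100000ec3 minus 0x100000ea0), so the appended row would be rsp+8 at offset 0x23.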
> >>>>>
> >>>>>
> >>>>> I mention x86_64 as being a possible good test case here because I
> worry about the i386 picbase sequence (call next-instruction; pop %ebx)
> which occurs a lot.  But for x86_64, my main concern is the epilogues.
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On Jul 30, 2014, at 2:52 PM, Tong Shen <endlessroad at google.com>
> wrote:
> >>>>>>
> >>>>>> Thanks Jason! That's a very informative post, it clarifies things a lot
> :-)
> >>>>>>
> >>>>>> Well I have to admit that my patch is specifically for certain kind
> of functions, and now I see that's not the general case.
> >>>>>>
> >>>>>> I did some experiments with gdb. gdb uses CFI for frame 0, on both
> x86 and x86_64. It looks up the FDE for frame 0 and does the CFA calculations
> according to that.
> >>>>>>
> >>>>>> - For compiler generated functions: I think there are 2 usage
> scenarios for frame 0: breakpoint and signal.
> >>>>>>   - Breakpoints are usually at source line boundary instead of
> instruction boundary, and generally we won't be caught at stack pointer
> changing locations, so CFI is still valid.
> >>>>>>   - For signal, synchronous unwind table may not be sufficient
> here. But only stack changing instructions will cause incorrect CFA
> calculation, so it's not always the case.
> >>>>>> - For hand written assembly functions: from what I've seen, most of
> the time CFI is present and actually asynchronous.
> >>>>>> So it seems that in most cases, even with only synchronous unwind
> table, CFI is still correct.
> >>>>>>
> >>>>>> I believe we can trust eh_frame for frame 0 and use assembly
> profiling as fallback. If both failed, maybe code owner should use
> -fasynchronous-unwind-tables :-)
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Jul 29, 2014 at 4:59 PM, Jason Molenda <jmolenda at apple.com>
> wrote:
> >>>>>> It was a tricky one and got lost in the shuffle of a busy week.  I
> was always reluctant to try profiling all the instructions in a function.
> On x86, compiler generated code (gcc/clang anyway) is very simplistic about
> setting up the stack frame at the start and only having one epilogue - so
> anything fancier risked making mistakes and could possibly have a
> performance impact as we run functions through the disassembler.
> >>>>>>
> >>>>>> For hand-written assembly functions (which can be very creative
> with their prologue/epilogue and where it is placed), my position is that
> they should write eh_frame instructions in their assembly source to tell
> lldb where to find things.  There are one or two libraries on Mac OS X where
> we break the "ignore eh_frame for the currently executing function" because
> there are many hand-written assembly functions in there and the eh_frame is
> going to beat our own analysis.
> >>>>>>
> >>>>>>
> >>>>>> After I wrote the x86 unwinder, Greg and Caroline implemented the
> arm unwinder where it emulates every instruction in the function looking
> for prologue/epilogue instructions.  We haven't seen it having a
> particularly bad impact performance-wise (lldb only does this disassembly
> for functions that it finds on stacks during an execution run, and it saves
> the result so it won't re-compute it for a given function).  The clang
> armv7 codegen often has mid-function epilogues (early returns) which
> definitely complicated things and made it necessary to step through the
> entire function bodies.  There's a bunch of code I added to support these
> mid-function epilogues - I have to save the register save state when I see
> an instruction which looks like an epilogue, and when I see the final ret
> instruction (aka restoring the saved lr contents into pc), I re-install the
> register save state from before the epilogue started.
> >>>>>>
> >>>>>> These things always make me a little nervous because the
> instruction analyzer obviously is doing a static analysis so it knows
> nothing about flow control.  Tong's patch stops when it sees the first CALL
> instruction - but that's not right, that's just solving the problem for his
> particular function which doesn't have any CALL instructions before his
> prologue. :) You could imagine a function which saves a couple of
> registers, calls another function, then saves a couple more because it
> needs more scratch registers.
> >>>>>>
> >>>>>> If we're going to change to profiling deep into the function -- and
> I'm not opposed to doing that, it's been fine on arm -- we should just do
> the entire function I think.
> >>>>>>
> >>>>>>
> >>>>>> Another alternative would be to trust eh_frame on x86_64 at frame
> 0.  This is one of those things where there's not a great solution.  The
> unwind instructions in eh_frame are only guaranteed to be accurate for
> synchronous unwinds -- that is, they are only guaranteed to be accurate at
> places where an exception could be thrown - at call sites.  So, for
> instance, there's no reason why the compiler has to describe the function
> prologue instructions at all.  There's no requirement that the eh_frame
> instructions describe the epilogue instructions.  The information about
> spilled registers only needs to be emitted where we could throw an
> exception, or where a callee could throw an exception.
> >>>>>>
> >>>>>> clang/gcc both emit detailed instructions for the prologue setup.
> But for i386 codegen if the compiler needs to access some pc-relative data,
> it will do a "call next-instruction; pop %eax" to get the current pc
> value.  (x86_64 has rip-relative addressing so this isn't needed)  If
> you're debugging -fomit-frame-pointer code, that means your CFA is
> expressed in terms of the stack pointer and the stack pointer just changed
> mid-function --- and eh_frame instructions don't describe this.
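To make the problem concrete, here is a small sketch of how the esp-relative CFA offset would have to change across the picbase sequence. The `Insn` enum and `CfaOffsetsFromEsp` are hypothetical, and the model assumes -fomit-frame-pointer i386 code where the CFA is expressed as esp plus a constant.

```cpp
#include <cassert>
#include <vector>

// Toy instruction kinds; only the two stack-changing ones matter here.
enum class Insn { CallNextInsn, PopReg, Other };

// Returns, for each instruction, the CFA offset from esp that is correct
// just before that instruction executes.
std::vector<int> CfaOffsetsFromEsp(const std::vector<Insn> &insns, int offset) {
    assert(offset >= 4);  // the return address alone sits 4 bytes below the CFA
    std::vector<int> before;
    before.reserve(insns.size());
    for (Insn insn : insns) {
        before.push_back(offset);
        if (insn == Insn::CallNextInsn)
            offset += 4;  // the call pushed a 4-byte return address
        else if (insn == Insn::PopReg)
            offset -= 4;  // the pop removed it again
    }
    return before;
}
```

Starting from esp+16, the sequence other, call, pop, other needs offsets 16, 16, 20, 16. Unwind rows that keep 16 throughout are wrong only while stopped on the pop, which is where an instruction step over the call lands.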
> >>>>>>
> >>>>>> The end result: If you want accurate unwinds 100% of the time, you
> can't rely on the unwind instructions from eh_frame.  But they'll get you
> accurate unwinds 99.9% of the time ...  also, last I checked, neither clang
> nor gcc describe the epilogue instructions.
> >>>>>>
> >>>>>>
> >>>>>> In *theory* the unwind instructions from the DWARF debug_frame
> section should be asynchronous -- they should describe how to find the CFA
> address for every instruction in the function.  Which makes sense - you
> want eh_frame to be compact because it's bundled into the executable, so it
> should only have the information necessary for exception handling and you
> can put the verbose stuff in debug_frame DWARF for debuggers.  But instead
> (again, last time I checked), the compilers put the exact same thing in
> debug_frame even if you use the -fasynchronous-unwind-tables (or whatever
> that switch was) option.
> >>>>>>
> >>>>>>
> >>>>>> So I don't know, maybe we should just start trusting eh_frame at
> frame 0 and write off those .1% cases where it isn't correct instead of
> trying to get too fancy with the assembly analysis code.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> On Jul 29, 2014, at 4:17 PM, Todd Fiala <tfiala at google.com> wrote:
> >>>>>>>
> >>>>>>> Hey Jason,
> >>>>>>>
> >>>>>>> Do you have any feedback on this?
> >>>>>>>
> >>>>>>> Thanks!
> >>>>>>>
> >>>>>>> -Todd
> >>>>>>>
> >>>>>>>
> >>>>>>> On Fri, Jul 25, 2014 at 1:42 PM, Tong Shen <endlessroad at google.com>
> wrote:
> >>>>>>> Sorry, wrong version of patch...
> >>>>>>>
> >>>>>>>
> >>>>>>> On Fri, Jul 25, 2014 at 1:41 PM, Tong Shen <endlessroad at google.com>
> wrote:
> >>>>>>> Hi Molenda, lldb-commits,
> >>>>>>>
> >>>>>>> For now, the x86 assembly profiler stops after 10 "non-prologue"
> instructions. In practice that may not be sufficient. For example, we have a
> hand-written assembly function which has hundreds of instructions before
> the actual (stack-adjusting) prologue instructions.
> >>>>>>>
> >>>>>>> One way is to change the limit to 1000; but there will always be
> functions that break the limit :-) I believe the right thing to do here is
> to parse all instructions before "ret"/"call" as prologue instructions.
> >>>>>>>
> >>>>>>> Here's what I changed:
> >>>>>>> - For "push %rbx" and "mov %rbx, -8(%rbp)": only add first row for
> that register. They may appear multiple times in function body. But as long
> as one of them appears, first appearance should be in prologue(If it's not
> in prologue, this function will not use %rbx, so these 2 instructions
> should not appear at all).
> >>>>>>> - Also monitor "add %rsp 0x20".
> >>>>>>> - Remove non prologue instruction count.
> >>>>>>> - Add "call" instruction detection, and stop parsing after it.
> >>>>>>>
> >>>>>>> Thanks.
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best Regards, Tong Shen
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best Regards, Tong Shen
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> lldb-commits mailing list
> >>>>>>> lldb-commits at cs.uiuc.edu
> >>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-commits
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Todd Fiala |   Software Engineer |     tfiala at google.com |
> 650-943-3180
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Best Regards, Tong Shen
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Best Regards, Tong Shen
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Best Regards, Tong Shen
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best Regards, Tong Shen
> >>>
> >>>
> >>>
> >>> --
> >>> Best Regards, Tong Shen
> >>> <adjust_cfi_for_frame_zero.patch>
> >>
> >
>
>


-- 
Best Regards, Tong Shen