[lldb-dev] Inquiry regarding AddOneMoreFrame function in UnWindLLDB

Ravitheja Addepally via lldb-dev lldb-dev at lists.llvm.org
Fri Jun 3 02:44:35 PDT 2016


On Thu, Jun 2, 2016 at 9:58 PM, Jason Molenda <jmolenda at apple.com> wrote:

> This has no eh_frame unwind instructions.  Even if we were using eh_frame
> at frame 0, you'd be out of luck.
>

     I do not understand why you say the eh_frame unwind instructions are
missing. Pardon me for asking, but can you tell me how you inferred that?


>
> I forget the exact order of fallbacks.  I think for frame 0 we try to use
> the assembly profile unwind ("async unwind plan") and if we can't do that
> we fall back to the eh_frame unwind ("sync unwind plan") and as a last
> resort we'll use the architecture default unwind plan.  Which, for a stack
> frame like this that doesn't do the usual push rbp; mov rsp, rbp sequence,
> means we'll skip at least one stack frame.
>
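The fallback order described above can be sketched in a few lines. This is a minimal Python illustration, not LLDB code: `pick_frame0_plan` and the plan-name strings are hypothetical, and the order itself is only as certain as the message makes it ("I forget the exact order").

```python
from typing import Optional


def pick_frame0_plan(assembly_plan: Optional[str],
                     eh_frame_plan: Optional[str],
                     arch_default_plan: str) -> str:
    """Return the first available plan, in the order recalled in the thread.

    Hypothetical helper: None means "this source produced no plan".
    """
    # 1. assembly profiling ("async unwind plan"), 2. eh_frame ("sync unwind plan")
    for plan in (assembly_plan, eh_frame_plan):
        if plan is not None:
            return plan
    # 3. last resort: the architecture default unwind plan
    return arch_default_plan


print(pick_frame0_plan("assembly insn profiling", "eh_frame CFI", "arch default"))
print(pick_frame0_plan(None, "eh_frame CFI", "arch default"))
print(pick_frame0_plan(None, None, "arch default"))
```

Under this ordering the eh_frame plan is consulted for frame 0 only when assembly profiling yields nothing, which is why a wrong-but-present assembly plan wins.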
> The assembly inspection unwind plan from AssemblyParse_x86 looks correct
> to me. This function saves some register on the stack (all of them argument
> or volatile registers, so that's weird & the assembly profiler probably
> won't record them; whatever), calls a function, restores the reg values and
> then jumps to the returned function pointer from that first func call.
> Maybe this is some dynamic loader fixup routine for the first time an
> external function is called and the solib needs to be paged in.
>
> You're stopped in the body of the function (offset 86) where the stack
> pointer is still as expected.  I'd have to think about that unwind entry
> for offset +94 (if you were stopped on the jmp instruction) a bit more -
> that's a bit unusual.  But unless you're on the jmp, I can't see this
> unwind going wrong.
>
>   The first row of the assembly unwinder's plan is wrong: its CFA offset is
> off by 16 bytes (rsp+8, where eh_frame has rsp+24). I don't know why, but
> this is the root cause of the TestPrintStackTraces failure: since lldb uses
> the assembly unwinder here, it cannot unwind correctly, while the eh_frame
> based info is actually correct. I verified that stack unwinding is correct
> in this case when the eh_frame info is used.
> J
>
> > On Jun 2, 2016, at 1:48 AM, Ravitheja Addepally <ravithejawork at gmail.com> wrote:
> >
> > Hello,
> >          This is happening in TestPrintStackTraces, where we can end up
> > here:
> > ld-linux-x86-64.so.2`___lldb_unnamed_symbol95$$ld-linux-x86-64.so.2:
> >     0x7ffff7df04e0 <+0>:  48 83 ec 38     subq   $0x38, %rsp
> >     0x7ffff7df04e4 <+4>:  48 89 04 24     movq   %rax, (%rsp)
> >     0x7ffff7df04e8 <+8>:  48 89 4c 24 08  movq   %rcx, 0x8(%rsp)
> >     0x7ffff7df04ed <+13>: 48 89 54 24 10  movq   %rdx, 0x10(%rsp)
> >     0x7ffff7df04f2 <+18>: 48 89 74 24 18  movq   %rsi, 0x18(%rsp)
> >     0x7ffff7df04f7 <+23>: 48 89 7c 24 20  movq   %rdi, 0x20(%rsp)
> >     0x7ffff7df04fc <+28>: 4c 89 44 24 28  movq   %r8, 0x28(%rsp)
> >     0x7ffff7df0501 <+33>: 4c 89 4c 24 30  movq   %r9, 0x30(%rsp)
> >     0x7ffff7df0506 <+38>: 48 8b 74 24 40  movq   0x40(%rsp), %rsi
> >     0x7ffff7df050b <+43>: 48 8b 7c 24 38  movq   0x38(%rsp), %rdi
> >     0x7ffff7df0510 <+48>: e8 4b 8f ff ff  callq  0x7ffff7de9460  ; ___lldb_unnamed_symbol54$$ld-linux-x86-64.so.2
> >     0x7ffff7df0515 <+53>: 49 89 c3        movq   %rax, %r11
> >     0x7ffff7df0518 <+56>: 4c 8b 4c 24 30  movq   0x30(%rsp), %r9
> >     0x7ffff7df051d <+61>: 4c 8b 44 24 28  movq   0x28(%rsp), %r8
> >     0x7ffff7df0522 <+66>: 48 8b 7c 24 20  movq   0x20(%rsp), %rdi
> >     0x7ffff7df0527 <+71>: 48 8b 74 24 18  movq   0x18(%rsp), %rsi
> >     0x7ffff7df052c <+76>: 48 8b 54 24 10  movq   0x10(%rsp), %rdx
> >     0x7ffff7df0531 <+81>: 48 8b 4c 24 08  movq   0x8(%rsp), %rcx
> > ->  0x7ffff7df0536 <+86>: 48 8b 04 24     movq   (%rsp), %rax
> >     0x7ffff7df053a <+90>: 48 83 c4 48     addq   $0x48, %rsp
> >     0x7ffff7df053e <+94>: 41 ff e3        jmpq   *%r11
> >     0x7ffff7df0541 <+97>: 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00  nopw   %cs:(%rax,%rax)
> >
> >
> > image show-unwind --address 0x7ffff7df0536
> > UNWIND PLANS for ld-linux-x86-64.so.2`___lldb_unnamed_symbol95$$ld-linux-x86-64.so.2 (start addr 0x7ffff7df04e0)
> >
> > Asynchronous (not restricted to call-sites) UnwindPlan is 'assembly insn profiling'
> > Synchronous (restricted to call-sites) UnwindPlan is 'eh_frame CFI'
> >
> > Assembly language inspection UnwindPlan:
> > This UnwindPlan originally sourced from assembly insn profiling
> > This UnwindPlan is sourced from the compiler: no.
> > This UnwindPlan is valid at all instruction locations: yes.
> > Address range of this UnwindPlan: [ld-linux-x86-64.so.2..text + 88576-0x0000000000015a70)
> > row[0]:    0: CFA=rsp +8 => rsp=CFA+0 rip=[CFA-8]
> > row[1]:    4: CFA=rsp+64 => rsp=CFA+0 rip=[CFA-8]
> > row[2]:   94: CFA=rsp -8 => rsp=CFA+0 rip=[CFA-8]
> >
> > eh_frame UnwindPlan:
> > This UnwindPlan originally sourced from eh_frame CFI
> > This UnwindPlan is sourced from the compiler: yes.
> > This UnwindPlan is valid at all instruction locations: no.
> > Address range of this UnwindPlan: [ld-linux-x86-64.so.2..text + 88576-0x0000000000015a61)
> > row[0]:    0: CFA=rsp+24 => rip=[CFA-8]
> > row[1]:    4: CFA=rsp+80 => rip=[CFA-8]
> > row[2]:   94: CFA=rsp +8 => rip=[CFA-8]
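To make the difference between the two plans concrete, here is a small worked example in Python. The row tables are copied from the `image show-unwind` output above; `cfa_offset` is a hypothetical helper, and it relies on the usual reading of an unwind plan, namely that a row applies from its offset up to the offset of the next row.

```python
from bisect import bisect_right

# (function offset, CFA offset relative to rsp), copied from the output above
ASSEMBLY_ROWS = [(0, 8), (4, 64), (94, -8)]
EH_FRAME_ROWS = [(0, 24), (4, 80), (94, 8)]


def cfa_offset(rows, pc_offset):
    """Return the rsp-relative CFA offset of the last row at or before pc_offset."""
    starts = [start for start, _ in rows]
    index = bisect_right(starts, pc_offset) - 1
    if index < 0:
        raise ValueError("pc_offset precedes the first row")
    return rows[index][1]


for offset in (0, 86, 94):
    asm = cfa_offset(ASSEMBLY_ROWS, offset)
    eh = cfa_offset(EH_FRAME_ROWS, offset)
    # Both plans locate the return address at [CFA-8].
    print(f"+{offset}: assembly CFA=rsp{asm:+d}, eh_frame CFA=rsp{eh:+d}, "
          f"difference={eh - asm} bytes")
```

At each of the three offsets the two plans disagree by 16 bytes. At +86, where the process is stopped, the assembly plan would read the return address from rsp+56 and the eh_frame plan from rsp+72, so only one of them can yield the real caller.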
> >
> >
> >
> > On Wed, Jun 1, 2016 at 11:38 PM, Jason Molenda <jmolenda at apple.com> wrote:
> > It gets so tricky!  It's hard for the unwinder to tell the difference
> > between a real valid stack unwind and random data giving lots of "frames".
> >
> > It sounds like the problem that needs fixing is to figure out why the
> > assembly unwind is wrong for frame 0.  What do you get for
> >
> > disass -a <address inside function>
> >
> > image show-unwind -a <address inside function>
> >
> > ?
> >
> >
> > > On Jun 1, 2016, at 12:56 AM, Ravitheja Addepally <ravithejawork at gmail.com> wrote:
> > >
> > > OK, the problem I am currently facing is that there are cases in which
> > > eh_frame should have been used for frame 0 but isn't, and the assembly
> > > unwind gives wrong information that can only be detected if the
> > > debugger tries to extract more frames. The purpose of AddOneMoreFrame
> > > in UnwindLLDB is to try to get more than one frame of the stack. I want
> > > to run both unwinders and select the one that yields more frames.
> > >
> > > On Wed, Jun 1, 2016 at 12:27 AM, Jason Molenda <jmolenda at apple.com> wrote:
> > >
> > > > On May 31, 2016, at 11:31 AM, jingham at apple.com wrote:
> > > >
> > > >
> > > >> On May 31, 2016, at 12:52 AM, Ravitheja Addepally via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> > > >>
> > > >> Hello,
> > > >>      I posted this query a while ago and still have no answers. I am
> > > >> currently working on Bug 27687 (PrintStackTraces). The reason for the
> > > >> failure is erroneous unwinding from the zeroth frame. The error is
> > > >> not detected in AddOneMoreFrame because it only checks 2 more frames;
> > > >> had it checked more frames, it would have detected the error. My
> > > >> questions are:
> > > >>
> > > >> ->  Is there any specific reason for checking only 2 frames instead
> > > >> of more?
> > > >
> > > > The stepping machinery uses the unwinder on each stop to figure out
> > > > whether it has stepped in or out, which is fairly performance
> > > > sensitive, so we don't want AddOneMoreFrame to do more work than it
> > > > has to.
> > >
> > >
> > > The most common case of a bad unwind, where the unwinder is stuck in a
> > > loop, is a single stack frame repeating.  I've seen loops of as many as
> > > six frames repeating (which are not actually a series of recursive
> > > calls) but it's less common.
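A check for this kind of repetition might be sketched as below. This is not the logic in AddOneMoreFrame; `repeating_period` is a hypothetical helper, and it assumes each frame is summarized as a `(pc, cfa)` pair so that genuine recursion (same pc, different CFA) is not mistaken for a loop.

```python
def repeating_period(frames, max_period=6):
    """Return the smallest period p (1..max_period) for which the last p
    frames exactly repeat the p frames before them, or 0 if there is none.

    `frames` is a list of (pc, cfa) tuples, oldest unwind result last.
    """
    for period in range(1, max_period + 1):
        if len(frames) < 2 * period:
            break
        if frames[-period:] == frames[-2 * period:-period]:
            return period
    return 0


# A single frame repeating: the common bad-unwind case.
print(repeating_period([(0x10, 0x7000), (0x20, 0x7010), (0x20, 0x7010)]))
# Two frames repeating as a pair.
print(repeating_period([(0x10, 0x7000), (0x20, 0x7010), (0x30, 0x7020),
                        (0x20, 0x7010), (0x30, 0x7020)]))
# Real recursion: same pc, but the CFA advances each time.
print(repeating_period([(0x50, 0x7000), (0x50, 0x7010), (0x50, 0x7020)]))
```

Checking up to six frames back on every unwind step costs more than checking one, which is the performance trade-off raised earlier in the thread.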
> > >
> > > >
> > > >> ->  Why not make the EH CFI based unwinder the default one and make
> > > >> the assembly unwinder the fallback?
> > >
> > >
> > > Sources of unwind information fall into two categories.  They can
> > > describe the unwind state at every instruction of a function
> > > (asynchronous) or they can describe the unwind state only at function
> > > call boundaries (synchronous).
> > >
> > > Think of "asynchronous" here as the fact that the debugger can
> > > interrupt the program at any point in time.
> > >
> > > Most unwind information is designed for exception handling -- it is
> > > synchronous, it can only throw an exception in the body of the function,
> > > or an exception is passed up through it when it is calling another
> > > function.
> > >
> > > For exception handling, there is no need/requirement to describe the
> > > prologue or epilogue instructions, for instance.
> > >
> > > eh_frame (and DWARF's debug_frame from which it derives) splits the
> > > difference and makes things quite unclear.  It is guaranteed to be
> > > correct for exception handling -- it is synchronous, and is valid in
> > > the middle of the function and when it is calling other functions --
> > > but it is a general format that CAN be asynchronous if the emitter
> > > includes information about the prologue or epilogue or mid-function
> > > stack changes.  But eh_frame is not guaranteed to be that way, and in
> > > fact there's no way for it to indicate what it describes, beyond the
> > > required unwind info for exception handling.
> > >
> > > On x86, gcc and clang have always described the prologue unwind info in
> > > their eh_frame.  gcc has recently started describing the epilogue too
> > > (clang does not).  There's code in lldb (e.g.
> > > UnwindAssembly_x86::AugmentUnwindPlanFromCallSite) written by Tong Shen
> > > when interning at Google which will try to detect if the eh_frame
> > > describes the prologue and epilogue.  If it does, it will use eh_frame
> > > for frame 0.  If it only describes the prologue, it will use the
> > > instruction emulation code to add epilogue instructions and use that at
> > > frame 0.
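The decision described in this paragraph can be summarized as a small function. This is a Python paraphrase of the prose only, not a transcription of `AugmentUnwindPlanFromCallSite`; the function name, the boolean inputs, and the returned labels are all hypothetical. The message does not say what happens when the eh_frame does not even describe the prologue, so the last branch is an assumption.

```python
def frame0_plan_from_eh_frame(describes_prologue: bool,
                              describes_epilogue: bool) -> str:
    """Map what the eh_frame covers to the plan used at frame 0."""
    if describes_prologue and describes_epilogue:
        # eh_frame is effectively asynchronous: use it as-is.
        return "eh_frame"
    if describes_prologue:
        # Fill in the missing epilogue rows via instruction emulation.
        return "eh_frame + emulated epilogue"
    # Assumption (not stated in the thread): fall back to another plan.
    return "other plan"


print(frame0_plan_from_eh_frame(True, True))    # gcc with epilogue info
print(frame0_plan_from_eh_frame(True, False))   # prologue-only eh_frame
print(frame0_plan_from_eh_frame(False, False))
```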
> > >
> > >
> > > There are other sources of unwind information similar to eh_frame that
> > > are only for exception handling.  Tamas added ArmUnwindInfo last year
> > > which reads the .ARM.exidx unwind tables.  I added compact unwind
> > > importing - an Apple specific format that uses a single 4-byte word to
> > > describe the unwind state for each function, which can't describe
> > > anything in the prologue/epilogue.  These formats definitely can't be
> > > used to unwind at frame 0 because we could be stopped anywhere in the
> > > prologue/epilogue where they are not accurate.
> > >
> > >
> > > It's unfortunate that eh_frame doesn't include a way for the producer to
> > > declare how async the unwind info is; it makes the debugger's job a lot
> > > more difficult.
> > >
> > >
> > > J
> > >
> >
> >
>
>