[lldb-dev] Inquiry regarding AddOneMoreFrame function in UnWindLLDB

Jason Molenda via lldb-dev lldb-dev at lists.llvm.org
Thu Jun 2 12:58:24 PDT 2016


This has no eh_frame unwind instructions.  Even if we were using eh_frame at frame 0, you'd be out of luck.

I forget the exact order of fallbacks.  I think for frame 0 we try to use the assembly profile unwind ("async unwind plan"), and if we can't do that we fall back to the eh_frame unwind ("sync unwind plan"), and as a last resort we'll use the architecture default unwind plan.  For a stack frame like this one, which doesn't do the usual push rbp; mov rsp, rbp sequence, the architecture default plan means we'll skip at least one stack frame.
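
As a rough sketch of the order I described above (written from memory, so treat it as an illustration and not as lldb's actual code; the function and parameter names are made up):

```python
from typing import Optional


def pick_frame_zero_plan(async_plan: Optional[str],
                         sync_plan: Optional[str],
                         arch_default_plan: str) -> str:
    """Return the first available plan in the recalled fallback order.

    Hypothetical helper: plans are represented by their names only,
    and None means "this source of unwind info is not available".
    """
    # Prefer the assembly profile ("async") plan, then the eh_frame
    # ("sync") plan.
    for plan in (async_plan, sync_plan):
        if plan is not None:
            return plan
    # Last resort: the architecture default plan, which assumes the
    # usual frame-pointer prologue.
    return arch_default_plan


if __name__ == "__main__":
    print(pick_frame_zero_plan("assembly insn profiling",
                               "eh_frame CFI",
                               "arch default"))
```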

The assembly inspection unwind plan from AssemblyParse_x86 looks correct to me. This function saves some registers on the stack (all of them argument or volatile registers, so that's weird & the assembly profiler probably won't record them; whatever), calls a function, restores the register values and then jumps to the function pointer returned by that first call.  Maybe this is some dynamic loader fixup routine for the first time an external function is called and the solib needs to be paged in.

You're stopped in the body of the function (offset 86) where the stack pointer is still as expected.  I'd have to think about that unwind entry for offset +94 (if you were stopped on the jmp instruction) a bit more - that's a bit unusual.  But unless you're on the jmp, I can't see this unwind going wrong.
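
To make the row lookup concrete, here is a small sketch that picks the row of the assembly inspection plan covering a given function offset and computes the CFA from it. The row values are copied from the image show-unwind output quoted below; the helper names and the rsp value are hypothetical.

```python
from bisect import bisect_right

# (function offset where the row starts, CFA offset from rsp), copied from
# the "Assembly language inspection UnwindPlan" rows in the quoted output.
ASSEMBLY_ROWS = [(0, 8), (4, 64), (94, -8)]


def cfa_offset_at(rows, offset):
    """Return the rsp-relative CFA offset of the last row that starts at or
    before the given function offset."""
    starts = [start for start, _ in rows]
    index = bisect_right(starts, offset) - 1
    if index < 0:
        raise ValueError("offset precedes the first row")
    return rows[index][1]


def unwind_at(rows, offset, rsp):
    """Return (CFA, address holding the saved rip) for a stop at offset."""
    cfa = rsp + cfa_offset_at(rows, offset)
    # Every row in the plan says rip=[CFA-8].
    return cfa, cfa - 8


if __name__ == "__main__":
    rsp = 0x7FFFFFFFE000  # hypothetical stack pointer value
    for offset in (0, 86, 94):
        cfa, rip_slot = unwind_at(ASSEMBLY_ROWS, offset, rsp)
        print(f"+{offset}: CFA={cfa:#x} saved rip at {rip_slot:#x}")
```

At offset 86 the lookup lands on row[1], so the CFA is rsp+64; only a stop at offset 94 (the jmp) selects row[2] with CFA=rsp-8.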


J

> On Jun 2, 2016, at 1:48 AM, Ravitheja Addepally <ravithejawork at gmail.com> wrote:
> 
> Hello,
>          This is happening in TestPrintStackTraces, where we can end up here:
> ld-linux-x86-64.so.2`___lldb_unnamed_symbol95$$ld-linux-x86-64.so.2:
>     0x7ffff7df04e0 <+0>:  48 83 ec 38                                   subq   $0x38, %rsp
>     0x7ffff7df04e4 <+4>:  48 89 04 24                                   movq   %rax, (%rsp)
>     0x7ffff7df04e8 <+8>:  48 89 4c 24 08                                movq   %rcx, 0x8(%rsp)
>     0x7ffff7df04ed <+13>: 48 89 54 24 10                                movq   %rdx, 0x10(%rsp)
>     0x7ffff7df04f2 <+18>: 48 89 74 24 18                                movq   %rsi, 0x18(%rsp)
>     0x7ffff7df04f7 <+23>: 48 89 7c 24 20                                movq   %rdi, 0x20(%rsp)
>     0x7ffff7df04fc <+28>: 4c 89 44 24 28                                movq   %r8, 0x28(%rsp)
>     0x7ffff7df0501 <+33>: 4c 89 4c 24 30                                movq   %r9, 0x30(%rsp)
>     0x7ffff7df0506 <+38>: 48 8b 74 24 40                                movq   0x40(%rsp), %rsi
>     0x7ffff7df050b <+43>: 48 8b 7c 24 38                                movq   0x38(%rsp), %rdi
>     0x7ffff7df0510 <+48>: e8 4b 8f ff ff                                callq  0x7ffff7de9460            ; ___lldb_unnamed_symbol54$$ld-linux-x86-64.so.2
>     0x7ffff7df0515 <+53>: 49 89 c3                                      movq   %rax, %r11
>     0x7ffff7df0518 <+56>: 4c 8b 4c 24 30                                movq   0x30(%rsp), %r9
>     0x7ffff7df051d <+61>: 4c 8b 44 24 28                                movq   0x28(%rsp), %r8
>     0x7ffff7df0522 <+66>: 48 8b 7c 24 20                                movq   0x20(%rsp), %rdi
>     0x7ffff7df0527 <+71>: 48 8b 74 24 18                                movq   0x18(%rsp), %rsi
>     0x7ffff7df052c <+76>: 48 8b 54 24 10                                movq   0x10(%rsp), %rdx
>     0x7ffff7df0531 <+81>: 48 8b 4c 24 08                                movq   0x8(%rsp), %rcx
> ->  0x7ffff7df0536 <+86>: 48 8b 04 24                                   movq   (%rsp), %rax
>     0x7ffff7df053a <+90>: 48 83 c4 48                                   addq   $0x48, %rsp
>     0x7ffff7df053e <+94>: 41 ff e3                                      jmpq   *%r11
>     0x7ffff7df0541 <+97>: 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00  nopw   %cs:(%rax,%rax)
> 
> 
> image show-unwind --address 0x7ffff7df0536
> UNWIND PLANS for ld-linux-x86-64.so.2`___lldb_unnamed_symbol95$$ld-linux-x86-64.so.2 (start addr 0x7ffff7df04e0)
> 
> Asynchronous (not restricted to call-sites) UnwindPlan is 'assembly insn profiling'
> Synchronous (restricted to call-sites) UnwindPlan is 'eh_frame CFI'
> 
> Assembly language inspection UnwindPlan:
> This UnwindPlan originally sourced from assembly insn profiling
> This UnwindPlan is sourced from the compiler: no.
> This UnwindPlan is valid at all instruction locations: yes.
> Address range of this UnwindPlan: [ld-linux-x86-64.so.2..text + 88576-0x0000000000015a70)
> row[0]:    0: CFA=rsp +8 => rsp=CFA+0 rip=[CFA-8] 
> row[1]:    4: CFA=rsp+64 => rsp=CFA+0 rip=[CFA-8] 
> row[2]:   94: CFA=rsp -8 => rsp=CFA+0 rip=[CFA-8] 
> 
> eh_frame UnwindPlan:
> This UnwindPlan originally sourced from eh_frame CFI
> This UnwindPlan is sourced from the compiler: yes.
> This UnwindPlan is valid at all instruction locations: no.
> Address range of this UnwindPlan: [ld-linux-x86-64.so.2..text + 88576-0x0000000000015a61)
> row[0]:    0: CFA=rsp+24 => rip=[CFA-8] 
> row[1]:    4: CFA=rsp+80 => rip=[CFA-8] 
> row[2]:   94: CFA=rsp +8 => rip=[CFA-8]
> 
> 
> 
> On Wed, Jun 1, 2016 at 11:38 PM, Jason Molenda <jmolenda at apple.com> wrote:
> It gets so tricky!  It's hard for the unwinder to tell the difference between a real valid stack unwind and random data giving lots of "frames".
> 
> It sounds like the problem that needs fixing is to figure out why the assembly unwind is wrong for frame 0.  What do you get for
> 
> disass -a <address inside function>
> 
> image show-unwind -a <address inside function>
> 
> ?
> 
> 
> > On Jun 1, 2016, at 12:56 AM, Ravitheja Addepally <ravithejawork at gmail.com> wrote:
> >
> > OK, currently the problem I am facing is that there are cases in which eh_frame should have been used for frame 0 but isn't, and the assembly unwind just gives wrong information, which can only be detected if the debugger tries to extract more frames. The purpose of AddOneMoreFrame in UnwindLLDB is to try to get more than one frame of the stack. I want to run both unwinders and select the one that yields more frames.
> >
> > On Wed, Jun 1, 2016 at 12:27 AM, Jason Molenda <jmolenda at apple.com> wrote:
> >
> > > On May 31, 2016, at 11:31 AM, jingham at apple.com wrote:
> > >
> > >
> > >> On May 31, 2016, at 12:52 AM, Ravitheja Addepally via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> > >>
> > >> Hello,
> > >>      I posted this query a while ago and still have no answers. I am currently working on Bug 27687 (PrintStackTraces). The reason for the failure is erroneous unwinding of the frames from the zeroth frame. The error is not detected in AddOneMoreFrame, since it only checks 2 more frames; if it checked more frames, it would have detected the error. Now my questions are:
> > >>
> > >> ->  Is there any specific reason for checking only 2 frames instead of more?
> > >
> > > The stepping machinery uses the unwinder on each stop to figure out whether it has stepped in or out, which is fairly performance sensitive, so we don't want AddOneMoreFrame to do more work than it has to.
> >
> >
> > The most common case of a bad unwind, where the unwinder is stuck in a loop, is a single stack frame repeating.  I've seen loops of as many as six repeating frames (which are not actually a series of recursive calls), but that's less common.
> >
> > >
> > >> ->  Why not make the EH CFI based unwinder the default one and make the assembly the fallback?
> >
> >
> > Sources of unwind information fall into two categories.  They can describe the unwind state at every instruction of a function (asynchronous) or they can describe the unwind state only at function call boundaries (synchronous).
> >
> > Think of "asynchronous" here as the fact that the debugger can interrupt the program at any point in time.
> >
> > Most unwind information is designed for exception handling -- it is synchronous: a function can only throw an exception in its body, or have an exception passed up through it while it is calling another function.
> >
> > For exception handling, there is no need/requirement to describe the prologue or epilogue instructions, for instance.
> >
> > eh_frame (and DWARF's debug_frame from which it derives) splits the difference and makes things quite unclear.  It is guaranteed to be correct for exception handling -- it is synchronous, and is valid in the middle of the function and when it is calling other functions -- but it is a general format that CAN be asynchronous if the emitter includes information about the prologue or epilogue or mid-function stack changes.  But eh_frame is not guaranteed to be that way, and in fact there's no way for it to indicate what it describes, beyond the required unwind info for exception handling.
> >
> > On x86, gcc and clang have always described the prologue unwind info in their eh_frame.  gcc has recently started describing the epilogue too (clang does not).  There's code in lldb (e.g. UnwindAssembly_x86::AugmentUnwindPlanFromCallSite) written by Tong Shen when interning at Google which will try to detect if the eh_frame describes the prologue and epilogue.  If it does, it will use eh_frame for frame 0.  If it only describes the prologue, it will use the instruction emulation code to add epilogue instructions and use that at frame 0.
> >
> >
> > There are other sources of unwind information similar to eh_frame that are only for exception handling.  Tamas added ArmUnwindInfo last year which reads the .ARM.exidx unwind tables.  I added compact unwind importing - an Apple specific format that uses a single 4-byte word to describe the unwind state for each function, which can't describe anything in the prologue/epilogue.  These formats definitely can't be used to unwind at frame 0 because we could be stopped anywhere in the prologue/epilogue where they are not accurate.
> >
> >
> > It's unfortunate that eh_frame doesn't include a way for the producer to declare how async the unwind info is; it makes the debugger's job a lot more difficult.
> >
> >
> > J
> >
> 
> 


