[lldb-dev] eh_frame or debug_frame

Ryan Brown ribrdb at google.com
Wed Oct 15 14:59:17 PDT 2014


Rolling back r219772 (Be more consistent about null checks for the Process
and ABI in GetFullUnwindPlanForFrame) doesn't seem to have any effect.

-- Ryan Brown

On Wed, Oct 15, 2014 at 2:31 PM, Jason Molenda <jmolenda at apple.com> wrote:

>
> > On Oct 15, 2014, at 1:43 PM, Ryan Brown <ribrdb at google.com> wrote:
> >
> > Go doesn't have exception handlers, so it doesn't write .eh_frame.
> Wouldn't it make sense to use .debug_frame if .eh_frame is missing?
>
>
> We could do that.  I'd be surprised if Go is emitting x86_64 code without
> eh_frame.  As Joerg points out, debug_frame is great but it may not be
> available when an analysis tool is examining a binary.  eh_frame has the
> benefit of always being in the binary.
>
> >
> > With my custom RegisterContext I got backtraces to work for my memory
> threads. But something strange is going on. I have 10 threads that should
> have identical traces, but the first has 5 frames, then 4, 3, 2, and the
> rest only have 1 frame.
>
>
> It's easiest to isolate one thread backtrace in a situation like this.
> For instance, looking at thread 7 in your program.  (the unwind algorithms
> have no cross-thread information passing):
>
>
> th7/fr0 initialized frame current pc is 0xdaef cfa is 0x20809feb8 using
> assembly insn profiling UnwindPlan
>
> lldb is using the assembly unwind inspection for frame 0.  You said that
> all ten threads should have the same backtrace but thread #2 is at 0x2fe8c,
> #3 is at 0x209a, threads 4-15 are at 0xdaef.  You meant threads 4-15 should
> all be the same.
>
>
>      th7/fr5 pc = 0x0000000000002078
>      th7/fr5 fp = 0xffffffffffffffff
>     th7/fr4 supplying caller's stack pointer (7) value, computed from CFA
>      th7/fr5 sp = 0x000000020809ffc8
>      th7/fr5 active row: 0x0000000000002050: CFA=rbp+16 => rbp=[rbp]
> rsp=rbp+16 rip=[rbp+8]
>
> That's the architectural default unwind plan for x86_64 ABIs.  Over in
> thread 6, it looks like lldb failed to unwind past frame 5 with the
> assembly unwind, decided the assembly unwind was incorrect, and switched
> over to the architectural default unwind plan:
>
> th6/fr0 supplying caller's saved reg 6's location, cached
>      th6/fr5 full unwind plan 'assembly insn profiling' has been replaced
> by architecture default unwind plan 'x86_64 default unwind plan' for this
> function from now on.
>      th6/fr5 supplying caller's saved reg 16's location using x86_64
> default unwind plan UnwindPlan
>      th6/fr5 supplying caller's register 16 from the stack, saved at CFA
> plus offset -8
>       th6/fr6 could not get pc value
>       Frame 6 invalid RegisterContext for this frame, stopping stack walk
> th6 Unwind of this thread is complete.
>
> From this point forward main.okread() will use the arch default unwind
> plan, which isn't going to work: that plan assumes rbp is a frame pointer,
> and main.okread never sets one up.
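
To make the failure mode concrete, here is a minimal Python sketch of how
one unwind row turns a frame's registers into its caller's. It is not
lldb's code; the function name, the dict-based "memory", and the stack
values are assumptions made for illustration. Only the two row shapes
(CFA=rsp+16 and CFA=rbp+16) and the rbp value 0xffffffffffffffff come
from the log.

```python
MASK64 = (1 << 64) - 1  # registers wrap at 64 bits


def unwind_with_row(regs, memory, cfa_reg, cfa_offset, saved_at_cfa):
    """Apply one unwind row; return the caller's registers or None on failure.

    saved_at_cfa maps a register name to its offset from the CFA,
    e.g. {"rip": -8} means "rip is saved at CFA-8".
    """
    if cfa_reg not in regs:
        return None
    cfa = (regs[cfa_reg] + cfa_offset) & MASK64
    caller = {"rsp": cfa}  # the caller's stack pointer is the CFA itself
    for reg, offset in saved_at_cfa.items():
        addr = (cfa + offset) & MASK64
        if addr not in memory:
            return None  # unreadable stack slot: the stack walk stops here
        caller[reg] = memory[addr]
    return caller


# Hypothetical stack: a single 8-byte slot holding a return address.
memory = {0x1008: 0x2078}

# A frame that never set up rbp; rbp holds the junk value seen in the log.
regs = {"rsp": 0x1000, "rbp": 0xFFFFFFFFFFFFFFFF}

# Row from assembly profiling: CFA=rsp+16, rip saved at CFA-8.
asm_result = unwind_with_row(regs, memory, "rsp", 16, {"rip": -8})

# x86_64 default row: CFA=rbp+16, rbp saved at CFA-16, rip saved at CFA-8.
default_result = unwind_with_row(
    regs, memory, "rbp", 16, {"rbp": -16, "rip": -8}
)

print(asm_result)      # caller's sp and pc recovered from the rsp-based row
print(default_result)  # None: rbp is not a frame pointer in this frame
```

With the rsp-based row the caller's pc is found one slot below the CFA.
With the rbp-based row the CFA is computed from a junk rbp, so the saved
register slots are unreadable and the walk ends, which is consistent with
the "could not get pc value" message above.
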
>
> Can you try rolling back r219772 and seeing if that helps?  I suspect lldb
> may be slowly stripping off the last frame of the unwind for each thread as
> it progresses.
>
> J
>
> PS- "bt all" works just as well as "thread backtrace all".
>
>
> >
> > There's a log here, thread 6 is the one with the complete backtrace.
> https://gist.github.com/ribrdb/386fb0e555e82483d21d
> >
> > Comparing thread 7 with thread 6, things seem fine up to line 627:
> >     th7/fr4 supplying caller's stack pointer (7) value, computed from CFA
> >      th7/fr5 sp = 0x000000020809ffc8
> >      th7/fr5 active row: 0x0000000000002050: CFA=rbp+16 => rbp=[rbp]
> rsp=rbp+16 rip=[rbp+8]
> >
> > While thread 6 has:
> >      th6/fr4 supplying caller's stack pointer (7) value, computed from
> CFA
> >      th6/fr5 sp = 0x000000020809f7c8
> >      th6/fr5 active row: 0x000000000000206a: CFA=rsp+16 => rsp=rsp+16
> rip=[rsp+8]
> >
> > I don't know where rbp came from; it's not in the function at all:
> > 0x2050 <main.okread>: movq   %gs:0x8a0, %rcx
> > 0x2059 <main.okread+9>: cmpq   0x10(%rcx), %rsp
> > 0x205d <main.okread+13>: ja     0x2066                    ; main.okread
> + 22 at test.go:9
> > 0x205f <main.okread+15>: callq  0x2d510                   ;
> runtime.morestack_noctxt at asm_amd64.s:330
> > 0x2064 <main.okread+20>: jmp    0x2050                    ; main.okread
> at test.go:9
> > 0x2066 <main.okread+22>: subq   $0x8, %rsp
> > 0x206a <main.okread+26>: movq   0x10(%rsp), %rbx
> > 0x206f <main.okread+31>: movq   %rbx, (%rsp)
> > 0x2073 <main.okread+35>: callq  0x2000                    ; main.doread
> at test.go:5
> > 0x2078 <main.okread+40>: addq   $0x8, %rsp
> > 0x207c <main.okread+44>: retq
> > 0x207d <main.okread+45>: addb   %al, (%rax)
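
As a rough illustration of what assembly profiling should produce for this
function, the Python sketch below tracks how far the CFA is from rsp as the
stack pointer is adjusted, then looks up the active row for a pc. It is a
simplification and not lldb's UnwindAssembly code: the instruction list is
transcribed by hand from the disassembly above, only the subq/addq on rsp
are modelled, and the function names are made up.

```python
import bisect

# (address, next_address, mnemonic, immediate) for the instructions in
# main.okread that change rsp, plus the final retq. Everything else in the
# function leaves rsp unchanged on return and is omitted.
INSNS = [
    (0x2066, 0x206A, "subq", 0x8),  # subq $0x8, %rsp
    (0x2078, 0x207C, "addq", 0x8),  # addq $0x8, %rsp
    (0x207C, 0x207D, "retq", 0),
]


def build_rows(func_start, insns):
    """Return [(start_address, cfa_offset_from_rsp)] sorted by address."""
    offset = 8  # at entry only the return address has been pushed
    rows = [(func_start, offset)]
    for _addr, next_addr, mnemonic, imm in insns:
        if mnemonic == "subq":
            offset += imm
        elif mnemonic == "addq":
            offset -= imm
        else:
            continue
        # The new offset applies once the instruction has executed.
        rows.append((next_addr, offset))
    return rows


def active_row(rows, pc):
    """Pick the last row whose start address is <= pc."""
    starts = [start for start, _ in rows]
    return rows[bisect.bisect_right(starts, pc) - 1]


rows = build_rows(0x2050, INSNS)
for start, off in rows:
    print(f"{start:#x}: CFA=rsp+{off}")

start, off = active_row(rows, 0x2078)
print(f"active row for pc 0x2078: {start:#x}: CFA=rsp+{off}")
```

For pc 0x2078 this selects the row starting at 0x206a with CFA=rsp+16,
which matches the active row reported for thread 6. No row involves rbp.
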
> >
> >
> >
> >
> >
> > -- Ryan Brown
> >
> > On Wed, Oct 15, 2014 at 11:48 AM, Ryan Brown <ribrdb at google.com> wrote:
> > Yes, I'm writing a class to do that now. It's just not supported by any
> of the existing register contexts.
> >
> > -- Ryan Brown
> >
> > On Wed, Oct 15, 2014 at 11:37 AM, Jason Molenda <jason at molenda.com>
> wrote:
> > Can't your OS plugin for the goroutines use the same sp and ip register
> numbers as x86_64 (instead of 0 and 1 like you might be using right now)
> when it reports them to lldb, and return all the other registers as
> "unavailable" if they're requested?
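
A minimal sketch of that idea in plain Python, not the lldb plugin API:
the class and method names are hypothetical. The register numbers are the
ones that show up in the log in this thread (stack pointer 7, pc 16, and
6 for what is presumably rbp); the register count is an assumption.

```python
# Register numbers as they appear in the unwind log in this thread.
RBP, RSP, RIP = 6, 7, 16
NUM_REGS = 17  # rax..r15 plus rip; an assumption made for this sketch


class GoroutineRegisters:
    """Hypothetical register source for a goroutine that saved only sp/pc."""

    def __init__(self, sp, pc):
        # Use the same numbers a real x86_64 thread would use, so an
        # unwinder asking for register 7 or 16 gets a sensible answer.
        self._values = {RSP: sp, RIP: pc}

    def read(self, regnum):
        """Return the register value, or None if it is unavailable."""
        if not 0 <= regnum < NUM_REGS:
            raise IndexError(f"no such register: {regnum}")
        return self._values.get(regnum)


# Illustrative values only.
goroutine = GoroutineRegisters(sp=0x1000, pc=0xDAEF)
print(goroutine.read(RSP))  # the saved stack pointer
print(goroutine.read(RBP))  # None: not saved for a goroutine
```

The point of the design is that every valid register number is accepted,
and anything other than sp and pc is reported as missing rather than being
renumbered or left out of the register set.
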
> >
> > The tricky bit about living on eh_frame / debug_frame is that lldb
> doesn't know what kind of unwind info it is being given.  Is it just for
> exception handling locations?  Does it contain prologue setup?  epilogue?
> Is it fully asynchronous - giving unwind details at all locations?  There
> aren't any flags in eh_frame/debug_frame that could give us a hint about
> what we're working with.
> >
> >
> >
> > On Oct 15, 2014, at 11:24 AM, Ryan Brown <ribrdb at google.com> wrote:
> >
> > > I'm actually struggling with this right now. I'm trying to implement
> an OS plugin so goroutines show up as threads.
> > > > The Go compiler puts instruction-accurate unwind info into
> .debug_frame; I'm not sure what (if anything) goes into eh_frame.
> > > > However, lldb uses the disassembly instead of the DWARF info. The x86
> unwinder assumes that all threads have the same LLDB register numbers, but
> other parts of the code require that the LLDB register number is < (number
> of registers). Goroutines only store sp and ip, so it seems I'm going to
> have to create a custom RegisterContext subclass to get the existing
> unwinder to work for goroutines.
> > >
> > > On Tue, Oct 14, 2014 at 5:51 PM, Jason Molenda <jmolenda at apple.com>
> wrote:
> > > > > On Oct 13, 2014, at 9:55 AM, Greg Clayton <gclayton at apple.com> wrote:
> > > > >
> > > > >> On Oct 10, 2014, at 1:58 PM, Francois Pichet <pichet2000 at gmail.com> wrote:
> > > > >>
> > > > >> On Fri, Oct 10, 2014 at 4:20 PM, Greg Clayton <gclayton at apple.com> wrote:
> > > > >>
> > > > >>> On Oct 10, 2014, at 1:05 PM, Philippe Lavoie <philippe.lavoie at octasic.com> wrote:
> > > > >>>
> > > > >>> Hi,
> > > > >>>
> > > > >>> I noticed that by default lldb does not read the .debug_frame section to unwind frames but relies instead on .eh_frame.
> > > > >>>
> > > > >>> Is there a way to fall back to reading .debug_frame?
> > > > >>
> > > > >> Not currently. Most compilers (gcc _and_ clang) put the same old stuff in .debug_frame as they do in .eh_frame, so we haven't had to use .debug_frame over .eh_frame yet. What compiler are you using that is putting different (more complete) info in .debug_frame vs .eh_frame?
> > > > >>
> > > > >> What about a C or C++ program compiled with -fno-exceptions?
> > > > >> They will fall back to the UnwindAssembly way even if the .debug_frame is present, right?
> > > > >
> > > > > If no EH frame exists for a frame, then we will always fall back to UnwindAssembly. We always use UnwindAssembly for the first frame and for any frame that is past an async interrupt (sigtramp). We use the EH frame/.debug_frame for any non-zero frames, but will always use UnwindAssembly if there is no such info.
> > >
> > > I want to expand on what Greg said earlier about eh_frame versus
> debug_frame.
> > >
> > > Ideally, eh_frame will be the minimal unwind instructions necessary to
> unwind the stack when exceptions are thrown/caught.  eh_frame will not
> include unwind instructions for the prologue instructions or epilogue
> instructions -- because we can't throw an exception there, or have an
> exception thrown from a called function "below" us on the stack.  We call
> these unwind instructions "synchronous" because they only describe the
> unwind state from a small set of locations.
> > >
> > > debug_frame would describe how to unwind the stack at every
> instruction location.  Every instruction of the prologue and epilogue.  If
> the code is built without a frame pointer, then it would have unwind
> instructions at every place where the stack pointer is modified.  We
> describe these unwind instructions as "asynchronous" because they describe
> the unwind state at every instruction location.
> > >
> > >
> > > Instead what we have with gcc and clang is eh_frame instructions that
> describe the prologue (and, in some versions of gcc, the epilogue) plus the
> unwind state at synchronous unwind locations (where an exception can be
> thrown).  We have a half-way blend of asynchronous and synchronous ... it's
> "pretty good" but not "guaranteed" from a debugger's perspective.  It would
> be great if eh_frame was genuinely only the unwind instructions for
> exception handling and debug_frame had the full unwind state at every
> instruction and we could depend on debug_frame.  But in reality, the same
> unwind instructions are put in both eh_frame and debug_frame -- so there's
> little point in ever reading debug_frame.  lldb does not read debug_frame
> today, although it would be easy to do so.
> > >
> > >
> > > As an experiment starting late August (r216406), lldb is now trying to
> use eh_frame for the currently-executing frame.  Even though it isn't
> *guaranteed* to be accurate at all instructions, in practice it's pretty
> good -- good enough that gdb seems to be able to live on it.  Tong Shen's
> patch in r216406 does augment the eh_frame unwind instructions with the
> epilogue unwind... newer gcc's apparently describe the epilogue in eh_frame
> but few other compilers do.
> > >
> > > It's an open question how well living off eh_frame unwind instructions
> will work with a non-gcc/non-clang compiler.  That's why I say this is an
> "experiment" - we may have to revert to lldb's UnwindAssembly profiling
> code for the currently-executing function if this breaks with other
> compilers.
> > >
> > > J
> > >
> > >
> > > _______________________________________________
> > > lldb-dev mailing list
> > > lldb-dev at cs.uiuc.edu
> > > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> >
> >
> >
> > _______________________________________________
> > lldb-dev mailing list
> > lldb-dev at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>
>