[lldb-dev] regarding [Bug 15671] New: backtrace truncated after assertion failure in inferior
jason at molenda.com
Sun Apr 7 15:01:38 PDT 2013
I see what's going on here.
/lib/x86_64-linux-gnu/libc.so.6 was built with -fomit-frame-pointer, and it includes eh_frame instructions on how to unwind the frames. But when lldb gets to
#2 0x00007ffff7a4a0ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
it doesn't have any eh_frame instructions. lldb can figure out the stack pointer value (from frame 1), which tells us the "bottom" of this stack frame, but it can't find the "top" without either eh_frame unwind instructions or knowing what function it is in, so that it can do an assembly instruction scan to understand how the stack frame was set up. lldb tries to get a saved frame pointer (rbp), which would give us the "top" of the stack frame, but the saved rbp value it gets (0x40067e0) is obviously invalid.
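To illustrate why a bad saved rbp stops the unwind, here is a rough Python sketch of a frame-pointer walk. It is not lldb's implementation; the function name, the stack bounds and the memory contents are all made up for the example, and it only applies to code that keeps rbp as a frame pointer (which a -fomit-frame-pointer libc does not).

```python
def walk_frame_pointers(read_u64, rbp, stack_low, stack_high, max_frames=64):
    """Follow the saved-rbp chain; return a list of (cfa, return_address)."""
    frames = []
    while len(frames) < max_frames:
        # A usable frame pointer must be 8-byte aligned and point into the stack.
        if rbp % 8 != 0 or not (stack_low <= rbp < stack_high):
            break
        saved_rbp = read_u64(rbp)        # caller's rbp, pushed by the prologue
        return_addr = read_u64(rbp + 8)  # pushed by the call instruction
        frames.append((rbp + 16, return_addr))
        # The stack grows down, so the caller's frame must sit at a higher address.
        if saved_rbp <= rbp:
            break
        rbp = saved_rbp
    return frames


# Hypothetical stack contents: the second frame holds a bogus saved rbp.
memory = {
    0x7ffc1000: 0x7ffc1040, 0x7ffc1008: 0x400500,
    0x7ffc1040: 0x40067e0,  0x7ffc1048: 0x400600,
}
frames = walk_frame_pointers(memory.__getitem__, 0x7ffc1000, 0x7ffc0000, 0x7ffc2000)
```

The walk ends at the second frame because the saved value there, 0x40067e0, is far below the stack and cannot be the start of a caller's frame.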
It might be interesting to see the output of
image show-unwind -n abort
to see exactly what the eh_frame instructions say. (This is lldb's interpretation of the eh_frame instructions, of course; it might also be useful to include the output of readelf -wf libc.so.6 or readelf -wF libc.so.6 for the abort() function, going by a readelf web page I found.) The log output included this:
th1/fr0 supplying caller's saved reg 16's location, cached
th1/fr1 requested caller's saved PC but this UnwindPlan uses a RA reg; getting reg 16 instead
th1/fr1 supplying caller's saved reg 16's location using eh_frame CFI UnwindPlan
th1/fr1 supplying caller's register 16 from the stack, saved at CFA plus offset
th1/fr2 pc = 0x00007f216e4850ee
That bit about "this UnwindPlan uses a RA reg" is novel for x86 code; you normally see it in arm code, where the caller's saved pc value is in the link register on a function call. But as you'd guess from the name abort(), this function may have the caller's register context saved in an unusual way, so this may be fine.
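To make the "saved at CFA plus offset" log line concrete, here is a rough Python sketch of how one CFI row can be applied. The UnwindRow type, the function name and the sample addresses are made up for illustration; the DWARF register numbers (7 = rsp, 16 = return address) are the standard x86-64 ones. This is not lldb's code, and the row shown is the usual function-entry rule, not what abort() actually contains.

```python
from collections import namedtuple

DWARF_RSP = 7   # x86-64 DWARF register number for rsp
DWARF_RA = 16   # x86-64 DWARF return-address column

# cfa_reg/cfa_offset define the CFA; saved maps a register to its offset from the CFA.
UnwindRow = namedtuple("UnwindRow", ["cfa_reg", "cfa_offset", "saved"])


def unwind_one_frame(row, regs, read_u64):
    """Apply one CFI row to this frame's registers; return (cfa, caller_regs)."""
    cfa = regs[row.cfa_reg] + row.cfa_offset
    caller = {reg: read_u64(cfa + off) for reg, off in row.saved.items()}
    caller[DWARF_RSP] = cfa  # by definition the caller's rsp equals the CFA
    return cfa, caller


# Typical rule at the first instruction of a function: CFA = rsp + 8, RA at CFA - 8.
entry_row = UnwindRow(cfa_reg=DWARF_RSP, cfa_offset=8, saved={DWARF_RA: -8})
memory = {0x7ffc1000: 0x4005d4}  # hypothetical return address on the stack
cfa, caller = unwind_one_frame(entry_row, {DWARF_RSP: 0x7ffc1000}, memory.__getitem__)
```

The caller's pc is then whatever was read for register 16, which is what the "getting reg 16 instead" message in the log refers to.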
I'm surprised gdb can unwind this successfully.
As I alluded to above, lldb can profile the assembly language instructions of a function to understand the prologue setup (where registers are saved, how the stack is set up, etc.) -- but to do this, it needs to know the start address of the function. This "#2 0x00007ffff7a4a0ee in ?? ()" frame clearly doesn't have any symbolic information with its address range so lldb can't do its assembly scan. And it doesn't have eh_frame instructions to help either.
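As a small illustration of why the function start matters, here is a hedged Python sketch of looking up the function that contains a pc, given a sorted list of known function start addresses. The function name and the addresses are hypothetical and this is not how lldb's symbol table is implemented. If the lookup finds nothing, there is no place to begin the prologue scan.

```python
import bisect


def containing_function(sorted_starts, pc):
    """Return (start, end) of the function containing pc, or None.

    end is the next function's start, or None for the last known function.
    """
    i = bisect.bisect_right(sorted_starts, pc) - 1
    if i < 0:
        return None  # pc is below every known function start
    start = sorted_starts[i]
    end = sorted_starts[i + 1] if i + 1 < len(sorted_starts) else None
    return start, end


starts = [0x1000, 0x1040, 0x10A0]  # hypothetical function start addresses
```

With only start addresses, the range of the last function is open-ended, so a pc past it still matches; real symbol tables also need sizes or section bounds to rule that out.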
On Mac OS X we're often working with binaries that have had most of their symbols stripped. Because it is so valuable to lldb to have accurate function ranges, we supplement the symbol table with two sources: the LC_FUNCTION_STARTS section, and barring that (this is new), the eh_frame section. LC_FUNCTION_STARTS is an array of LEB128 encoded offsets of all the start addresses of the functions in the file. The first function is at offset 0, etc. It's real compact, typically a few bytes per function. The eh_frame section is another great source of function bounds information, but it tends to be larger and slower to parse through. lldb adds fake symbol names for the function ranges it finds this way, e.g. a fake symbol added to the program Dock might be "__lldb_unnamed_function3491$$Dock".
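For a sense of how little work the LC_FUNCTION_STARTS data takes to read, here is a rough Python sketch of decoding it. It assumes the payload is a run of unsigned LEB128 deltas, each added to the previous address starting from a base address, with a zero ending the list; that layout, the function names, the sample bytes and the base address are my assumptions for the example, not something taken from lldb's source.

```python
def decode_uleb128(data, pos):
    """Decode one unsigned LEB128 value at data[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if (byte & 0x80) == 0:            # high bit clear marks the last byte
            return result, pos
        shift += 7


def function_starts(data, base_address):
    """Turn a ULEB128 delta stream into absolute function start addresses."""
    addresses = []
    addr = base_address
    pos = 0
    while pos < len(data):
        delta, pos = decode_uleb128(data, pos)
        if delta == 0:  # assumed terminator (padding)
            break
        addr += delta
        addresses.append(addr)
    return addresses


def fake_symbol_name(index, module_name):
    """Build a placeholder name in the style quoted above."""
    return "__lldb_unnamed_function%d$$%s" % (index, module_name)


sample = bytes([0x90, 0x1E, 0x20, 0x80, 0x01, 0x00])  # made-up payload
starts = function_starts(sample, 0x100000000)
```

Six bytes describe three functions here, which matches the "few bytes per function" figure.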
Of course, given that lldb couldn't find eh_frame instructions for "#2 0x00007ffff7a4a0ee in ?? ()", maybe even that wouldn't have helped.
The only solution I can think of here is if abort()'s eh_frame does provide a saved location for rbp but lldb failed to read it correctly. Else, I have no idea how gdb managed to unwind out of this one.
On Apr 7, 2013, at 5:46 AM, Langmuir, Ben wrote:
> -----Original Message-----
> From: Jason Molenda [mailto:jason at molenda.com]
> Sent: Sunday, April 07, 2013 5:50 AM
> To: Langmuir, Ben
> Subject: regarding [Bug 15671] New: backtrace truncated after assertion failure in inferior
> I don't know if I have a bugzilla account on llvm.org (I should but I don't know what password it might have) but I wanted to ask you to do
> (lldb) log enable lldb unwind
> (lldb) run
> (lldb) bt
> and attach that output to http://llvm.org/bugs/show_bug.cgi?id=15671
> lldb should use a DefaultUnwindPlan for frame 2 ("?? ()" in gdb's backtrace) to continue the unwind. I don't have linux installed on any devices so I haven't looked but the output will probably be a good clue as to why the unwind stopped early.