[lldb-dev] regarding [Bug 15671] New: backtrace truncated after assertion failure in inferior
Thirumurthi, Ashok
ashok.thirumurthi at intel.com
Mon May 27 14:09:18 PDT 2013
Hi Jason,
This thread is still relevant: the problem is reproducible with the functionalities/inferior-asserting test on platforms where libc.so is compiled with -fomit-frame-pointer.
>>> The only solution I can think of here is if abort()'s eh_frame does provide a saved location for rbp but lldb failed to read it correctly. Else, I have no idea how gdb managed to unwind out of this one.
FYI, the routine RegisterContextLLDB::InitializeNonZerothFrame calls ReadGPRValue for active_row->GetCFARegister(), which allows m_cfa to be set for frame 1 'abort'. When this routine runs for the mystery frame 2, m_sym_ctx.GetAddressRange comes up empty-handed (consistent with gdb's backtrace, which shows "??" for that frame), so addr_range.GetBaseAddress() is not valid. As a result, m_current_offset is -1, and the routine returns before m_cfa is read, resulting in an invalid frame.
> But in this particular backtrace we've got -fomit-frame-pointer frames using eh_frame, then one function that doesn't have any symbol name or eh_frame entry, and I honestly have no idea how gdb found its way out of that one.
Even if the function for frame 2 doesn't have a symbol name, is it possible that it has an eh_frame entry that we can use?
>>> The only reasonable approach here would be to assume that this frame used a frame pointer (rbp), grab the saved rbp value and try to find the caller's pc based on that -- but that failed.
So, I see the code that executes to handle the case where a function ends with a call instruction, which backs up the PC by one byte. However, ResolveSymbolContextForAddress fails, and SymbolContext::GetAddressRange comes up empty handed because the member function is 0, so addr_range is not set by this code.
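To illustrate why the PC is backed up by one byte before the symbol lookup, here is a minimal Python sketch. It is not lldb code: the symbol table, the addresses, and the helper names lookup and lookup_return_address are all hypothetical. It only shows that a return address taken from a function whose last instruction is a call (such as a noreturn path in abort) falls one byte past that function's range.

```python
import bisect

# Hypothetical symbol table: (start, end_exclusive, name), sorted by start.
SYMBOLS = [
    (0x1000, 0x1040, "raise"),
    (0x1040, 0x10F0, "abort"),          # assume abort ends with a call
    (0x10F0, 0x1200, "next_function"),
]
STARTS = [start for start, _, _ in SYMBOLS]


def lookup(addr):
    """Return the name of the function whose range contains addr, or None."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        return None
    start, end, name = SYMBOLS[i]
    return name if start <= addr < end else None


def lookup_return_address(return_pc):
    """Resolve a return address to the function that made the call.

    The return address points at the instruction after the call, so when
    the call is the last instruction of a function the address lies just
    past that function. Backing up one byte lands inside the call itself.
    """
    return lookup(return_pc - 1)
```

With this table, looking up 0x10F0 directly resolves to the following function, while looking up 0x10F0 - 1 resolves to abort. When no symbol covers the address at all, as with frame 2 here, both lookups come back empty.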
Without a function symbol, is there a way to set m_current_offset so that ReadGPRValue can read the saved rbp for frame 2? Thanks,
- Ashok
-----Original Message-----
From: lldb-dev-bounces at cs.uiuc.edu [mailto:lldb-dev-bounces at cs.uiuc.edu] On Behalf Of Langmuir, Ben
Sent: Monday, April 08, 2013 10:12 AM
To: Luddy Harrison; Jason Molenda
Cc: lldb-dev at cs.uiuc.edu
Subject: Re: [lldb-dev] regarding [Bug 15671] New: backtrace truncated after assertion failure in inferior
I've updated bugzilla with the output of image show-unwind -n abort. I couldn't attach the output of readelf -wf libc.so.6 (too big) - is there a way to only show info about the abort function? The name 'abort' doesn't appear in the output.
Ben
-----Original Message-----
From: Luddy Harrison [mailto:luddy.harrison at gmail.com]
Sent: Monday, April 08, 2013 6:18 AM
To: Jason Molenda
Cc: Langmuir, Ben; lldb-dev at cs.uiuc.edu
Subject: Re: [lldb-dev] regarding [Bug 15671] New: backtrace truncated after assertion failure in inferior
hi, just to clarify, I regularly write asm with no eh frames or function bounds, no .cfi. gdb unwinds my leaf functions fine. it is my impression that gdb will, in the absence of frame info, assume that the topmost item on the stack at a trap is a return pc (even though the trapped pc cannot be identified and has invalid rbp, so disasm of the leaf itself is not possible).
put differently, if one can't figure out the leaf, one can grope for the return pc on the stack and try again at the caller. if the return pc points just after a plausible-looking call insn then you're good. hope that makes sense...
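A rough Python sketch of the "does this point just after a plausible-looking call insn" check follows. It is a loose heuristic rather than a real x86-64 decoder and is not taken from gdb or lldb; the function names and the byte buffers passed in are assumptions for illustration. It recognizes the direct near call (opcode E8 with a 32-bit displacement) and the common lengths of the indirect call (opcode FF with ModRM reg field 2), and it can report false positives.

```python
import struct


def looks_like_return_address(code, base, addr):
    """Heuristically check whether addr directly follows a call instruction.

    code is a bytes object holding the text mapped at address base.
    """
    off = addr - base
    if off < 0 or off > len(code):
        return False
    # Direct near call: E8 plus a 32-bit displacement, 5 bytes in total.
    if off >= 5 and code[off - 5] == 0xE8:
        return True
    # Indirect call: opcode FF with ModRM reg field == 2 (FF /2).
    # Only the common encodings of 2, 3, 6 or 7 bytes are tried.
    for length in (2, 3, 6, 7):
        if off >= length and code[off - length] == 0xFF:
            modrm = code[off - length + 1]
            if (modrm >> 3) & 0x7 == 2:
                return True
    return False


def guess_return_pc(stack, code, base):
    """Treat the word at the top of the stack as a return pc if plausible.

    stack is a bytes object whose first 8 bytes are the word at rsp.
    Returns the candidate address, or None if it does not follow a call.
    """
    (candidate,) = struct.unpack_from("<Q", stack, 0)
    if looks_like_return_address(code, base, candidate):
        return candidate
    return None
```

If the candidate passes, the unwinder can retry at the caller with that pc; if not, it has to fall back to something else or stop.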
Sent from my iPhone
On 8 Apr, 2013, at 17:43, Jason Molenda <jason at molenda.com> wrote:
> Yeah, lldb uses similar tricks. If you have eh_frame instructions, unwinding from -fomit-frame-pointer code is easy. And if you have accurate function bounds for all the frames, lldb can usually manage to unwind an -fomit-frame-pointer stack without eh_frame (because it inspects the actual assembly instructions in the prologue to understand the stack setup). But in this particular backtrace we've got -fomit-frame-pointer frames using eh_frame, then one function that doesn't have any symbol name or eh_frame entry, and I honestly have no idea how gdb found its way out of that one. The only reasonable approach here would be to assume that this frame used a frame pointer (rbp), grab the saved rbp value and try to find the caller's pc based on that -- but that failed.
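The frame-pointer fallback described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not lldb's DefaultUnwindPlan code: it assumes the conventional prologue (push rbp; mov rbp, rsp), so that [rbp] holds the caller's rbp and [rbp + 8] holds the return address, and it works on a hypothetical snapshot of stack memory. The helper names are made up for the example.

```python
import struct


def read_u64(memory, base, addr):
    """Read a little-endian 64-bit word from a stack snapshot.

    memory holds the bytes mapped at address base. Returns None when
    addr falls outside the snapshot.
    """
    off = addr - base
    if off < 0 or off + 8 > len(memory):
        return None
    return struct.unpack_from("<Q", memory, off)[0]


def walk_frame_pointers(memory, base, rbp, max_frames=64):
    """Return the list of return addresses found by following saved rbp."""
    pcs = []
    for _ in range(max_frames):
        saved_rbp = read_u64(memory, base, rbp)
        return_pc = read_u64(memory, base, rbp + 8)
        if saved_rbp is None or return_pc is None:
            break  # rbp does not point into the stack: give up
        pcs.append(return_pc)
        if saved_rbp <= rbp:
            break  # the stack grows down, so a caller's frame must be higher
        rbp = saved_rbp
    return pcs
```

A bogus rbp such as the 0x40067e0 value seen in this bug falls outside the stack snapshot, so the walk stops immediately with no frames, which matches the truncated backtrace.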
>
> Well, maybe the additional information from Ben (the eh_frame instructions for abort() most importantly) will provide a hint. The only thing I can think is that maybe lldb misinterpreted that function's eh_frame instructions.
>
> J
>
>
> On Apr 8, 2013, at 1:20 AM, Luddy Harrison wrote:
>
>> having done lots of asm debugging with gdb, I can offer a guess. gdb seems to be able to unwind frameless leaf functions with no unwind info. so perhaps as a final fallback it pops the top entry on the stack and treats it as the return pc. if it can unwind the caller using that pc, then it is good.
>>
>> just a guess...
>>
>> -Luddy
>>
>> Sent from my iPhone
>>
>> On 8 Apr, 2013, at 6:01, Jason Molenda <jason at molenda.com> wrote:
>>
>>> I see what's going on here.
>>>
>>> /lib/x86_64-linux-gnu/libc.so.6 was built -fomit-frame-pointer, and it includes eh_frame instructions on how to unwind the frames. But when lldb gets to
>>>
>>> #2 0x00007ffff7a4a0ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
>>>
>>> it doesn't have any eh_frame instructions. lldb can figure out the stack pointer value (from frame 1) which tells us the "bottom" of this stack frame but it can't find the "top" without eh_frame unwind instructions or knowing what function it is in so it can do an assembly instruction scan to understand how the stack frame was set up. lldb tries to get a saved frame pointer (rbp) which would give us the "top" of the stack frame but the saved rbp value it gets (0x40067e0) is obviously invalid.
>>>
>>> It might be interesting to see the output of
>>>
>>> image show-unwind -n abort
>>>
>>> to see exactly what the eh_frame instructions read (this is lldb's interpretation of the eh_frame instructions, of course; it might also be useful to include the output of readelf -wf libc.so.6 or readelf -wF libc.so.6 for the abort() function, going by a readelf page I found on the web). The log output included this:
>>>
>>> th1/fr0 supplying caller's saved reg 16's location, cached
>>> th1/fr1 requested caller's saved PC but this UnwindPlan uses a RA reg; getting reg 16 instead
>>> th1/fr1 supplying caller's saved reg 16's location using eh_frame CFI UnwindPlan
>>> th1/fr1 supplying caller's register 16 from the stack, saved at CFA plus offset
>>> th1/fr2 pc = 0x00007f216e4850ee
>>>
>>> That bit about "this UnwindPlan uses a RA reg" is novel for x86 code; it's normally something you see in ARM code, where the caller's saved pc value is in the link register on a function call. But as you'd guess from the name abort(), this may have the caller's register context saved in an unusual way, so this may be fine.
>>>
>>> I'm surprised gdb can unwind this successfully.
>>>
>>> As I alluded to above, lldb can profile the assembly language instructions of a function to understand the prologue setup (where registers are saved, how the stack is set up, etc.) -- but to do this, it needs to know the start address of the function. This "#2 0x00007ffff7a4a0ee in ?? ()" frame clearly doesn't have any symbolic information with its address range so lldb can't do its assembly scan. And it doesn't have eh_frame instructions to help either.
>>>
>>> On Mac OS X we're often working with binaries that have had most of their symbols stripped. Because it is so valuable to lldb to have accurate function ranges, we supplement the symbol table with two sources: The LC_FUNCTION_STARTS section, and barring that (this is new), the eh_frame section. LC_FUNCTION_STARTS is an array of LEB128 encoded offsets of all the start addresses of the functions in the file. The first function is at offset 0, etc. It's real compact, typically a few bytes per function. The eh_frame section is another great source of function bounds information but it tends to be larger and slower to parse through. lldb adds fake symbol names for these function ranges that it adds, e.g. a fake symbol added to the program Dock might be "__lldb_unnamed_function3491$$Dock".
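As a rough illustration of how compact LEB128-encoded function starts are, here is a small Python decoder. It is a sketch under assumptions, not lldb's parser: it assumes the payload is a run of unsigned LEB128 deltas, the first relative to a base address and each later one relative to the previous function start, with a zero value or end of data terminating the list. The function names and the sample bytes are hypothetical.

```python
def read_uleb128(data, pos):
    """Decode one unsigned LEB128 value; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if byte & 0x80 == 0:             # high bit clear marks the last byte
            return value, pos
        shift += 7


def function_starts(data, base):
    """Turn a run of ULEB128 deltas into absolute function start addresses."""
    addrs = []
    addr = base
    pos = 0
    while pos < len(data):
        delta, pos = read_uleb128(data, pos)
        if delta == 0:
            break  # assumed terminator
        addr += delta
        addrs.append(addr)
    return addrs
```

Small deltas between neighbouring functions fit in one or two bytes each, which is why the table stays at a few bytes per function. Each start address, paired with the next one, gives an approximate function range that an unwinder can use for its assembly scan.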
>>>
>>> Of course, given that lldb couldn't find eh_frame instructions for "#2 0x00007ffff7a4a0ee in ?? ()", maybe even that wouldn't have helped.
>>>
>>>
>>> The only solution I can think of here is if abort()'s eh_frame does provide a saved location for rbp but lldb failed to read it correctly. Else, I have no idea how gdb managed to unwind out of this one.
>>>
>>>
>>> On Apr 7, 2013, at 5:46 AM, Langmuir, Ben wrote:
>>>
>>>> Done.
>>>>
>>>> -----Original Message-----
>>>> From: Jason Molenda [mailto:jason at molenda.com]
>>>> Sent: Sunday, April 07, 2013 5:50 AM
>>>> To: Langmuir, Ben
>>>> Subject: regarding [Bug 15671] New: backtrace truncated after assertion failure in inferior
>>>>
>>>> I don't know if I have a bugzilla account on llvm.org (I should but I don't know what password it might have) but I wanted to ask you to do
>>>>
>>>> (lldb) log enable lldb unwind
>>>> (lldb) run
>>>> (lldb) bt
>>>>
>>>>
>>>> and attach that output to http://llvm.org/bugs/show_bug.cgi?id=15671
>>>>
>>>> lldb should use a DefaultUnwindPlan for frame 2 ("?? ()" in gdb's backtrace) to continue the unwind. I don't have linux installed on any devices so I haven't looked but the output will probably be a good clue as to why the unwind stopped early.
>>>>
>>>>
>>>>
>>>> J
>>>
>>>
>>> _______________________________________________
>>> lldb-dev mailing list
>>> lldb-dev at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>