[lldb-dev] Problem unwinding from inside of a CRT function

Fri Jan 16 14:53:07 PST 2015

> On Jan 16, 2015, at 1:42 PM, Reid Kleckner <rnk at google.com> wrote:
> 
> On Fri, Jan 16, 2015 at 12:44 PM, <jingham at apple.com> wrote:

>> 2) If the single step pushes a frame, and we are "stepping over", lldb sets a breakpoint on the return address and continues.  When the return address is hit (for the current frame of course since it could be hit recursively) then we continue stepping as above.
> 
> Any objection to asking the target if the previous opcode is something typically used for a call (x86 call, ARM bl), single step, and then load the retaddr or link register? Is that hard to thread through? I suppose it would fire on 32-bit x86 PIC sequences (call 0 ; pop %ebx), but that won't hurt.

This is Jim's area so I should let him reply.  Jim described earlier how, when lldb is instruction stepping over an address range, it uses the disassembler to identify instructions that may branch and uses breakpoints to run between them (so we don't need to single instruction step the entire range).  This is a relatively new feature in the stepper code -- maybe added within the last year -- and it's the first time that the ThreadPlan type algorithms have had access to that knowledge.
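To make the range-stepping idea concrete, here is a minimal sketch in Python.  It is not lldb's code: the Insn type, the BRANCHY mnemonic set, and next_stop() are all made up for illustration, and a real disassembler would answer "may this branch?" per instruction rather than by name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insn:
    addr: int
    mnemonic: str


# Hypothetical list of mnemonics treated as "may branch" (illustration only).
BRANCHY = {"call", "jmp", "je", "jne", "ret", "bl", "b"}


def next_stop(insns, pc):
    """Decide the next stepping move inside an address range.

    Returns (address, action) where action is one of:
      "step"    - pc is on a branch instruction; single-instruction step it
      "run_to"  - set a breakpoint on the next branch in the range and run
      "run_out" - no branch left in the range; run to the end of the range
    """
    remaining = [i for i in insns if i.addr >= pc]
    if not remaining:
        return (None, "run_out")
    first = remaining[0]
    if first.addr == pc and first.mnemonic in BRANCHY:
        return (pc, "step")
    for insn in remaining:
        if insn.mnemonic in BRANCHY:
            return (insn.addr, "run_to")
    return (None, "run_out")


insns = [Insn(0x100, "mov"), Insn(0x103, "add"),
         Insn(0x106, "call"), Insn(0x10B, "mov")]
print(next_stop(insns, 0x100))
print(next_stop(insns, 0x106))
```

Only the instruction at the branch is single-stepped; the straight-line instructions before it run at full speed up to the breakpoint.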

Now that we have a disassembler at our disposal, it is reasonable to ask if the disassembler could flag function call instructions and the ThreadPlan could know to single instruction step and provide a hint to the unwinder "Hey, we just stepped into a new function, we haven't executed any instructions in it yet".

The ABI provides a special unwind plan for exactly these scenarios -- CreateFunctionEntryUnwindPlan() -- so all the pieces are likely available.  It's a matter of plumbing it all together across the layers.
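As a rough illustration of the kind of rule a function-entry unwind plan encodes, here is a sketch for x86-64 in Python.  The function name and the dict-based register/memory model are assumptions for the example, not lldb API.  The only fact it relies on is that the call instruction has pushed an 8-byte return address and the callee has not yet executed anything.

```python
def unwind_at_function_entry(regs, memory):
    """Recover the caller's registers when stopped on a callee's first instruction.

    regs   - dict of the current register values (needs "rsp")
    memory - dict mapping addresses to 8-byte values (toy stand-in for process memory)
    """
    cfa = regs["rsp"] + 8                 # call pushed the return address
    caller = dict(regs)                   # nothing saved yet: other registers are unchanged
    caller["rip"] = memory[regs["rsp"]]   # return address is at the top of the stack
    caller["rsp"] = cfa                   # caller's rsp as it was before the call
    return caller


regs = {"rip": 0x2000, "rsp": 0x7FF0, "rbp": 0x8000}
memory = {0x7FF0: 0x1005}
caller = unwind_at_function_entry(regs, memory)
print(hex(caller["rip"]), hex(caller["rsp"]), hex(caller["rbp"]))
```

Note that rbp is passed through untouched: at this point it still belongs to the caller, which is exactly why walking the frame-pointer chain from here would skip a frame.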

The reason this hasn't been necessary to date is that, on all of the platforms lldb operates on, it either has the addresses of all the functions and stubs/trampolines (CRT functions, PLT routines) in the binaries, or it has unwind information (eh_frame instructions) that tells it how to unwind from address ranges even if it doesn't necessarily know the start addresses of the functions.

This makes adding call/bl knowledge to the ThreadPlans a lot less pressing - it's not fixing a problem that any of us are seeing today.

And even if we do the work of identifying the "step into a new function" sequence during stepping, we STILL need to be able to unwind from arbitrary stop locations in your process accurately.  You may interrupt the process at any instruction location -- or the program may crash at nearly any instruction location -- and you need to be able to backtrace out of there.  Even if there are functions in the middle of the stack that don't use the frame pointer.  Even if you're sitting at the first instruction of a CRT function or you're in the middle of a frameless leaf function.

In my opinion, expending a lot of energy on making the ThreadPlans know how to unwind from the first instruction is ignoring the real problem of being able to unwind accurately from all instruction locations.  It's not worth doing.  Make the unwinder work from any location on your platform - if it can't, that's the problem that needs to be fixed.  I agree it would be interesting if the ThreadPlans could tell the Unwinder that they have just stepped into a function, for even better reliability in a particularly tricky unwind scenario.  But it's not a panacea: if that's the only thing you fix, and you rely on "walk the frame chain on the stack" to backtrace, you're going to have a horrible debugger experience.  Even if you can accurately walk the stack you won't get register save locations, for instance, so when the debug info says a variable is stored in rbx in the middle of the stack, and rbx was saved by the callee function to the stack, you won't be able to retrieve it for the user.
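The rbx case can be sketched like this, again in Python with made-up names (caller_register and save_offsets are hypothetical, not lldb API).  It assumes the unwind information for the callee gives a CFA and, per saved register, an offset from that CFA; a plain frame-chain walk has no equivalent of save_offsets and so can only hand back pc and the frame pointer.

```python
def caller_register(reg, callee_regs, cfa, save_offsets, memory):
    """Value `reg` had in the caller, given the callee's unwind rule.

    save_offsets - dict mapping register name to its save slot offset from the CFA
    memory       - dict mapping addresses to values (toy stand-in for process memory)
    """
    if reg in save_offsets:
        # The callee spilled the caller's value to its stack frame.
        return memory[cfa + save_offsets[reg]]
    # Not spilled: for a callee-saved register the value is still live.
    return callee_regs[reg]


callee_regs = {"rbx": 0xDEAD, "r12": 7}   # rbx has been reused by the callee
cfa = 0x7000
save_offsets = {"rbx": -24}               # caller's rbx saved at CFA-24
memory = {0x7000 - 24: 42}

print(caller_register("rbx", callee_regs, cfa, save_offsets, memory))
print(caller_register("r12", callee_regs, cfa, save_offsets, memory))
```

Without the save rule the debugger would show the callee's current rbx for the caller's variable, which is simply the wrong value.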

J