[Lldb-commits] [PATCH] D124198: [LLDB][Unwind] Add stack scanning as fallback unwind plan if no symbol file is available.

Fri Apr 22 09:56:40 PDT 2022

clayborg added a comment.

In D124198#3466889 <https://reviews.llvm.org/D124198#3466889>, @labath wrote:

> The patch description could definitely use more details about the motivation for the change, and a description of how it works. Apart from the fact that it uses the raSearch feature I introduced a while back, I don't know much about how it works (and given the planned changes, I am not going to try to understand that), but I think I can provide some details/thoughts about the motivation.
>
> (IIUC), the motivating case is some code in kernel32.dll, which ends up setting rbp to zero. I don't know why it does that, but I suspect it may have something to do with the windows syscall convention (I know that ebp is used in i386 linux, for instance).  While it would be definitely interesting to understand why it does that, I don't think it really matters -- one can never really exclude the possibility that some code (inline asm perhaps) will be messing with the frame pointer. So, scanning the stack for return addresses as a *last* last resort (the architectural default plan is already a kind of a last resort option) seems like it could be useful, regardless of the architecture (although the actual algorithm may differ somewhat for x86, which stores the return address on the stack directly, and say arm, which uses a link register). And given that the choice is between "showing nothing" and "showing something potentially incorrect", I don't think we need to be extra careful about vetting this address. All the checks Greg mentioned seem fine, but I don't see a compelling reason to implement them (all of them), especially given that they would not be useful in this case -- all we have is a minidump core file with some stack memory and a list of loaded modules (not necessarily the modules themselves).  In this case, all we can do is to check whether the pointer points to some known code region (which is already done IIRC).

We wrote a command in Python called "sbt", and we run it when stacks are truncated after loading minidumps. It often gives us pretty bad results; you would be surprised how many things on the stack look like code pointers. So when we do have the code and can try to verify that the value on the stack is a return address, it would be great to do so to avoid the noise. And if we are in code that has no symbols and we don't know the start address of the function, it would be VERY helpful for backtracing if we _can_ determine the start address of the current frame's function, as then we can correctly parse the prologue. So adding these checks might actually get us back on track and allow us to avoid relying on the RA search. The flow would be:

- enable RA search in a frame that has no function bounds info from the object files or runtime info
- find the value on the stack that is the correct RA by locating the instruction immediately before the RA and figuring out what address it calls
- use the new function address to create a real stack frame that uses opcode parsing
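
To make the second step concrete, here is a rough Python sketch of the kind of check I mean. It is not the "sbt" command and it does not use any LLDB API; the function name `find_return_addresses`, the single code region, and the sample bytes are all made up for illustration. It also assumes x86-64 and only recognizes a direct `call rel32` (opcode 0xE8); an indirect call would still need some other handling, since its target is not encoded in the instruction.

```python
import struct

CALL_REL32_LEN = 5  # x86-64 direct call: opcode 0xE8 followed by a 32-bit displacement


def find_return_addresses(stack, code_base, code):
    """Return (stack_offset, return_address, call_target) for every 8-byte
    stack slot that points just past a direct `call rel32` in `code`.

    `stack` and `code` are bytes objects; `code` is assumed to be mapped at
    address `code_base`. All names here are hypothetical.
    """
    hits = []
    code_end = code_base + len(code)
    for off in range(0, len(stack) - 7, 8):
        (candidate,) = struct.unpack_from("<Q", stack, off)
        # Check 1: the value must point into known code, with room for a
        # 5-byte call instruction in front of it.
        if not (code_base + CALL_REL32_LEN <= candidate <= code_end):
            continue
        # Check 2: the instruction right before it must be `call rel32`.
        call_off = candidate - CALL_REL32_LEN - code_base
        if code[call_off] != 0xE8:
            continue
        # The displacement is relative to the address after the call,
        # which is the return address itself.
        (disp,) = struct.unpack_from("<i", code, call_off + 1)
        target = (candidate + disp) & 0xFFFFFFFFFFFFFFFF
        hits.append((off, candidate, target))
    return hits


# Made-up code region at 0x1000: three nops, `call 0x1020`, then more nops.
CODE_BASE = 0x1000
CODE = b"\x90" * 3 + b"\xe8" + struct.pack("<i", 0x18) + b"\x90" * 32

# Made-up stack: a non-code value, a code pointer that does not follow a
# call, and a real return address (0x1008).
STACK = struct.pack("<QQQ", 0xDEADBEEF, 0x1010, 0x1008)

HITS = find_return_addresses(STACK, CODE_BASE, CODE)
```

In this toy input only the third slot survives both checks, and the recovered call target (0x1020) is the candidate function start address that the third step would hand to the opcode-parsing unwinder.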

So there is real benefit to doing these checks IMHO, as it can help us get back on track in live debug sessions. Core file session results will vary depending on whether we actually have the original object files. Many times we do not have this info, in which case we would just enable RA search and we might be able to unwind the PC, but not other registers, if we can't actually find the prologue or if we don't have stack frame details from breakpad.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124198/new/

https://reviews.llvm.org/D124198