[Lldb-commits] [PATCH] D70840: [LLDB] [DWARF] Strip out the thumb bit from addresses on ARM

Wed Feb 12 04:50:32 PST 2020

labath added a comment.

In D70840#1871862 <https://reviews.llvm.org/D70840#1871862>, @mstorsjo wrote:

> > Now someone might argue that looking up the address of the `ldr` opcode is wrong, because that is not the actual pc value, and that we should look up the address of `ldr+1`. In that case we can point them to the next function (`call_noreturn`), which ends with a call to a noreturn function. Now if we get a crash in `does_not_return()`, the address we will get from the backtrace will be one byte after the last opcode of `call_noreturn` (`bl      does_not_return()`). This means that in the **non-windows** case, it will not resolve correctly to `call_noreturn` even if we apply the usual trick of subtracting 1 from the address to bring it into the range of the calling function.
>
> Hmm... I presume this issue is present now already (and doesn't really change due to normalizing the line tables and ranges on windows, to the same as linux). Not familiar with "the usual trick of subtracting 1 from the address to bring it into the range of the calling function", but wouldn't something like `GetOpcodeLoadAddress(pc) - 1` always bring you into the range of the calling function, given a return address?

LLDB uses this trick when unwinding, specifically to deal with the "call as the last instruction" problem, and I'm pretty sure it's not the only tool that does so. The DWARF spec even describes it (in non-normative text):

> In most cases the return address is in the same context as the calling address, but that
>  need not be the case, especially if the producer knows in some way the call never will
>  return. The context of the ’return address’ might be on a different line, in a different
>  lexical block, or past the end of the calling subroutine. If a consumer were to assume that
>  it was in the same context as the calling address, the virtual unwind might fail.
> 
> For architectures with constant-length instructions where the return address
>  immediately follows the call instruction, a simple solution is to subtract the length of an
>  instruction from the return address to obtain the calling instruction. For architectures
>  with variable-length instructions (for example, x86), this is not possible. However,
>  subtracting 1 from the return address, although not guaranteed to provide the exact
>  calling address, generally will produce an address within the same context as the calling
>  address, and that usually is sufficient.
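
As a rough illustration of the "subtract 1" trick described in the quoted text, here is a minimal sketch. The `FunctionRange` type, the `FindFunction`/`FindCaller` helpers, and the addresses used with them are hypothetical and are not LLDB's actual data structures; the sketch only assumes that function ranges are half-open intervals `[start, end)`.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical half-open address range [start, end) covering one function.
struct FunctionRange {
  std::string name;
  uint64_t start;
  uint64_t end;
};

// Returns the name of the function whose range contains addr, or an empty
// string if no range contains it.
std::string FindFunction(const std::vector<FunctionRange> &ranges,
                         uint64_t addr) {
  for (const FunctionRange &r : ranges)
    if (addr >= r.start && addr < r.end)
      return r.name;
  return "";
}

// Resolves a return address by looking up the byte just before it. This does
// not recover the exact address of the call instruction; it only lands
// somewhere inside the calling function.
std::string FindCaller(const std::vector<FunctionRange> &ranges,
                       uint64_t return_addr) {
  return FindFunction(ranges, return_addr - 1);
}
```

With made-up ranges `call_noreturn = [0x1000, 0x1008)` and `next_function = [0x1008, 0x1010)`, a return address of `0x1008` resolves to `next_function` when looked up directly, while `FindCaller` resolves it to `call_noreturn`.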

Using `GetOpcodeLoadAddress(pc) - 1` would work, but it implies that the caller should be passing in "load" addresses, which conflicts with the premise I stated at the beginning of the paragraph, namely that one should be passing in "code" addresses here. (My attempt at proof by contradiction.)
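
To make the arithmetic concrete, here is a small sketch of how the Thumb bit interacts with the "subtract 1" trick. `StripThumbBit` is a hypothetical stand-in for what `GetOpcodeLoadAddress` is assumed to do on ARM (clear bit 0), and the addresses are made up. The sketch also assumes the ranges in the debug info do not carry the Thumb bit, as in the non-Windows case discussed above.

```cpp
#include <cstdint>

// Hypothetical stand-in: clear the Thumb bit (bit 0) of an address.
constexpr uint64_t StripThumbBit(uint64_t addr) {
  return addr & ~uint64_t{1};
}

// True if addr lies in the half-open range [start, end).
constexpr bool InRange(uint64_t addr, uint64_t start, uint64_t end) {
  return addr >= start && addr < end;
}

// Assumed layout: call_noreturn occupies [0x2000, 0x2008), and its last
// instruction is the call, so the return address points at 0x2008.
constexpr uint64_t kStart = 0x2000;
constexpr uint64_t kEnd = 0x2008;

// Return address as it would appear in a backtrace, with the Thumb bit set.
constexpr uint64_t kReturnAddr = kEnd | 1;

// Subtracting 1 alone only undoes the Thumb bit and lands on the end address,
// which is outside the half-open range.
static_assert(!InRange(kReturnAddr - 1, kStart, kEnd),
              "plain pc - 1 misses the caller");

// Stripping the Thumb bit first and then subtracting 1 lands inside the range.
static_assert(InRange(StripThumbBit(kReturnAddr) - 1, kStart, kEnd),
              "stripped pc - 1 hits the caller");
```

Both `static_assert`s hold for these made-up values: `0x2009 - 1` is `0x2008`, which is past the end of the range, whereas `(0x2009 & ~1) - 1` is `0x2007`, which is inside it.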

https://reviews.llvm.org/D70840