[lldb-dev] A problem with the arm64 unwind plans I'm looking at
Tamas Berghammer via lldb-dev
lldb-dev at lists.llvm.org
Mon Nov 7 08:10:36 PST 2016
Hi Jason,
I thought about this situation when implemented the original branch
following code and haven't been able to come up with a really good solution.
My only idea is the same what you mentioned. We should try to recognize all
unconditional branches and returns (but not calls) and then if the
following instruction don't have any unwind information yet (e.g. haven't
been a branch target so far) then we try to find some reasonable unwind
info from the previous lines.
The difficult question is how to find the correct information. One possible
heuristic I have in mind is to try to find any call instruction inside the
function before the current PC and use the unwind info from there. The
reason I like this heuristic because there won't be a call instruction
inside the prologue or epilogue and on ARM based on the ABI every call
instruction have to have the same unwind info. Other possible alternative
(or if we don't have a call instruction) is to use the unwind info line
with the information about the highest number of registers. If multiple
lines have the same number of information then either use the earliest one
or the one with the fewest registers being set to IsSame to avoid picking
something from an epilogue.
I don't think any of my suggestions are really good but I don't have any
better idea at the moment.
Tamas
On Sat, Nov 5, 2016 at 3:01 AM Jason Molenda <jmolenda at apple.com> wrote:
> Hi Tamas & Pavel, I thought you might have some ideas so I wanted to show
> a problem I'm looking at right now. The arm64 instruction unwinder
> forwards the unwind state based on branch instructions within the
> function. So if one block of code ends in an epilogue, the next
> instruction (which is presumably a branch target) will have the correct
> original unwind state. This change went in to
> UnwindAssemblyInstEmulation.cpp mid-2015 in r240533 - the code it replaced
> was poorly written, we're better off with this approach.
>
> However I'm looking at a problem where clang will come up with a branch
> table for a bunch of case statements. e.g. this function:
>
> 0x100007df0 <+0>: stp x22, x21, [sp, #-0x30]!
> 0x100007df4 <+4>: stp x20, x19, [sp, #0x10]
> 0x100007df8 <+8>: stp x29, x30, [sp, #0x20]
> 0x100007dfc <+12>: add x29, sp, #0x20 ; =0x20
> 0x100007e00 <+16>: sub sp, sp, #0x10 ; =0x10
> 0x100007e04 <+20>: mov x19, x1
> 0x100007e08 <+24>: mov x20, x0
> 0x100007e0c <+28>: add w21, w20, w20, lsl #2
> 0x100007e10 <+32>: bl 0x100007f58 ; symbol stub
> for: getpid
> 0x100007e14 <+36>: add w0, w0, w21
> 0x100007e18 <+40>: mov w8, w20
> 0x100007e1c <+44>: cmp w20, #0x1d ; =0x1d
> 0x100007e20 <+48>: b.hi 0x100007e4c ; <+92> at a.c:112
> 0x100007e24 <+52>: adr x9, #0x90 ; switcher + 196
> 0x100007e28 <+56>: nop
> 0x100007e2c <+60>: ldrsw x8, [x9, x8, lsl #2]
> 0x100007e30 <+64>: add x8, x8, x9
> 0x100007e34 <+68>: br x8
> 0x100007e38 <+72>: sub sp, x29, #0x20 ; =0x20
> 0x100007e3c <+76>: ldp x29, x30, [sp, #0x20]
> 0x100007e40 <+80>: ldp x20, x19, [sp, #0x10]
> 0x100007e44 <+84>: ldp x22, x21, [sp], #0x30
> 0x100007e48 <+88>: ret
> 0x100007e4c <+92>: add w0, w0, #0x1 ; =0x1
> 0x100007e50 <+96>: b 0x100007e38 ; <+72> at a.c:115
> 0x100007e54 <+100>: orr w8, wzr, #0x7
> 0x100007e58 <+104>: str x8, [sp, #0x8]
> 0x100007e5c <+108>: sxtw x8, w19
> 0x100007e60 <+112>: str x8, [sp]
> 0x100007e64 <+116>: adr x0, #0x148 ; "%c %d\n"
> 0x100007e68 <+120>: nop
> 0x100007e6c <+124>: bl 0x100007f64 ; symbol stub
> for: printf
> 0x100007e70 <+128>: sub sp, x29, #0x20 ; =0x20
> 0x100007e74 <+132>: ldp x29, x30, [sp, #0x20]
> 0x100007e78 <+136>: ldp x20, x19, [sp, #0x10]
> 0x100007e7c <+140>: ldp x22, x21, [sp], #0x30
> 0x100007e80 <+144>: b 0x100007f38 ; f3 at b.c:4
> 0x100007e84 <+148>: sxtw x8, w19
> 0x100007e88 <+152>: str x8, [sp]
> 0x100007e8c <+156>: adr x0, #0x127 ; "%c\n"
> 0x100007e90 <+160>: nop
> 0x100007e94 <+164>: bl 0x100007f64 ; symbol stub
> for: printf
> 0x100007e98 <+168>: bl 0x100007f40 ; f4 at b.c:7
> 0x100007e9c <+172>: sxtw x8, w19
> 0x100007ea0 <+176>: str x8, [sp]
> 0x100007ea4 <+180>: adr x0, #0x10f ; "%c\n"
> 0x100007ea8 <+184>: nop
> 0x100007eac <+188>: bl 0x100007f64 ; symbol stub
> for: printf
> 0x100007eb0 <+192>: bl 0x100007f4c ; symbol stub
> for: abort
>
>
> It loads data from the jump table and branches to the correct block in the
> +52 .. +68 instructions. We have epilogues at 88, 144, and 192. And we
> get an unwind plan like
>
> row[0]: 0: CFA=sp +0 =>
> row[1]: 4: CFA=sp+48 => x21=[CFA-40] x22=[CFA-48]
> row[2]: 8: CFA=sp+48 => x19=[CFA-24] x20=[CFA-32] x21=[CFA-40]
> x22=[CFA-48]
> row[3]: 12: CFA=sp+48 => x19=[CFA-24] x20=[CFA-32] x21=[CFA-40]
> x22=[CFA-48] fp=[CFA-16] lr=[CFA-8]
> row[4]: 20: CFA=sp+64 => x19=[CFA-24] x20=[CFA-32] x21=[CFA-40]
> x22=[CFA-48] fp=[CFA-16] lr=[CFA-8]
> row[5]: 80: CFA=sp+64 => x19=[CFA-24] x20=[CFA-32] x21=[CFA-40]
> x22=[CFA-48] fp= <same> lr= <same>
> row[6]: 84: CFA=sp+64 => x19= <same> x20= <same> x21=[CFA-40]
> x22=[CFA-48] fp= <same> lr= <same>
> row[7]: 88: CFA=sp +0 => x19= <same> x20= <same> x21= <same> x22= <same>
> fp= <same> lr= <same>
> row[8]: 92: CFA=sp+64 => x19=[CFA-24] x20=[CFA-32] x21=[CFA-40]
> x22=[CFA-48] fp=[CFA-16] lr=[CFA-8]
> row[9]: 108: CFA=sp+64 => x8=[CFA-56] x19=[CFA-24] x20=[CFA-32]
> x21=[CFA-40] x22=[CFA-48] fp=[CFA-16] lr=[CFA-8]
> row[10]: 136: CFA=sp+64 => x8=[CFA-56] x19=[CFA-24] x20=[CFA-32]
> x21=[CFA-40] x22=[CFA-48] fp= <same> lr= <same>
> row[11]: 140: CFA=sp+64 => x8=[CFA-56] x19= <same> x20= <same>
> x21=[CFA-40] x22=[CFA-48] fp= <same> lr= <same>
> row[12]: 144: CFA=sp +0 => x8=[CFA-56] x19= <same> x20= <same> x21=
> <same> x22= <same> fp= <same> lr= <same>
>
> where we have no unwind state for the range 148..192 (I complicated it a
> little by calling a noreturn function that ended up being the last one --
> that's why it doesn't do an epilogue sequence at the very end of the
> function).
>
>
> I'm not sure how we should address this one - our branch-target approach
> can't do the right thing here, there is no indication (for lldb) of the
> branch from instruction +68 to +148. Should we recognize "ret" and "b"
> (with a range outside the bounds of the current function) as epilogues and
> try to find something that looks like valid unwind state from earlier in
> the function body?
>
>
> What actually happens right now (if lldb is stopped in the range 148 -
> 192) is that we find a LR with a save status of IsSame which
> RegisterContextLLDB rejects as impossible (yes, this IS actually possible -
> if we're stopped in the range of +80 - +88 then the return address is in
> the lr and the ret instruction will use it when we get there - but let's
> ignore that separate bug for now) is that we reject the unwind plan because
> of the lr being IsSame and we fall back to the ABI's architectural default
> unwind plan which does work in this case, although we won't know about the
> x19 & x20 register spills to the stack.
>
> This is a synthetic example, of course, the real place where I hit this
> issue is in some apple internal code, but it's the same instruction pattern.
>
> fwiw I reproduced it with these two source files linked together with
> clang and -Os optimization; gcc is unlikely to generate the same code.
> It's probably possible to reduce the test case further but I'll add a unit
> test of the raw instructions once we figure out how to handle this.
>
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20161107/1bab0214/attachment-0001.html>
More information about the lldb-dev
mailing list