[lldb-dev] More ARM debugging woes involving the Thumb IT (if/then) instruction...

Fri Dec 5 17:17:29 PST 2014

> On Dec 5, 2014, at 3:56 PM, Mario Zechner <badlogicgames at gmail.com> wrote:
> 
> This sounds great.  How would you get to know that an instruction will not actually be executed? LLVM's tablegen tables for ARM don't encode that. I guess Process/Thread will require ARM specific code paths?

Yes. We will install a "stop checker" callback on all ARM threads. If that callback is set, we will run it once after we get our stop info back to see if we need to clear out the stop info. We have the notion of generic registers for things like the PC, FP, SP, return address and flags. The flags register is mapped to the CPSR for all ARM targets. So we can just get the register context from the current thread, read the generic flags register, and then check the ITSTATE bits in the CPSR to verify whether the current instruction will actually execute.
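
To make the ITSTATE check concrete, here is a minimal sketch in Python (not LLDB code). The function names are hypothetical; the only things assumed are the architectural CPSR layout (N/Z/C/V in bits 31..28, IT[1:0] in bits 26..25, IT[7:2] in bits 15..10, T in bit 5) and the standard ARM condition-code table. A real stop checker would read the CPSR through the thread's register context rather than take it as an integer.

```python
def itstate_from_cpsr(cpsr):
    """Reassemble the 8-bit ITSTATE: IT[7:2] is CPSR[15:10], IT[1:0] is CPSR[26:25]."""
    return (((cpsr >> 10) & 0x3F) << 2) | ((cpsr >> 25) & 0x3)


def condition_passes(cond, cpsr):
    """Evaluate a 4-bit ARM condition code against the N, Z, C, V flags in cpsr."""
    n = (cpsr >> 31) & 1
    z = (cpsr >> 30) & 1
    c = (cpsr >> 29) & 1
    v = (cpsr >> 28) & 1
    # Indexed by cond[3:1]: EQ, CS, MI, VS, HI, GE, GT, AL
    checks = [z == 1, c == 1, n == 1, v == 1,
              c == 1 and z == 0, n == v, n == v and z == 0, True]
    result = checks[cond >> 1]
    # cond[0] inverts the test, except for 0b1111 which is still "always"
    if (cond & 1) and cond != 0xF:
        result = not result
    return result


def will_execute(cpsr):
    """True if the instruction at the current PC will really be executed."""
    in_thumb = (cpsr >> 5) & 1
    itstate = itstate_from_cpsr(cpsr)
    if not in_thumb or (itstate & 0xF) == 0:
        return True  # not inside an IT block
    return condition_passes(itstate >> 4, cpsr)


def make_cpsr(itstate, n=0, z=0, c=0, v=0):
    """Hypothetical test helper: build a Thumb-mode CPSR with the given ITSTATE and flags."""
    return ((n << 31) | (z << 30) | (c << 29) | (v << 28) |
            ((itstate & 0x3) << 25) | ((itstate >> 2) << 10) | (1 << 5))
```

If will_execute() returns False for a thread whose only stop reason is a trace or breakpoint stop, that is the case where the stop info would be cleared and the thread plan allowed to continue.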
> 
> We currently try to make this somewhat work within ThreadPlans. That seems less invasive and probably enough for our use case. We only have IT blocks where all instructions share the same condition.
> 
> Btw, you said traps will always get hit regardless of condition. If we set a 2-byte thumb trap in 3. on MacOSX
> 
> 1. cmp r0, #0x0
> 2. it ne
> 3. 4-byte inst
> 4. ...
> 
> The trap is never triggered. Maybe I'm just doing something wrong though.

We are changing debugserver to use the BKPT instructions for Thumb and ARM, and those always stop. For folks that are correctly using 16 _and_ 32 bit traps for Thumb, our "stop checker" won't get triggered because those traps won't cause a breakpoint hit when the instruction's condition fails. But we can still run into the problem of stopping on a non-executed instruction when single stepping if you use the "stop when new_pc != curr_pc" method.
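
For reference, a small sketch of the BKPT opcodes themselves. The encodings are the architectural ones (Thumb: 0xBE00 | imm8, ARM: 0xE1200070 with the 16-bit immediate split into imm12 and imm4). The function names, the immediate value of 0, and little-endian byte order are assumptions for illustration; nothing here says which immediate debugserver actually uses.

```python
import struct


def thumb_bkpt(imm8=0):
    """16-bit Thumb BKPT #imm8 as little-endian bytes."""
    return struct.pack("<H", 0xBE00 | (imm8 & 0xFF))


def arm_bkpt(imm16=0):
    """32-bit ARM BKPT #imm16 as little-endian bytes (imm12 in bits 19:8, imm4 in bits 3:0)."""
    word = 0xE1200070 | (((imm16 >> 4) & 0xFFF) << 8) | (imm16 & 0xF)
    return struct.pack("<I", word)


def breakpoint_opcode(is_thumb):
    """Pick the trap bytes for the instruction set in use at the breakpoint address."""
    return thumb_bkpt() if is_thumb else arm_bkpt()
```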

I just tested this and it indeed works for my example below:

   0x7cff0 <main+4 >: cmp    r0, #0x0
   0x7cff2 <main+6 >: ittee  gt
   0x7cff4 <main+8 >: movgt  r1, #0x11
-> 0x7cff6 <main+10>: movgt  r2, #0x22
   0x7cff8 <main+12>: movle  r1, #0x33
   0x7cffa <main+14>: movle  r2, #0x44

If we do a "si" or an instruction-level step with the PC at 0x7cff6, we end up at the instruction after 0x7cffa, as expected! So this avoids the confusing behavior of stepping into the else clause when we shouldn't, and it will also fix breakpoints on 32 bit opcodes within IT blocks on MacOSX for future tools.
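
The behavior above can be modeled with a few lines of Python. This is only a sketch with hypothetical names: it uses the architectural ITSTATE advance rule and the ITSTATE value that "ittee gt" loads (0xC7: first condition 0b1100, mask 0b0111), and it reports only the addresses where a stepper that skips non-executed instructions would stop.

```python
def condition_passes(cond, n, z, c, v):
    """Evaluate a 4-bit ARM condition code against the N, Z, C, V flags."""
    # Indexed by cond[3:1]: EQ, CS, MI, VS, HI, GE, GT, AL
    checks = [z, c, n, v, c and not z, n == v, n == v and not z, True]
    result = bool(checks[cond >> 1])
    if (cond & 1) and cond != 0xF:
        result = not result
    return result


def it_advance(itstate):
    """Advance ITSTATE past one instruction of the IT block."""
    if (itstate & 0x07) == 0:
        return 0  # that was the last instruction in the block
    return (itstate & 0xE0) | ((itstate << 1) & 0x1F)


def visible_stops(block, itstate, n, z, c, v):
    """Addresses in the IT block whose instructions will actually execute."""
    stops = []
    for addr in block:
        if condition_passes(itstate >> 4, n, z, c, v):
            stops.append(addr)
        itstate = it_advance(itstate)
    return stops


BLOCK = [0x7cff4, 0x7cff6, 0x7cff8, 0x7cffa]  # the four mov instructions
ITTEE_GT = 0xC7                               # ITSTATE right after "ittee gt"

# Flags after "cmp r0, #0x0" with r0 > 0: N=0, Z=0, C=1, V=0
print([hex(a) for a in visible_stops(BLOCK, ITTEE_GT, 0, 0, 1, 0)])
```

With r0 > 0 only the two movgt addresses are reported, so a step from 0x7cff6 lands past 0x7cffa; with r0 == 0 (Z=1) only the two movle addresses are.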

> 
> Looking forward to a generic fix for this issue! Thanks for looking into it.
> 
> Mario
> 
> On Dec 6, 2014 12:20 AM, "Greg Clayton" <gclayton at apple.com> wrote:
> After we all recently spoke about thumb IT problems where you could crash when single stepping, we looked into issues we currently have with the IT instruction and we found:
> 
> 1 - breakpoints set on 32 bit Thumb instructions in a valid thumb IT block will cause crashes
> 2 - we need to fix single stepping so it doesn't run into the above issue
> 3 - When single stepping in ARM/Thumb we set the watchpoint registers to say "stop when the PC is not equal to <current-pc>"
> 
> Facebook has fixed #1 and #2 in their GDB server by placing a 32 bit thumb trap when required to avoid changing the size of the instruction in the IT block. This works as long as your kernel supports and recognizes a 32 bit thumb trap as a breakpoint. The MacOSX kernel doesn't recognize any 32 bit thumb traps as breakpoints, so we need another solution.
> 
> If you use the 16 bit BKPT instruction for Thumb and the 32 bit BKPT instruction for ARM, these will always get hit regardless of the condition (see the ARM docs for the IT instruction). This is nice in that you can still set your breakpoints correctly and not worry about changing instruction boundaries, but it has the side effect that you can stop a thread at a place where the instruction wouldn't actually get executed. So you can replace the original instructions, single step (and it will ignore it), re-enable the software BP, and continue.
> 
> This "a thread has been stopped on an instruction that won't get executed" causes problems in #3 above when we single step because you could have code like:
> 
> 0x7cff0 <main+4 >: cmp    r0, #0x0
> 0x7cff2 <main+6 >: ittee  gt
> 0x7cff4 <main+8 >: movgt  r1, #0x11
> 0x7cff6 <main+10>: movgt  r2, #0x22
> 0x7cff8 <main+12>: movle  r1, #0x33
> 0x7cffa <main+14>: movle  r2, #0x44
> 
> If we single step through this code we would stop at all instructions 0x7cff4 - 0x7cffa. This is bad because you would step through your code:
> 
> 1 if (argc > 0)
> 2     x = 0x11, y = 0x22;
> 3 else
> 4     x = 0x33, y = 0x44;
> 
> We would stop on line 2 and line 4 which would look really wrong.
> 
> So to correctly account for this we need a notion that a thread is stopped on an instruction which won't get executed. We need to be able to detect this and continue on with our thread plans if no other threads have a valid stop reason.
> 
> Our plan for this is to add code generically, at a high level in lldb_private::Process and lldb_private::Thread, that can detect this (because this can easily happen with JTAG debuggers, live debuggers (like debugserver and other GDB servers), native debuggers, etc.). We really don't want everyone duplicating this code in each of their plug-ins. Once this is detected, we will "do the right thing" and finish the thread plan correctly (like "single step instruction" and "step over source line", etc).
> 
> The other solution would be to do this in each GDB server or native debug implementation that supports ARM, but again, this would mean duplicating a lot of code. If we do this at the lldb_private::Process/Thread level, we can re-use the code and ensure a single path that can be traced and debugged.
> 
> I am just passing along what we plan to do in case anyone has any input or other solutions or ideas.
> 
> Greg Clayton
> 
> 
> _______________________________________________
> lldb-dev mailing list
> lldb-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev