[Lldb-commits] [lldb] [lldb] Change lldb's breakpoint handling behavior (PR #96260)

Wed Aug 7 17:20:02 PDT 2024

jasonmolenda wrote:

>   // Mark the thread as stopped at breakpoint.
>   thread.SetStoppedByBreakpoint();
>   FixupBreakpointPCAsNeeded(thread);
> 
>   if (m_threads_stepping_with_breakpoint.find(thread.GetID()) !=
>       m_threads_stepping_with_breakpoint.end())
>     thread.SetStoppedByTrace();
> 
>   StopRunningThreads(thread.GetID());
> }
> ```
> 
> I think this is the correct thing to do, since lldb is completely unaware of whether software single stepping is taking place(*) I got lost in all of the mach exceptions, so it's not clear to me what is the conclusion. Are you saying that debugserver does not do this (i.e., it reports the stop as a "breakpoint hit", in the mach exception form)?

Ah, very nice, thanks for finding this.  I think this helps to show why this is failing with the new "breakpoints are only 'hit' when they are actually executed'" scheme.  

If the pc is at a BreakpointSite and we're stopped in the debugger.  We do "si", a breakpoint is inserted on the next instruction (pc+4 or 2), this thread and pc+4 is added to `m_threads_stepping_with_breakpoint`.  We resume execution.  The breakpoint at pc is hit, and the pc value does not advance.  This method is only checking that the thread is included in the `m_threads_stepping_with_breakpoint` map, not checking that the pc of the breakpoint we hit is the one we inserted (at pc+4/2).  I wonder if this can be fixed by

```
NativeRegisterContextLinux &reg_ctx = thread.GetRegisterContext();
if (m_threads_stepping_with_breakpoint.find(thread.GetID()) !=
       m_threads_stepping_with_breakpoint.end() &&
    m_threads_stepping_with_breakpoint.find(thread.GetID())->second == reg_ctx.GetPC())
```

This wasn't necessary with the previous behavior -- if we stopped at a BreakpointSite, we would declare that we hit the breakpoint, even if we'd just stepped up to the BreakpointSite.  But now that you may have stepped up to a BreakpointSite OR you may have executed & hit the breakpoint, we're trusting the gdb-remote stub's response more.

I think @DavidSpickett was seeing some looping in the test case, and I could imagine a scenario where the pc is at a BreakpointSite, lldb instruction steps, and we get back an instruction-step stop reason without the pc advancing.  We haven't hit the breakpoint yet (as far as lldb knows), so we try again.  And again.

> Just as a curiosity the linux kernel does allow you do use the hardware single step capability if debugging a 32 bit application on a 64-bit kernel (I'm not sure if the debugger process needs to be 64-bit or not). At one point, I wanted to make use of that, which would make our stepping behavior more reliable in these situations. However it would also make it harder to reproduce situations like these, as one would have to find actual 32-bit hardware...

Yeah on an armv8 system that can run AArch32 or AArch64 code, both will have MDSCR_EL1.SS instruction stepping capability.  It's only on a genuine armv7 processor where that feature doesn't exist.  I don't know how much lldb distinguishes between AArch32 and armv7, but this is one important difference.  (lldb AArch64 cores don't support AArch32 any more, so I haven't worked with this combination in years).

> (*) The code is probably too casual about this check though, in that it does not check whether it was the actual single-stepping breakpoint that got hit. If there was a breakpoint under the current PC, it would report this as "trace" even though it hasn't actually stepped anything. Could this maybe be the problem that you ran into with this patch?

Ah yes, exactly so, we reached the same conclusion, I didn't read your whole response at first. :) 

https://github.com/llvm/llvm-project/pull/96260