[Lldb-commits] [PATCH] D130674: Accurate watchpoint hit counts redux

Wed Jul 27 16:42:18 PDT 2022

jingham created this revision.
jingham added reviewers: labath, clayborg, jasonmolenda, JDevlieghere.
Herald added a subscriber: kristof.beyls.
Herald added a project: All.
jingham requested review of this revision.
Herald added a project: LLDB.
Herald added a subscriber: lldb-commits.

This is a reworking of the previous patch to get the watchpoint hit counts to be accurate on systems where you have to walk the PC past the faulting instruction before reporting the stop.  The previous patch worked reliably on macOS, but failed on the Linux aarch64 bots.  As I suspected from the discussion of the previous patch, the problem was that the threads that don't get to step over the faulting instruction see a new StopInfoWatchpoint on the next stop, and so they schedule a new step-over plan, and things go poorly after that...

I fixed that issue by checking whether the thread had gotten a chance to run in StopInfoWatchpoint::ShouldStopSynchronous and not doing any work in that case.  In the answer to Pavel's question, I managed to get both modes to work but it is much easier for lldb to deal with debugserver's clearing of stop info's on each step, than lldb-server's preserving them.  The former is easy for lldb to reason about, since it is in control of what stop-info's it decides to preserve, the latter requires a little guessing...

However, even with that fix, the "threads/concurrent_events" Linux test runs that had several watchpoints and a breakpoint or signal were flaky.  The reason for that is that depending on how events arrived, a signal or breakpoint event could force a stop before the threads stopped at watchpoints all had a chance to move past their faulting instructions.  Then on Linux, we would get all the unprocessed watchpoint hits reported anew when the breakpoint stopped so that it looked like the watchpoint threads had hit the watchpoints too many times.

I fixed that by adding an explicit mechanism for a thread & it's thread plans to say "I have work that needs to be done before a public stop can be declared".  That way all the threads that stop on the watchpoint can force their step-instructions before the final stop is reported, and we won't end up with some having run the faulting instruction and others not, causing us to overcount the hits.  That's the ShouldRunBeforePublicStop machinery.

The only other substantial change over the previous patch is that there were a handful of places where we weren't being rigorous about our rule that "LLDB preserves the StopInfo for thread that doesn't get to run".  In one place, we were using GetResumeState not GetTemporaryResumeState to check whether the thread had run.  GetResumeState only returns true if the User suspended the thread, not if lldb has suspended it for this one resume for its own purposes, so it wasn't the right test.  And in another place, if we get thread stop info in the JSON form, we would unconditionally reset all the threads w/o checking if any of them had run or not.

With this I'm getting clean test suite runs on macOS and on a Linux guest running Ubuntu22.04 on my AS Mac.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D130674

Files:
  lldb/include/lldb/Breakpoint/Watchpoint.h
  lldb/include/lldb/Target/StopInfo.h
  lldb/include/lldb/Target/Thread.h
  lldb/include/lldb/Target/ThreadPlan.h
  lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
  lldb/source/Target/StopInfo.cpp
  lldb/source/Target/Thread.cpp
  lldb/source/Target/ThreadList.cpp
  lldb/test/API/commands/watchpoints/watchpoint_commands/condition/TestWatchpointConditionCmd.py
  lldb/test/API/functionalities/thread/concurrent_events/TestConcurrentDelayWatchBreak.py
  lldb/test/API/functionalities/thread/concurrent_events/TestConcurrentManyWatchpoints.py
  lldb/test/API/functionalities/thread/concurrent_events/TestConcurrentNWatchNBreak.py
  lldb/test/API/functionalities/thread/concurrent_events/TestConcurrentSignalNWatchNBreak.py
  lldb/test/API/functionalities/thread/concurrent_events/TestConcurrentSignalWatch.py
  lldb/test/API/functionalities/thread/concurrent_events/TestConcurrentSignalWatchBreak.py
  lldb/test/API/functionalities/thread/concurrent_events/TestConcurrentTwoWatchpointThreads.py
  lldb/test/API/functionalities/thread/concurrent_events/TestConcurrentTwoWatchpointsOneBreakpoint.py
  lldb/test/API/functionalities/thread/concurrent_events/TestConcurrentTwoWatchpointsOneDelayBreakpoint.py
  lldb/test/API/functionalities/thread/concurrent_events/TestConcurrentTwoWatchpointsOneSignal.py
  lldb/test/API/functionalities/thread/concurrent_events/TestConcurrentWatchBreak.py

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D130674.448204.patch
Type: text/x-patch
Size: 36976 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/lldb-commits/attachments/20220727/0e541c72/attachment-0001.bin>