[Lldb-commits] [PATCH] D86652: [lldb] Fix ProcessEventData::DoOnRemoval to support multiple listeners

Fri Aug 28 02:04:30 PDT 2020

labath added a comment.

This is going to derail this patch somewhat (sorry), but since this is an important long-standing open issue, I think we should discuss it in more detail.

The part that's bothering me about this for quite some time is... Why do we even have this DoOnRemoval stuff in the first place?

I mean instead of doing this work in `DoOnRemoval`, why don't we do this work /before/ enqueueing the relevant event. Then the event would be of purely informative character and it would not matter if its received by one, two, ten, or zero listeners. Of course, if a user wanted to program lldb in a correct and race-free manner, it would /have to/ listen to these events (as otherwise it would not know when is it safe to run some actions), but that would be something that's entirely up to him.

Back when I was debugging some races in tests with asynchronous listeners, I remember it was pretty hard to reason about things and prove they are correct, since the dequeueing of the event (and the resulting flurry of actions within lldb) could come at any moment. If those actions were done internally to lldb, they would be more sequenced more deterministically and be easier to reason about than if they are executed whenever the user chooses to dequeue an event. (I remember things got particularly messy when a test finished before dequeuing an event, at which point the processing of the event was executing concurrently with the `process.Kill()` statement we execute at the end of each test.)

In D86652#2242770 <https://reviews.llvm.org/D86652#2242770>, @ilya-nozhkin wrote:

> One of the possible solutions is to send a specifically marked event to a primary listener first. Then, make this event to invoke a handshake in its destructor (which would mean that primary listener doesn't need the event anymore). And handshake can be implemented as just a rebroadcasting a new instance of the event to other listeners (or via unlocking some additional mutex / notifying some conditional variable). However, destructor may be invoked not immediately after event processing is finished. For example, if a primary listener's thread defines an EventSP before the listening cycle and doesn't reset it until GetEvent returns a new one (which may not happen until the next stop, for example).

This primary listener technique could be implemented purely in your own code, could it not? You could have one listener, which listens to lldb events, and then rebroadcasts it (using arbitrary techniques, not necessarily SBListener) to any number of interested threads that want to act on it. That might make it easier to ensure there are no races in your code, and would avoid stepping on some of the internal lldb races that I am sure we have... Otherwise I (as Jim) expect that you will still need to have some sort of communication between the various listeners listening to process events, as they'd need to coordinate

> Anyway, the concept of primary and secondary listeners requires to forbid secondary listeners to do "step" or "continue" at all. So, why can't we do the same without any additional code? I mean, I think that the most common case of multiple listeners is a Python extension adding a listener to read/write some memory/registers on each stop taking the RunLock.

Yes, you could take the run lock, which would guarantee that the other listeners don't get to resume the process. But how do you guarantee that you _can_ take the run lock  -- some other thread (the primary listener) could beat you to it resume the process, so you'd miss the chance to read/write those registers. Seems very fragile. A stop hook does look like a better way to achieve these things.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86652/new/

https://reviews.llvm.org/D86652