[lldb-dev] gdb-remote lldb prompt race condition

Thu Aug 21 10:24:47 PDT 2014

> On Aug 21, 2014, at 12:27 AM, Matthew Gardiner <mg11 at csr.com> wrote:
> 
> Hi Greg,
> 
> If that is the case, then the main thread (i.e. the one running Debugger::ExecuteIOHandler) needs to be blocked until the event arrives.

The command line command "process connect" needs to synchronize better. It currently isn't and this is causing the problem. All we know is that we are executing a "gdb-remote 1234" command which turns into "process connect connect://localhost:1234" and then the command execution returns immediately and then the io handler gets the next command as it should. We should add code to commands that we know might needs some syncing to make sure they don't violate things.

> Why?
> 
> Because with the existing design once the user has entered their "gdb-remote" command, and completed the connect call,  the main thread goes back to the IOHandlerEditline::Run() loop, sees that IsActive() is true, prints the next prompt, etc.. When I debugged this I didn't see any call to Deactivate or SetIsDone, which would have made IsActive return false. (And then the async DefaultEventHandler awakes and it's output "Process 1 stopped" splats over the prompt).

No other IOHandler gets pushed, so the command interpreter is active and it will just get the next command as soon as it can. Most commands do something, then complete and then we are ready for more. There are a few that needs some extra syncing because they cause a launch/attach/connect and we usually get a response right away. There are others that might be immediate and might be async "thread step-XXX" for example. The step might complete and it might not. There are others that are purely async "process continue".

To complicate this, if we attach to a process, then there is no process IO handler to push and the command line will always be active.

> 
> If the code is changed so that the edit line IOHandler's IsActive returns false, while an asynchronous event is happening, then I think that the main thread would spin, since the reader_sp->Run() function below:
> 
> void
> Debugger::ExecuteIOHanders()
> {
>    while (1)
>    {
>        IOHandlerSP reader_sp(m_input_reader_stack.Top());
>        if (!reader_sp)
>            break;
> 
>        reader_sp->Activate();
>        reader_sp->Run();
>        reader_sp->Deactivate();
> 
> would immediately return. That's why I'm thinking the main thread probably should block until the last issued command has completed.

So this is the job of each command. Most commands complete right way.

It sounds like we might want to introduce a formal synchronization to the IOHandler and each command would need to change it from the default "no sync required". 

We would need two things:
- no sync required (default)
- sync with timeout for commands that usually complete quickly, but if they don't we need to get back to commands

> 
> Out of interest, I did research your "If someone is grabbing the event manually by hijacking events" point. But when stopped state is detected (i.e. the reply to ?) in GDBRemote and Broadcaster::PrivateBroadcastEvent is called, there is no hijacking_listener. Indeed the execution path is that as indicated by the -->
> 
>    if (hijacking_listener)
>    {
>        if (unique && hijacking_listener->PeekAtNextEventForBroadcasterWithType (this, event_type))
>            return;
>        hijacking_listener->AddEvent (event_sp);
>    }
>    else
>    {
>        collection::iterator pos, end = m_listeners.end();
>        // Iterate through all listener/mask pairs
>        for (pos = m_listeners.begin(); pos != end; ++pos)
>        {
>            // If the listener's mask matches any bits that we just set, then
>            // put the new event on its event queue.
>            if (event_type & pos->second)
>            {
>                if (unique && pos->first->PeekAtNextEventForBroadcasterWithType (this, event_type))
>                    continue;
> ---->         pos->first->AddEvent (event_sp);
> 
> So my contention is that in the case of gdb-connect the initial stop state should either be delivered/printed sychronously in the Process::ConnectRemote (i.e. in the mainthread context) or that the main thread should block until either the event arrives, or for some other reason the command last issued by the user is deemed to be "complete".
> 

Agreed. "process connect" should be doing better synchronization.

We also need a better way to deliver async output to the IOHandler stack. Right the only place that tries to handle this is in the Debugger class where it handles process events. This is where the tricky code of popping the process IOHandler lives. What should happen is we should have a function like:

Debugger::AsyncIO (...)

and any async output (from the process' stdout/stderr, or an eStateStopped event with thread state and frame and reason for stop) should be delivered through this.

The top IOHandler would then get this delivered to it in a thread safe way and it could then hide itself (for the command interpreter remove the "(lldb) " prompt and any command you are typing, display the output, then refresh the prompt and command, and continue. I believe this will solve all of our issues (using Debugger::AyncIO(...)).

Greg