[lldb-dev] race condition using gdb-remote over ssh port forwarding

Mon Nov 27 12:33:04 PST 2017

Greetings, I've been using liblldb to remotely debug on a Linux server over
ssh port forwarding.  To do this, I start lldb-server with --listen
specifying a localhost port, as well as with --min-gdbserver-port and
--max-gdbserver-port to specify a specific port for use by gdb-remote.
Both ports are forwarded to the remote PC, where liblldb connects to
localhost.

This generally works fine, but there is a race condition.  When the client
tells lldb-server to start gdb-remote, the port is returned to the client,
which may try to connect before the gdb-remote process is actually
listening.  Without port forwarding, this is okay because the connection
attempt fails outright and the client has retry logic:

ProcessGDBRemote::ConnectToDebugserver
...
        retry_count++;
        if (retry_count >= max_retry_count)
          break;
        usleep(100000);

But with port forwarding, the initial connection is always accepted by the
port forwarder, and only then does it try to establish a connection to the
remote port.  The forwarder has no way to hold off accepting the incoming
local connection until it has tried the remote end, so the client's connect
succeeds even when nothing is listening remotely.

lldb has some logic to detect this further down in the same function, using
a handshake to ensure the connection is actually made:

  // We always seem to be able to open a connection to a local port
  // so we need to make sure we can then send data to it. If we can't
  // then we aren't actually connected to anything, so try and do the
  // handshake with the remote GDB server and make sure that goes
  // alright.
  if (!m_gdb_comm.HandshakeWithServer(&error)) {
    m_gdb_comm.Disconnect();
    if (error.Success())
      error.SetErrorString("not connected to remote gdb server");
    return error;
  }

But the problem here is that no retry is performed on failure.  The caller
of the 'attach' API also can't retry because the gdb server is terminated
on the error.

I would like to submit a patch, but first I want to check whether this
solution would be acceptable:
- Include the handshake within the connection retry loop.
- This means fully disconnecting and re-establishing the connection in the
loop if the handshake fails.
- Change the timeout check to be based on a total absolute time instead
of 50 iterations with a 100ms sleep.
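
To make the proposal concrete, here is a rough standalone sketch of the loop
I have in mind.  It is not the actual lldb code: ConnectionOps and
ConnectWithHandshakeRetry are hypothetical names, and the three callbacks
only stand in for the real connect, handshake and disconnect steps.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-ins for the real connection steps; none of these
// names come from lldb.
struct ConnectionOps {
  std::function<bool()> connect;    // open the socket (a forwarder may accept it)
  std::function<bool()> handshake;  // exchange data with the real gdb server
  std::function<void()> disconnect; // tear the connection down completely
};

// Retry connect + handshake until both succeed or the total timeout has
// elapsed. At least one attempt is always made. Returns true on success.
inline bool ConnectWithHandshakeRetry(const ConnectionOps &ops,
                                      std::chrono::milliseconds total_timeout,
                                      std::chrono::milliseconds retry_interval) {
  using Clock = std::chrono::steady_clock;
  const auto deadline = Clock::now() + total_timeout;
  for (;;) {
    if (ops.connect()) {
      if (ops.handshake())
        return true;
      // The connect "succeeded" (e.g. against a port forwarder) but nothing
      // answered the handshake, so drop the connection before retrying.
      ops.disconnect();
    }
    // Give up based on absolute elapsed time, not an iteration count.
    if (Clock::now() >= deadline)
      return false;
    std::this_thread::sleep_for(retry_interval);
  }
}
```

The point of the deadline is that a slow handshake timeout no longer
multiplies with the retry count; the caller bounds the total wait directly.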

Thoughts?

Alternatives could be:
- Have lldb-server delay responding to the 'start gdb server' request until
it could tell (somehow) that the process is listening.
- A sleep of some kind on the client side after starting the server but
before trying to connect.

Thanks,
Chris