[lldb-dev] race condition using gdb-remote over ssh port forwarding

Tue Nov 28 10:31:13 PST 2017

> On Nov 28, 2017, at 2:25 AM, Pavel Labath via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> 
> On 27 November 2017 at 20:33, Christopher Book via lldb-dev
> <lldb-dev at lists.llvm.org> wrote:
>> Greetings, I've been using liblldb to remotely debug to a linux server with
>> port forwarding.  To do this, I start lldb-server to with --listen
>> specifying a localhost port, as well as with ----min-gdbserver-port and
>> --max-gdbserver-port to specify a specific port for use by 'gdb remote'.
>> Both ports are forwarded to the remote PC, where liblldb connects to
>> localhost.
>> 
>> This generally works fine, but there is a race condition.  When the client
>> tells lldb-server to start gdb-remote, the port is returned to the client
>> which may try to connect before the gdb-remote process is actually
>> listening.
> Are you sure this is what's happening? Looking at lldb-server source
> code, I see that it starts listening on lldb-gdbserver.cpp:320
>      error = acceptor_up->Listen(1);
>      if (error.Fail()) {
>        fprintf(stderr, "failed to listen: %s\n", error.AsCString());
>        exit(1);
>      }
> and the port number is written out only later, on line 334:
>      error = acceptor_up->Listen(1);
>      if (error.Fail()) {
>        fprintf(stderr, "failed to listen: %s\n", error.AsCString());
>        exit(1);
>      }
> 
> After listen returns the connection by your port forwarder should not
> be rejected, and it can only connect once it knows the port to connect
> to.
> 
> 
> That's not to say that connecting through port forwarders is easy. For
> android debugging, we also have to setup port forwarding with adb, and
> we need to do some quite elaborate dances to make sure that the
> connection is reliable and race free (see
> PlatformAndroidRemoteGDBServer::MakeConnectURL) -- however this is to
> resolve a different issue (allocating the port on the local side from
> which to do the forwarding), which does not sound like the problem you
> describe.
> 
> 
>> I would like to submit a patch, but first check to see if this solution
>> would be acceptable:
>> - Include the handshake within the connection retry loop.
>> - This means fully disconnecting the re-establishing the connection in the
>> loop if the handshake fails.
>> - Changing the timeout check to be based on a total absolute time instead of
>> 50 iterations with a 100ms sleep.
>> 
>> Thoughts?
>> 
>> Alternatives could be:
>> - Have lldb-server delay responding to the 'start gdb server' request until
>> it could tell (somehow) that the process is listening.
>> - A sleep of some kind on the client side after starting the server but
>> before trying to connect.
>> 
> 
> The most port-forwarder-friendly solution (and one that I've been
> wanting to implement for a while, but never got around to it) I can
> think of is to not require a second port for the llgs connection. It
> could work approximately like this:
> - we add a new packet type to the lldb-server platform instance: (e.g.
> qExecGDBServer)
> - upon receiving this packet, the platform instance would exec the
> gdb-server instance, and pass it the existing socket file descriptor
> (via a command line argument or whatever)
> - the gdb-server instance would start up with the connection primed
> and ready, and would not need to do any listening and writing back the
> port number, etc.
> - the lldb client would end up with it's "platform" connection being
> connected to llgs, and would have to create a new platform connection
> (or it could create a scratch platform solely for the purpose of
> exec()ing the llgs). Either way, the second connection would be to the
> exact same address and port as the first one, so there should not be
> any extra forwarding setup necessary.

I am not a fan of losing the lldb-server platform connection by turning it into the GDB server connection. It means you need to connect and launch each time instead of connect once to the platform and then launch N times.

> 
> The details of this will need to be thought through (e.g., who should
> send the first packet after the exec()? how to detect an error? should
> the connection start in no-ack mode?), but I don't think it should be
> too complicated overall. And all port-forwarding users would benefit
> from that (we could delete the android-specific code I mentioned, and
> you could stop mucking with --max-gdbserver-port).
> 
> What do you think about that?

Since the port forwarding and the lldb-server need to be in cahoots, maybe an extra argument is added to the "lldb-server platform" invocation that tell us we are using port forwarding and we do something special in the "lldb-server platform" binary to bullet proof the connection?

Greg