Re: [lldb-dev] Failure connecting to lldb-server running in local virtual machine.
Greg Clayton via lldb-dev
lldb-dev at lists.llvm.org
Thu Feb 16 09:17:41 PST 2017
> On Feb 16, 2017, at 5:58 AM, Howard Hellyer via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> I’ve been hitting issues connecting to lldb-server. I’ve been trying from Mac to Linux running in a Virtual Box VM on the same machine. Once I’ve created a target and issued the “run” command lldb immediately disconnects with “error: connect remote failed (failed to get reply to handshake packet)”. The full output from a failed connection attempting to debug a simple "Hello World!" program is:
> (lldb) platform select remote-linux
> Platform: remote-linux
> Connected: no
> (lldb) platform connect connect://127.0.0.1:1234
> Platform: remote-linux
> Triple: x86_64-pc-linux-gnu
> OS Version: 4.8.0 (4.8.0-22-generic)
> Kernel: #24-Ubuntu SMP Sat Oct 8 09:15:00 UTC 2016
> Hostname: hhellyer-VirtualBox
> Connected: yes
> WorkingDir: /home/hhellyer
> (lldb) target create hello.out
> Current executable set to 'hello.out' (x86_64).
> (lldb) run
> error: connect remote failed (failed to get reply to handshake packet)
> error: process launch failed: failed to get reply to handshake packet
> I’m running the server (on Linux) with:
> >lldb-server platform --listen *:1234 -P 2345
> (I need to specify the -P as only a few ports are forwarded from the VirtualBox vm.)
> With logging enabled the logs showed the failure happened when the lldb-server received the "QStartNoAckMode" packet.
This is just the first packet we send after sending the "ack" down to the remote server. The "ack" doesn't need a response; we then send the "QStartNoAckMode" packet and actually wait for a reply. If we don't get one, we bail. So the failure shows up here simply because this is the first packet sent that expects a response.
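For anyone who wants to watch this exchange on the wire, it helps to know how the packets look. The GDB remote protocol frames each packet as $<payload>#<checksum>, where the checksum is the sum of the payload bytes modulo 256, written as two hex digits. A minimal Python sketch of that framing follows; the helper names are mine for illustration and are not anything in LLDB:

```python
def checksum(payload: str) -> str:
    """Sum of the payload bytes modulo 256, as two lowercase hex digits."""
    return format(sum(payload.encode("ascii")) % 256, "02x")


def frame(payload: str) -> str:
    """Wrap a payload as a GDB remote packet: $<payload>#<checksum>."""
    return f"${payload}#{checksum(payload)}"


# The ack is a bare "+" and is not framed; the next packet is framed and is
# the first one the client actually waits on.
handshake = ["+", frame("QStartNoAckMode")]
```

If the framed QStartNoAckMode packet shows up in the server log but no reply ever reaches the client, the problem is more likely in the connection than in the packet itself.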
> I initially thought this was a timing issue on the connection between the client and the server. After doing some investigation I ended up adding code to dump a backtrace when the connection was disconnected (in Communication::Disconnect) and suddenly running the target started working. I replaced the backtrace with a sleep(1) call and it continued working. After that I set up another remote virtual linux box (actually some distance away on the network) and found that lldb worked fine connecting to the remote lldb-server there, presumably because the connection was much slower.
We seem to have a race condition here. Any help figuring out what that might be would be great.
> At this point I was assuming it was a timing issue however another configuration that worked was lldb-server and lldb both running on the Linux VM inside Virtual Box which I would have assumed would also be very quick. I’m also wondering if lldb does anything special when the connection goes to 127.0.0.1 that using a local VM might confuse.
So a little background on what happens:
- You launch lldb-server in platform mode on the remote host (VM in your case)
- We attach to it from LLDB on your host machine like you did above
- When we want to debug something on the other side ("run" command above), we send a packet to the lldb-server on the remote host and ask it to launch a GDB server for us. The lldb-server launches one and sends back a hostname (<hostname>) and a port (<port>) in the response. The host side LLDB then tries to connect to that hostname and port using the equivalent of:
(lldb) process connect connect://<hostname>:<port>
So the question is: does your VM have a unique IP address? If so, no port mapping needs to be done. If it responds with 127.0.0.1, then we have port collision issues and will probably need to use a port offset. Typically with VMs there is some port remapping that has to happen. If you and your VM are both listening for a connection on port 1234, what happens if you do:
(lldb) platform connect connect://127.0.0.1:1234
Would it connect to your host machine, or the VM? Typically there are some port remapping settings that you configure, like "add 10000" to any ports for the VM, so you would then just do:
(lldb) platform connect connect://127.0.0.1:11234
Notice the extra 10000 added to the original 1234.
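One quick way to see which of these ports actually accepts connections from the host is to try a plain TCP connect. This is only a rough sketch in Python; the function name is made up for illustration, and the ports you would probe (1234, 11234, and so on) depend entirely on your own forwarding rules:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, can_connect("127.0.0.1", 11234) tells you whether something on the host side accepts that port. Keep in mind that a forwarder may accept the connection even when nothing is listening inside the VM, so a True result is necessary but probably not sufficient.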
When starting the lldb-server in platform mode, you can ask it to use a min and max port range for connections so that you can edit your VM settings so that certain ports are available:
% lldb-server platform --listen '*:1000' --min-gdbserver-port 1001 --max-gdbserver-port 1010 --port-offset=10000
This would listen on port 1000, and reserve 1001 - 1010 for GDB remote connections when the server is asked to spawn an lldb-server in debug mode. Since you specified a port offset of 10000, when a packet is sent to your "lldb-server platform" asking it to start an lldb-server for debugging, it will use a port between 1001 and 1010, and when it sends the info back to the client it will add the port offset. So it will send back "11001" as the first port to connect to, and the port remapping will know to talk to your VM.
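The arithmetic is simple, but it tells you exactly which host ports your VM settings need to forward. A small Python sketch, assuming the offset is just added to each port in the reserved range as described above (the function name is hypothetical):

```python
from typing import List


def advertised_ports(min_port: int, max_port: int, port_offset: int) -> List[int]:
    """Ports the client is told to connect to: each reserved port plus the offset."""
    return [port + port_offset for port in range(min_port, max_port + 1)]


# Values taken from the lldb-server command line above.
ports = advertised_ports(1001, 1010, 10000)
```

With those values the host would need 11001 through 11010 forwarded to 1001 through 1010 in the VM, plus whatever port you use to reach the platform itself.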
> I was testing from Mac to Virtual Box just because it was simpler than testing to a remote system, which was actually the goal and seems to work, so this isn’t totally blocking me but it does seem like a problem and I'm not sure if other users connecting to remote linux machine will hit this problem.
Sounds like a race condition we need to solve.
> Are there any known issues around this type of connection already? Or does anyone have any useful pointers? I couldn’t see anything quite the same on http://bugs.llvm.org/ so asking here seemed like a logical next step.
This sounds like a very common VM issue. If you can elaborate on what happens with ports on your VM, we might be able to deduce some of the problems. Filing a bug would be nice, but since you are the one running into this issue, it seems like you will need to debug it and make it work. We are happy to assist.