<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/97537>97537</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
lldb-server's GDB port map has a race condition when killing the debuggee on a slow remote system
</td>
</tr>
<tr>
<th>Labels</th>
<td>
lldb
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
DavidSpickett
</td>
</tr>
</table>
<pre>
To reproduce, the remote needs to be relatively slow. In my case it is QEMU.
Start an lldb-server on the remote in platform mode with some restricted ports:
```
$ ./lldb-server platform --server --listen 0.0.0.0:54321 --min-gdbserver-port 49140 --max-gdbserver-port 49150
```
Then connect lldb and run a program:
```
$ ./bin/lldb
(lldb) platform select remote-linux
Platform: remote-linux
Connected: no
(lldb) platform connect connect://127.0.0.1:54321
Platform: remote-linux
Triple: aarch64-unknown-linux-gnu
OS Version: 6.10.0 (6.10.0-rc4-g14d7c92f8df9)
Hostname: e125016
Connected: yes
WorkingDir: /home/davspi01
Kernel: #1 SMP PREEMPT Tue Jun 18 15:14:36 BST 2024
(lldb) target create /tmp/test.o
Current executable set to '/tmp/test.o' (aarch64).
(lldb) b main
Breakpoint 1: where = test.o`main at test.c:1:21, address = 0x0000000000400574
(lldb) run
Process 263 launched: '/tmp/test.o' (aarch64)
Process 263 stopped
* thread #1, name = 'test.o', stop reason = breakpoint 1.1
frame #0: 0x0000000000400574 test.o`main at test.c:1:21
-> 1 int main() { return 0; }
```
At this point lldb-server has started a new process to handle this client's connection and given it a port map containing a single open port, which that process has used to start a gdbserver process.
From here, if you let the program finish and then run it again, everything is fine. The gdbserver is torn down and the port is freed, then reused for the new gdbserver.
However, if you run again before the program finishes, there is a race between the platform killing the gdbserver process and the platform handling the `qLaunchGDBServer` request packet:
```
$ ./bin/lldb
(lldb) platform select remote-linux
Platform: remote-linux
Connected: no
(lldb) platform connect connect://127.0.0.1:54321
Platform: remote-linux
Triple: aarch64-unknown-linux-gnu
OS Version: 6.10.0 (6.10.0-rc4-g14d7c92f8df9)
Hostname: e125016
Connected: yes
WorkingDir: /home/davspi01
Kernel: #1 SMP PREEMPT Tue Jun 18 15:14:36 BST 2024
(lldb) target create /tmp/test.o
Current executable set to '/tmp/test.o' (aarch64).
(lldb) b main
Breakpoint 1: where = test.o`main at test.c:1:21, address = 0x0000000000400574
(lldb) run
Process 263 launched: '/tmp/test.o' (aarch64)
Process 263 stopped
* thread #1, name = 'test.o', stop reason = breakpoint 1.1
frame #0: 0x0000000000400574 test.o`main at test.c:1:21
-> 1 int main() { return 0; }
(lldb) run
There is a running process, kill it and restart?: [Y/n] y
Process 263 exited with status = 9 (0x00000009) killed
error: unable to launch a GDB server on 'e125016'
```
We can see that the packets are sent in the right order:
```
lldb < 5> send packet: $k#6b
lldb < 19> read packet: $X09;process:115#cc
Process 277 exited with status = 9 (0x00000009) killed
<...>
lldb < 34> send packet: $qLaunchGDBServer;host:e125016;#12
lldb < 7> read packet: $E09#ae
error: unable to launch a GDB server on 'e125016'
```
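As a cross-check on the log above, here is a small sketch that decodes the packet framing. It assumes the standard GDB remote protocol checksum (sum of the payload bytes modulo 256, written as two hex digits) and that the pid in the `X` reply is hexadecimal, which is consistent with `process:115` being reported as `Process 277`. The helper names are made up for illustration and are not part of lldb.

```python
def checksum(payload: str) -> str:
    """Two-hex-digit GDB remote protocol checksum of a packet payload."""
    return format(sum(payload.encode("ascii")) % 256, "02x")


def parse_packet(packet: str) -> str:
    """Strip the $...#xx framing, verify the checksum, return the payload."""
    if not packet.startswith("$") or "#" not in packet:
        raise ValueError("not a framed packet: " + packet)
    payload, _, received = packet[1:].rpartition("#")
    if checksum(payload) != received.lower():
        raise ValueError("bad checksum in " + packet)
    return payload


def exit_reply_pid(payload: str) -> int:
    """Extract the pid from an 'X' (killed by signal) reply, assuming hex."""
    for field in payload.split(";"):
        key, sep, value = field.partition(":")
        if sep and key == "process":
            return int(value, 16)
    raise ValueError("no process field in " + payload)


if __name__ == "__main__":
    # The four packets from the log, in the order they were exchanged.
    for pkt in ("$k#6b", "$X09;process:115#cc",
                "$qLaunchGDBServer;host:e125016;#12", "$E09#ae"):
        print(parse_packet(pkt))
    print(exit_reply_pid(parse_packet("$X09;process:115#cc")))  # 277
```

All four packets pass the checksum, so the log is intact and the `E09` reply really is the answer to the launch request.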
On the server side it uses `kill()` to kill the process and it should then free the port here:
https://github.com/llvm/llvm-project/blob/76c84e702bd9af7db2bb9373ba6de0508f1e57a9/lldb/tools/lldb-server/lldb-platform.cpp#L300
(introduced by https://github.com/llvm/llvm-project/pull/88845, which I think made the feature overall better, but in doing so exposed this issue)
If the remote is slow enough that the kill has not actually finished by the time we get back there, the port is not freed. The platform then tries to find a free port, finds none, and fails to launch the gdb server. Some ad-hoc logging shows this on the server side:
```
265: port map begins -------------------- <<< 265 is the main process
first: 49140 second: 0
first: 49141 second: 0
first: 49142 second: 0
first: 49143 second: 0
first: 49144 second: 0
first: 49145 second: 0
first: 49146 second: 0
first: 49147 second: 0
first: 49148 second: 0
first: 49149 second: 0
265: port map ends--------------------
266: port map begins -------------------- <<< 266 is created to handle the connection to lldb
first: 49140 second: 0
266: port map ends-------------------- <<< the first gdbserver is launched on the only unallocated port, 49140
266: port map begins --------------------
first: 49140 second: 271
266: port map ends-------------------- <<< looks for a free port, but the 1 port is allocated to pid 271 still
266: No free port found in map! <<< error returned to lldb
266: FreePortForProcess 271 49140 <<< it finally notices that kill() has happened
```
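The sequence in that log can be reproduced with a toy model. This is a simplified sketch, not lldb-server's code. The class and function names are illustrative (only `FreePortForProcess` appears in the log output), pid 272 is a made-up pid for the second gdbserver, and the model assumes a port goes back into the map only once the platform notices that the old gdbserver process has gone.

```python
class PortMap:
    """Toy model: maps each allowed port to the pid using it (0 means free)."""

    def __init__(self, ports):
        self.ports = {port: 0 for port in ports}

    def allocate(self, pid):
        """Give the first free port to pid, or return None if none is free."""
        for port, owner in self.ports.items():
            if owner == 0:
                self.ports[port] = pid
                return port
        return None  # corresponds to "No free port found in map!"

    def free_port_for_process(self, pid):
        """Release whichever port pid holds; True if one was released."""
        for port, owner in self.ports.items():
            if owner == pid:
                self.ports[port] = 0
                return True
        return False


def relaunch(port_map, old_pid, new_pid, reaped_before_launch):
    """Model 'kill, then launch again' with either ordering of the two events."""
    # The kill has already been requested at this point. What matters is
    # whether the exit is noticed (and the port freed) before the launch
    # request is handled.
    if reaped_before_launch:
        port_map.free_port_for_process(old_pid)
    port = port_map.allocate(new_pid)
    if not reaped_before_launch:
        port_map.free_port_for_process(old_pid)  # too late for this request
    return port


if __name__ == "__main__":
    fast = PortMap([49140])
    fast.allocate(271)
    print(relaunch(fast, 271, 272, reaped_before_launch=True))   # 49140

    slow = PortMap([49140])
    slow.allocate(271)
    print(relaunch(slow, 271, 272, reaped_before_launch=False))  # None
```

With a single port in the map, the outcome depends entirely on which of the two events is handled first, which matches the behaviour seen on the slow remote.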
This does not happen when debugging locally on real hardware.
I discovered this while running some of the SVE tests again. They run the debuggee once to discover the supported vector lengths and then again to run the actual test case. Adding a `sleep(5)` in the test cases between those two runs also "fixes" the issue.
The workaround is to not use a port map at all, but often giving full network access to a VM is difficult. So I'd like to find a way to make this work, or at least work around it in the tests most likely to be run this way.
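One possible shape for a test-side workaround is to retry the second launch for a bounded time rather than sleeping for a fixed 5 seconds. The helper below is a hypothetical sketch in plain Python, not an existing lldb test-suite API. It assumes the test has some callable that starts the debuggee and raises an exception when the launch fails.

```python
import time


def retry(action, timeout=5.0, interval=0.25,
          sleep=time.sleep, clock=time.monotonic):
    """Call action() until it succeeds or timeout seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        try:
            return action()
        except Exception:
            if clock() >= deadline:
                raise  # give up and surface the last failure
            sleep(interval)


if __name__ == "__main__":
    attempts = []

    def fake_launch():
        """Stand-in for a launch that fails until the port has been freed."""
        attempts.append(1)
        if len(attempts) != 3:
            raise RuntimeError("unable to launch a GDB server")
        return "launched"

    print(retry(fake_launch, interval=0.01), "after", len(attempts), "attempts")
```

Compared with a fixed sleep, this costs nothing on fast targets and still tolerates slow ones, though it only hides the race in the tests and does not fix it in lldb-server.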
</pre>