[Lldb-commits] [lldb] [lldb] Updated lldb-server to spawn the child process and share socket on Windows (PR #101283)

Wed Jul 31 06:09:22 PDT 2024

labath wrote:

> @labath
> 
> > For consistency, and to minimize the number of ifdefs, I believe windows and non-windows paths should use as similar approaches as possible. That means I think the end state should be that the posix path also uses a fork+exec approach.
> 
> fork+exec on non-Windows OS is redundant. It simply does not make sence. We can use multithreading or CreateProcess on Windows. It seems [the multithreading solution](https://github.com/llvm/llvm-project/pull/100670) faced a lot of issues and system's bugs on Linux. So CreateProcess is the only way. But it is the fork() workaround for Windows.

This is where I'll have to disagree with you. The exec following the fork is not "redundant" on linux. A naked fork is an extremely dangerous thing in todays (multithreaded) applications. I'm absolutely certain that the improper use of fork with threads (and the FD leaks that go along with it) is the cause of at least one of the issues (ETXTBSY) you refer to as "system bugs", and fairly sure it's also the cause of the other one (the pipe workaround) as well. In a multithreaded application (which lldb-platform isn't right now, but more of an accident than design, and which you've tried to make extremely multithreaded) the only safe use of fork is if it is immediately followed by an exec, and this has to be done in an extremely careful way (we try to do that, and we still failed).

But even if it weren't for all of this, I'd still think it makes sense to start a new process, just for symmetry with windows. That's the nature of portable programming.

> 
> > For me, it also seems natural that this should be the first step. While doing that, we can make sure to create abstractions that can be used to implement the windows behavior as well. I'm imagining some sort of a function or a class, which takes the target binary, its arguments, and a socket, and launches that binary with that socket. (we'll probably also need a similar function/class on the other end to receive that socket).
> 
> Look at GDBRemoteCommunication::StartDebugserverProcess(). I planed to extend it to use `pass_comm_fd` of Windows in the part 2. It is hard to create abstractions because of a lot of parameters. We can move most of SpawnProcessChild() code to `ConnectionFileDescriptor::ConnectFD` in ConnectionFileDescriptorPosix.cpp BTW, I don't like #ifdef _WIN32 in *Posix.cpp files. But currently the common ConnectionFileDescriptor.h contains just `#include "lldb/Host/posix/ConnectionFileDescriptorPosix.h"`

I didn't say it was going to be easy. :) I can look at this closer when I get back next week, but I definitely think we can and should come up with a solution involving less ifdefs.

> 
> > I'm not sure if we really need the WriteWithTimeout functionality. The reason we don't have it so far is that pipes have a buffer, and unless we're writing a huge chunk of data, the write will never block anyway. There's nothing wrong with it in principle though, and if you do want to have it, it'd be best if it comes with in its own patch and with a test.)
> 
> Before this patch the pipe read/write 1 byte in ConnectionFileDescriptor::CommandPipe. The maximal size was 6 bytes (the port value) in GDBRemoteCommunication::StartDebugserverProcess() and it is only the place where ReadWithTimeout() is used. But WSAPROTOCOL_INFO is big enough (628 bytes). Usually the pipe buffer is 4K. 

628 is still less that 4k. Can the buffer be smaller than that?

> But we need an ability to avoid hangs anyway. I agree that all pipe related changes may be moved to another patch. And we need to add WriteWithTimeout() to PipePosix too. Note there is no any test for Pipe now. And I have no idea how to test it, especially async version. Running `lldb-server platform --server` on Windows is the best test. It must be tested with a high load with multiple connections. Otherwise any pipe tests are useles.

We have one and a half pipe tests (one of them claims to be broken on windows). I wrote them when fixing some Pipe bugs. We should definitely have more.

I wholeheartedly disagree with the assertion that the test is useless unless it uses multiple threads and high load. Even the most simple test serves as documents the correct/expected API usage and proves that the functionality works at least in the simple cases. For something like `WriteWithTimeout` I'd probably create single threaded test which does something like this:
- write to the pipe until it is full
- try to write with a larger timeout and check that it actually times out
- read from the pipe and check that you got back what was written

That will cover most of the codepaths and obvious ways it can go wrong (e.g. during refactoring). Then, if you wanted to be fancy, you could do things like start a write with a long timeout and then try to read from the pipe, and see that causes the write to complete, etc.; but I wouldn't expect everyone to write those all the time.

https://github.com/llvm/llvm-project/pull/101283