[Lldb-commits] [PATCH] D83815: [lldb/Test] Use a process group for subprocesses

Pavel Labath via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Wed Jul 15 01:52:28 PDT 2020


labath added a comment.

hmm... I have a lot thoughts here..

- `setsid` is overkill. If you want to create process group, create a process group (`setpgid`), not a session.
- this solution does not seem very windows-friendly. In fact, I'd be surprised if it works there at all.
- going for `os.kill` will also not work with remote test suite runs. In fact, the way it is implemented now means we could start killing random processes on the host.
- are these really zombies (dead but not waited-for processes) or just live processes stuck in infinite loops and similar? Because if they are zombies (mmm... brainz), then I doubt sending signals to them will help -- they're already dead. The problem is that someone is not reaping them. I'm not sure what can be done about that, as normally init should take care of this...
- even if they are live processes, I am doubtful that this will make a difference. I don't think many of the processes we spawn create additional subprocesses. In fact, the only one I know of is `TestChangeProcessGroup.py`, which also creates a new process group... so this test would need special handling anyway.
- creating lots of process groups can actually cause us to leak _more_ processes in case the test run is interrupted with ^C or similar (SIGINT goes to the foreground process group)

Overall, I think we should try to understand the problem a bit better before we start shotgun debugging this. Which tests are creating these zombies? What's special about them? Do they always do that or only occasionally? etc.

PS: When we were running a mac bot (it seems only macs are vulnerable to zombie infestations), we configured it to reboot itself each midnight. Not the most elegant solution, but if the problem really is about init refusing to reap zombies, then it may be the only thing that we can do...



================
Comment at: lldb/packages/Python/lldbsuite/test/lldbtest.py:894
+            try:
+                os.kill(os.getpgid(p.pid), signal.SIGTERM)
+            except OSError:
----------------
Right this is pretty much equivalent to what we've had before. I guess you wanted to say `os.killpg(os.getpgid(pid), SIGTERM)`. `os.kill(-os.getpgid(pid))` might also work, but the first one seems more explicit.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83815/new/

https://reviews.llvm.org/D83815





More information about the lldb-commits mailing list