[Lldb-commits] [lldb] [lldb/linux] Make sure the process continues running after a detach (PR #88494)
Pavel Labath via lldb-commits
lldb-commits at lists.llvm.org
Fri Apr 12 05:40:56 PDT 2024
https://github.com/labath updated https://github.com/llvm/llvm-project/pull/88494
>From c6b2c5e58321e72155b8c45cd3591487c1cafacf Mon Sep 17 00:00:00 2001
From: Pavel Labath <pavel at labath.sk>
Date: Sat, 6 Apr 2024 17:51:12 +0000
Subject: [PATCH] [lldb/linux] Make sure the process continues running after a
detach
Whenever an inferior thread stops, lldb-server sends a SIGSTOP to all
other threads in the process to force them to stop as well. If those
threads stop on their own before they get a signal, this SIGSTOP will
remain pending and be delivered the next time the process resumes.
Normally, this is not a problem, because lldb-server will detect this
stale SIGSTOP and resume the process. However, if we detach from the
process while it has these SIGSTOPs pending, they will get immediately
delivered, and the process will remain stopped (most likely forever).
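For illustration only (this is not part of the patch): a minimal C++ sketch of the stop/continue behaviour described above, assuming a POSIX system. The names `StopResumeResult` and `stop_then_continue_child` are made up for this example. A child that receives SIGSTOP stays stopped until something sends it a SIGCONT, which is what a detached process with a stale SIGSTOP would be waiting for.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper type: what the parent observed about the child.
struct StopResumeResult {
  bool stopped;   // child was reported as stopped after SIGSTOP
  bool continued; // child was reported as continued after SIGCONT
};

// Hypothetical helper: fork a child, stop it, resume it, then clean up.
StopResumeResult stop_then_continue_child() {
  StopResumeResult result{false, false};

  pid_t child = fork();
  if (child < 0)
    return result; // fork failed, nothing to observe
  if (child == 0) {
    // Child: idle until it is killed by the parent.
    for (;;)
      pause();
  }

  int status = 0;

  // SIGSTOP cannot be caught or ignored; the child stops.
  kill(child, SIGSTOP);
  if (waitpid(child, &status, WUNTRACED) == child)
    result.stopped = WIFSTOPPED(status);

  // Without this SIGCONT the child would remain stopped indefinitely.
  kill(child, SIGCONT);
  if (waitpid(child, &status, WCONTINUED) == child)
    result.continued = WIFCONTINUED(status);

  // Clean up the child so it does not outlive the example.
  kill(child, SIGKILL);
  waitpid(child, &status, 0);
  return result;
}
```

A caller could check the outcome with `assert(r.stopped && r.continued)` on the returned value. The sketch uses a plain child process rather than ptrace, so it only shows the signal semantics, not lldb-server's thread-suspension logic.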
This patch fixes that by sending a SIGCONT just before detaching from
the process. This signal cancels out any pending SIGSTOPs, and ensures
the process is able to run after we detach. It does have one somewhat
unfortunate side effect: the process's SIGCONT handler (if it has one)
will get executed spuriously (from the process's POV).
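As a rough illustration of the "cancels out" behaviour (again, not part of the patch): POSIX specifies that generating SIGCONT discards any pending stop signals. SIGSTOP itself cannot be blocked, so the sketch below assumes SIGTSTP as a stand-in stop signal that can be held pending. `PendingState` and `sigcont_discards_pending_stop` are hypothetical names used only here.

```cpp
#include <cassert>
#include <signal.h>
#include <unistd.h>

// Hypothetical helper type: was the stop signal pending at each point?
struct PendingState {
  bool before_sigcont; // pending before SIGCONT was generated
  bool after_sigcont;  // pending after SIGCONT was generated
};

// Returns true if `signo` is currently pending for the caller.
static bool is_pending(int signo) {
  sigset_t pending;
  sigemptyset(&pending);
  sigpending(&pending);
  return sigismember(&pending, signo) == 1;
}

PendingState sigcont_discards_pending_stop() {
  // Give SIGTSTP its default (stop) disposition for the experiment,
  // remembering the previous one so it can be restored.
  struct sigaction dfl = {}, old_action = {};
  dfl.sa_handler = SIG_DFL;
  sigemptyset(&dfl.sa_mask);
  sigaction(SIGTSTP, &dfl, &old_action);

  // Block SIGTSTP so that it stays pending instead of stopping us.
  sigset_t block, old_mask;
  sigemptyset(&block);
  sigaddset(&block, SIGTSTP);
  sigprocmask(SIG_BLOCK, &block, &old_mask);

  PendingState result{};
  raise(SIGTSTP); // the stop signal is now pending
  result.before_sigcont = is_pending(SIGTSTP);

  // Generating SIGCONT discards pending stop signals. SIGCONT's default
  // action does nothing visible for a process that is not stopped.
  kill(getpid(), SIGCONT);
  result.after_sigcont = is_pending(SIGTSTP);

  // Restore the original signal mask and disposition.
  sigprocmask(SIG_SETMASK, &old_mask, nullptr);
  sigaction(SIGTSTP, &old_action, nullptr);
  return result;
}
```

On a conforming system the expectation is that `before_sigcont` is true and `after_sigcont` is false. The patch relies on the same rule for the SIGSTOPs that lldb-server left pending on the inferior's threads.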
This could be _sometimes_ avoided by tracking which threads got sent a
SIGSTOP, and whether those threads stopped due to it. From what I could
tell by observing its behavior, this is what gdb does. I have not tried
to replicate that behavior here because it adds a nontrivial amount of
complexity and the result is still uncertain -- we still need to send a
SIGCONT (and execute the handler) when any thread stops for some other
reason (and leaves our SIGSTOP hanging). Furthermore, since SIGSTOPs
don't stack, it's also possible that our SIGSTOP/SIGCONT combination
will cancel a genuine SIGSTOP being sent to the debugged application (by
someone else), and there is nothing we can do about that. For this
reason I think it's simplest and most predictable to just always send a
SIGCONT when detaching, but if it turns out this is breaking something,
we can consider implementing something more elaborate.
One alternative I did try is to use PTRACE_INTERRUPT to suspend the
threads instead of a SIGSTOP. PTRACE_INTERRUPT requires using
PTRACE_SEIZE to attach to the process, which also made this solution
somewhat complicated, but the main problem with that approach is that
PTRACE_INTERRUPT is not considered to be a signal-delivery-stop, which
means it's not possible to resume it while injecting another signal to
the inferior (which some of our tests expect to be able to do). This
limitation could be worked around by forcing the thread into a signal
delivery stop whenever we need to do this, but this additional
complication is what made me think this approach is also not worthwhile.
This patch should fix (at least some of) the problems with
TestConcurrentVFork, but I've also added a dedicated test for checking
that a process keeps running after we detach. Although the problem I'm
fixing here is linux-specific, the core functionality of not stopping
after a detach should function the same way everywhere.
---
.../Process/Linux/NativeProcessLinux.cpp | 4 ++
.../commands/process/detach-resumes/Makefile | 4 ++
.../detach-resumes/TestDetachResumes.py | 59 +++++++++++++++++++
.../commands/process/detach-resumes/main.cpp | 48 +++++++++++++++
.../concurrent_vfork/TestConcurrentVFork.py | 16 -----
5 files changed, 115 insertions(+), 16 deletions(-)
create mode 100644 lldb/test/API/commands/process/detach-resumes/Makefile
create mode 100644 lldb/test/API/commands/process/detach-resumes/TestDetachResumes.py
create mode 100644 lldb/test/API/commands/process/detach-resumes/main.cpp
diff --git a/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp b/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
index 5d2b4b03fe60cb..59fc8726b76739 100644
--- a/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
+++ b/lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
@@ -1089,6 +1089,10 @@ Status NativeProcessLinux::Detach() {
   if (GetID() == LLDB_INVALID_PROCESS_ID)
     return error;
 
+  // Cancel out any SIGSTOPs we may have sent while stopping the process.
+  // Otherwise, the process may stop as soon as we detach from it.
+  kill(GetID(), SIGCONT);
+
   for (const auto &thread : m_threads) {
     Status e = Detach(thread->GetID());
     if (e.Fail())
diff --git a/lldb/test/API/commands/process/detach-resumes/Makefile b/lldb/test/API/commands/process/detach-resumes/Makefile
new file mode 100644
index 00000000000000..c46619c6623481
--- /dev/null
+++ b/lldb/test/API/commands/process/detach-resumes/Makefile
@@ -0,0 +1,4 @@
+CXX_SOURCES := main.cpp
+ENABLE_THREADS := YES
+
+include Makefile.rules
diff --git a/lldb/test/API/commands/process/detach-resumes/TestDetachResumes.py b/lldb/test/API/commands/process/detach-resumes/TestDetachResumes.py
new file mode 100644
index 00000000000000..57727294ddc3d3
--- /dev/null
+++ b/lldb/test/API/commands/process/detach-resumes/TestDetachResumes.py
@@ -0,0 +1,59 @@
+"""
+Test that the process continues running after we detach from it.
+"""
+
+import lldb
+import time
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+
+
+class DetachResumesTestCase(TestBase):
+    NO_DEBUG_INFO_TESTCASE = True
+
+    def test_detach_resumes(self):
+        self.build()
+        exe = self.getBuildArtifact()
+
+        # The inferior will use this file to let us know it is ready to be
+        # attached.
+        sync_file_path = lldbutil.append_to_process_working_directory(
+            self, "sync_file_%d" % (int(time.time()))
+        )
+
+        # And this one to let us know it is running after we've detached from
+        # it.
+        exit_file_path = lldbutil.append_to_process_working_directory(
+            self, "exit_file_%d" % (int(time.time()))
+        )
+
+        popen = self.spawnSubprocess(
+            self.getBuildArtifact(exe), [sync_file_path, exit_file_path]
+        )
+        lldbutil.wait_for_file_on_target(self, sync_file_path)
+
+        self.runCmd("process attach -p " + str(popen.pid))
+
+        # Set a breakpoint at a place that will be called by multiple threads
+        # simultaneously. On systems (e.g. linux) where the debugger needs to
+        # send signals to suspend threads, these signals will race with threads
+        # hitting the breakpoint (and stopping on their own).
+        bpno = lldbutil.run_break_set_by_symbol(self, "break_here")
+
+        # And let the inferior know it can call the function.
+        self.runCmd("expr -- wait_for_debugger_flag = false")
+
+        self.runCmd("continue")
+
+        self.expect(
+            "thread list",
+            STOPPED_DUE_TO_BREAKPOINT,
+            substrs=["stopped", "stop reason = breakpoint"],
+        )
+
+        # Detach, the process should keep running after this, and not be stopped
+        # by the signals that the debugger may have used to suspend the threads.
+        self.runCmd("detach")
+
+        lldbutil.wait_for_file_on_target(self, exit_file_path)
diff --git a/lldb/test/API/commands/process/detach-resumes/main.cpp b/lldb/test/API/commands/process/detach-resumes/main.cpp
new file mode 100644
index 00000000000000..e8050fef2c3850
--- /dev/null
+++ b/lldb/test/API/commands/process/detach-resumes/main.cpp
@@ -0,0 +1,48 @@
+#include "pseudo_barrier.h"
+#include <chrono>
+#include <fcntl.h>
+#include <fstream>
+#include <stdio.h>
+#include <thread>
+#include <vector>
+
+pseudo_barrier_t barrier;
+
+constexpr size_t nthreads = 5;
+volatile bool wait_for_debugger_flag = true;
+
+void break_here() {}
+
+void tfunc() {
+  pseudo_barrier_wait(barrier);
+
+  break_here();
+}
+
+int main(int argc, char const *argv[]) {
+  lldb_enable_attach();
+
+  if (argc < 3)
+    return 1;
+
+  // Create a file to signal that this process has started up.
+  std::ofstream(argv[1]).close();
+
+  // And wait for it to attach.
+  for (int i = 0; i < 100 && wait_for_debugger_flag; ++i)
+    std::this_thread::sleep_for(std::chrono::seconds(1));
+
+  // Fire up the threads and have them call break_here() simultaneously.
+  pseudo_barrier_init(barrier, nthreads);
+  std::vector<std::thread> threads;
+  for (size_t i = 0; i < nthreads; ++i)
+    threads.emplace_back(tfunc);
+
+  for (std::thread &t : threads)
+    t.join();
+
+  // Create the file to let the debugger know we're running.
+  std::ofstream(argv[2]).close();
+
+  return 0;
+}
diff --git a/lldb/test/API/functionalities/fork/concurrent_vfork/TestConcurrentVFork.py b/lldb/test/API/functionalities/fork/concurrent_vfork/TestConcurrentVFork.py
index 1790bd497f4e6b..2dcbb728549fb4 100644
--- a/lldb/test/API/functionalities/fork/concurrent_vfork/TestConcurrentVFork.py
+++ b/lldb/test/API/functionalities/fork/concurrent_vfork/TestConcurrentVFork.py
@@ -48,8 +48,6 @@ def follow_child_helper(self, use_fork, call_exec):
         self.expect("continue", patterns=[r"exited with status = 1[0-4]"])
 
     @skipUnlessPlatform(["linux"])
-    # See https://github.com/llvm/llvm-project/issues/85084.
-    @skipIf(oslist=["linux"], archs=["aarch64", "arm"])
     def test_follow_parent_vfork_no_exec(self):
         """
         Make sure that debugging concurrent vfork() from multiple threads won't crash lldb during follow-parent.
@@ -58,8 +56,6 @@ def test_follow_parent_vfork_no_exec(self):
         self.follow_parent_helper(use_fork=False, call_exec=False)
 
     @skipUnlessPlatform(["linux"])
-    # See https://github.com/llvm/llvm-project/issues/85084.
-    @skipIf(oslist=["linux"], archs=["aarch64", "arm"])
     def test_follow_parent_fork_no_exec(self):
         """
         Make sure that debugging concurrent fork() from multiple threads won't crash lldb during follow-parent.
@@ -68,8 +64,6 @@ def test_follow_parent_fork_no_exec(self):
         self.follow_parent_helper(use_fork=True, call_exec=False)
 
     @skipUnlessPlatform(["linux"])
-    # See https://github.com/llvm/llvm-project/issues/85084.
-    @skipIf(oslist=["linux"], archs=["aarch64", "arm"])
     def test_follow_parent_vfork_call_exec(self):
         """
         Make sure that debugging concurrent vfork() from multiple threads won't crash lldb during follow-parent.
@@ -78,8 +72,6 @@ def test_follow_parent_vfork_call_exec(self):
         self.follow_parent_helper(use_fork=False, call_exec=True)
 
     @skipUnlessPlatform(["linux"])
-    # See https://github.com/llvm/llvm-project/issues/85084.
-    @skipIf(oslist=["linux"], archs=["aarch64", "arm"])
     def test_follow_parent_fork_call_exec(self):
         """
         Make sure that debugging concurrent vfork() from multiple threads won't crash lldb during follow-parent.
@@ -88,8 +80,6 @@ def test_follow_parent_fork_call_exec(self):
         self.follow_parent_helper(use_fork=True, call_exec=True)
 
     @skipUnlessPlatform(["linux"])
-    # See https://github.com/llvm/llvm-project/issues/85084.
-    @skipIf(oslist=["linux"], archs=["aarch64", "arm"])
     def test_follow_child_vfork_no_exec(self):
         """
         Make sure that debugging concurrent vfork() from multiple threads won't crash lldb during follow-child.
@@ -98,8 +88,6 @@ def test_follow_child_vfork_no_exec(self):
         self.follow_child_helper(use_fork=False, call_exec=False)
 
     @skipUnlessPlatform(["linux"])
-    # See https://github.com/llvm/llvm-project/issues/85084.
-    @skipIf(oslist=["linux"], archs=["aarch64", "arm"])
     def test_follow_child_fork_no_exec(self):
         """
         Make sure that debugging concurrent fork() from multiple threads won't crash lldb during follow-child.
@@ -108,8 +96,6 @@ def test_follow_child_fork_no_exec(self):
         self.follow_child_helper(use_fork=True, call_exec=False)
 
     @skipUnlessPlatform(["linux"])
-    # See https://github.com/llvm/llvm-project/issues/85084.
-    @skipIf(oslist=["linux"], archs=["aarch64", "arm"])
     def test_follow_child_vfork_call_exec(self):
         """
         Make sure that debugging concurrent vfork() from multiple threads won't crash lldb during follow-child.
@@ -118,8 +104,6 @@ def test_follow_child_vfork_call_exec(self):
         self.follow_child_helper(use_fork=False, call_exec=True)
 
     @skipUnlessPlatform(["linux"])
-    # See https://github.com/llvm/llvm-project/issues/85084.
-    @skipIf(oslist=["linux"], archs=["aarch64", "arm"])
     def test_follow_child_fork_call_exec(self):
         """
         Make sure that debugging concurrent fork() from multiple threads won't crash lldb during follow-child.