[llvm] 718fbbe - [llvm-exegesis] Kill processes that receive a signal (#86069)

via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 21 18:14:22 PDT 2024


Author: Aiden Grossman
Date: 2024-03-21T18:14:18-07:00
New Revision: 718fbbef5f18a2b7e7fc4f842b1452ae9bee581a

URL: https://github.com/llvm/llvm-project/commit/718fbbef5f18a2b7e7fc4f842b1452ae9bee581a
DIFF: https://github.com/llvm/llvm-project/commit/718fbbef5f18a2b7e7fc4f842b1452ae9bee581a.diff

LOG: [llvm-exegesis] Kill processes that receive a signal (#86069)

Before this patch, llvm-exegesis would leave behind child processes that
received signals such as segmentation faults. They would end up in a
signal-delivery-stop state under ptrace and never exit. This causes few,
if any, problems in llvm-exegesis itself, as they are cleaned up after
the main process exits, which usually happens quickly. However, in
downstream use, when many blocks are executed within a single process
(and many of them run into signals), these child processes stay around
and can easily exhaust the process limit on some systems.

This patch cleans them up by sending SIGKILL after information about the
signal that was sent has been gathered.
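
The sequence described above can be illustrated outside of llvm-exegesis.
The following is a minimal, Linux-only sketch and not the code from this
patch: the helper name runCrashingChildAndReap is hypothetical, and the
child opts into tracing with PTRACE_TRACEME as a simplification instead
of being attached to by the parent. It shows a traced child stopping in
signal-delivery-stop, the parent reading the signal information, and the
SIGKILL plus waitpid pair that leaves neither a stopped process nor a
zombie behind.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not part of llvm-exegesis). Forks a child that raises
// SIGSEGV while traced, gathers the signal info, then kills and reaps the
// child. Returns the signal number seen in the child, or -1 on any failure.
int runCrashingChildAndReap() {
  pid_t Child = fork();
  if (Child == -1)
    return -1;

  if (Child == 0) {
    // Child: ask to be traced by the parent, then fault. Under ptrace the
    // signal is not delivered; the child enters signal-delivery-stop.
    ptrace(PTRACE_TRACEME, 0, nullptr, nullptr);
    raise(SIGSEGV);
    _exit(0);
  }

  // Parent: wait for this specific child to stop on the signal.
  int Status = 0;
  if (waitpid(Child, &Status, 0) == -1 || !WIFSTOPPED(Status))
    return -1;

  // Gather information about the pending signal while the child is stopped.
  siginfo_t Info;
  if (ptrace(PTRACE_GETSIGINFO, Child, nullptr, &Info) == -1)
    return -1;

  // Without the next two calls the child would stay in signal-delivery-stop
  // until the parent exits. SIGKILL terminates it even while it is stopped.
  if (kill(Child, SIGKILL) == -1)
    return -1;

  // Reap the child so that it does not linger as a zombie.
  if (waitpid(Child, &Status, 0) == -1)
    return -1;
  if (!WIFSIGNALED(Status) || WTERMSIG(Status) != SIGKILL)
    return -1;

  return Info.si_signo;
}
```

Waiting on the specific PID rather than on any child keeps the parent
from accidentally reaping an unrelated process, which matters when one
parent runs many such children over its lifetime.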

Added: 
    

Modified: 
    llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
index 5c9848f3c68885..f0452605eb24bf 100644
--- a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
+++ b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
@@ -342,7 +342,7 @@ class SubProcessFunctionExecutorImpl
       return make_error<Failure>("Failed to attach to the child process: " +
                                  Twine(strerror(errno)));
 
-    if (wait(NULL) == -1) {
+    if (waitpid(ParentOrChildPID, NULL, 0) == -1) {
       return make_error<Failure>(
           "Failed to wait for child process to stop after attaching: " +
           Twine(strerror(errno)));
@@ -361,7 +361,7 @@ class SubProcessFunctionExecutorImpl
       return SendError;
 
     int ChildStatus;
-    if (wait(&ChildStatus) == -1) {
+    if (waitpid(ParentOrChildPID, &ChildStatus, 0) == -1) {
       return make_error<Failure>(
           "Waiting for the child process to complete failed: " +
           Twine(strerror(errno)));
@@ -401,6 +401,20 @@ class SubProcessFunctionExecutorImpl
                                  Twine(strerror(errno)));
     }
 
+    // Send SIGKILL rather than SIGTERM as the child process has no SIGTERM
+    // handlers to run, and sending SIGTERM would mean that ptrace forces the
+    // child to block in a signal-delivery-stop, both for the SIGSEGV/other
+    // signals and again upon exit.
+    if (kill(ParentOrChildPID, SIGKILL) == -1)
+      return make_error<Failure>("Failed to kill child benchmarking process: " +
+                                 Twine(strerror(errno)));
+
+    // Wait for the process to exit so that there are no zombie processes left
+    // around.
+    if (waitpid(ParentOrChildPID, NULL, 0) == -1)
+      return make_error<Failure>("Failed to wait for process to die: " +
+                                 Twine(strerror(errno)));
+
     if (ChildSignalInfo.si_signo == SIGSEGV)
       return make_error<SnippetSegmentationFault>(
           reinterpret_cast<intptr_t>(ChildSignalInfo.si_addr));


        

