[Lldb-commits] [lldb] cef1e07 - [lldb] Fix that the embedded Python REPL crashes if it receives SIGINT

Raphael Isemann via lldb-commits lldb-commits at lists.llvm.org
Fri Nov 12 05:19:48 PST 2021


Author: Raphael Isemann
Date: 2021-11-12T14:19:15+01:00
New Revision: cef1e07cc6d00b5b429d77133201e1f404a8023c

URL: https://github.com/llvm/llvm-project/commit/cef1e07cc6d00b5b429d77133201e1f404a8023c
DIFF: https://github.com/llvm/llvm-project/commit/cef1e07cc6d00b5b429d77133201e1f404a8023c.diff

LOG: [lldb] Fix that the embedded Python REPL crashes if it receives SIGINT

When LLDB receives a SIGINT while running the embedded Python REPL it currently
just crashes in `ScriptInterpreterPythonImpl::Interrupt` with an error such as
the one below:

```

Fatal Python error: PyThreadState_Get: the function must be called with the GIL
held, but the GIL is released (the current Python thread state is NULL)

```

The faulty code that causes this error is this part of `ScriptInterpreterPythonImpl::Interrupt`:
```
    PyThreadState *state = PyThreadState_GET();
    if (!state)
      state = GetThreadState();
    if (state) {
      long tid = state->thread_id;
      PyThreadState_Swap(state);
      int num_threads = PyThreadState_SetAsyncExc(tid, PyExc_KeyboardInterrupt);
```

The obvious fix I tried is to just acquire the GIL before this code is running
which fixes the crash but the `KeyboardInterrupt` we want to raise immediately
is actually just queued and would only be raised once the next line of input has
been parsed (which e.g. won't interrupt Python code that is currently waiting on
a timer or IO from what I can see). Also none of the functions we call here is
marked as safe to be called from a signal handler from what I can see, so we
might still end up crashing here with some bad timing.

Python 3.2 introduced `PyErr_SetInterrupt` to solve this and the function takes
care of all the details and avoids doing anything that isn't safe to do inside a
signal handler. The only thing we need to do is to manually setup our own fake
SIGINT handler that behaves the same way as the standalone Python REPL signal
handler (which raises a KeyboardInterrupt).

>From what I understand the old code used to work with Python 2 so I kept the old
code around until we officially drop support for Python 2.

There is a small gap here with Python 3.0->3.1 where we might still be crashing,
but those versions have reached their EOL more than a decade ago so I think we
don't need to bother about them.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D104886

Added: 
    lldb/test/API/iohandler/sigint/TestIOHandlerPythonREPLSigint.py

Modified: 
    lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPython.cpp

Removed: 
    


################################################################################
diff  --git a/lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPython.cpp b/lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPython.cpp
index 418e2239a771e..8655338941b5a 100644
--- a/lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPython.cpp
+++ b/lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPython.cpp
@@ -1067,6 +1067,23 @@ void ScriptInterpreterPythonImpl::ExecuteInterpreterLoop() {
 }
 
 bool ScriptInterpreterPythonImpl::Interrupt() {
+  // PyErr_SetInterrupt was introduced in 3.2.
+#if (PY_MAJOR_VERSION == 3 && PY_MINOR_VERSION >= 2) || (PY_MAJOR_VERSION > 3)
+  // If the interpreter isn't evaluating any Python at the moment then return
+  // false to signal that this function didn't handle the interrupt and the
+  // next component should try handling it.
+  if (!IsExecutingPython())
+    return false;
+
+  // Tell Python that it should pretend to have received a SIGINT.
+  PyErr_SetInterrupt();
+  // PyErr_SetInterrupt has no way to return an error so we can only pretend the
+  // signal got successfully handled and return true.
+  // Python 3.10 introduces PyErr_SetInterruptEx that could return an error, but
+  // the error handling is limited to checking the arguments which would be
+  // just our (hardcoded) input signal code SIGINT, so that's not useful at all.
+  return true;
+#else
   Log *log(lldb_private::GetLogIfAllCategoriesSet(LIBLLDB_LOG_SCRIPT));
 
   if (IsExecutingPython()) {
@@ -1088,6 +1105,7 @@ bool ScriptInterpreterPythonImpl::Interrupt() {
             "ScriptInterpreterPythonImpl::Interrupt() python code not running, "
             "can't interrupt");
   return false;
+#endif
 }
 
 bool ScriptInterpreterPythonImpl::ExecuteOneLineWithReturn(
@@ -3293,6 +3311,28 @@ ScriptInterpreterPythonImpl::AcquireInterpreterLock() {
   return py_lock;
 }
 
+namespace {
+/// Saves the current signal handler for the specified signal and restores
+/// it at the end of the current scope.
+struct RestoreSignalHandlerScope {
+  /// The signal handler.
+  struct sigaction m_prev_handler;
+  int m_signal_code;
+  RestoreSignalHandlerScope(int signal_code) : m_signal_code(signal_code) {
+    // Initialize sigaction to their default state.
+    std::memset(&m_prev_handler, 0, sizeof(m_prev_handler));
+    // Don't install a new handler, just read back the old one.
+    struct sigaction *new_handler = nullptr;
+    int signal_err = ::sigaction(m_signal_code, new_handler, &m_prev_handler);
+    lldbassert(signal_err == 0 && "sigaction failed to read handler");
+  }
+  ~RestoreSignalHandlerScope() {
+    int signal_err = ::sigaction(m_signal_code, &m_prev_handler, nullptr);
+    lldbassert(signal_err == 0 && "sigaction failed to restore old handler");
+  }
+};
+} // namespace
+
 void ScriptInterpreterPythonImpl::InitializePrivate() {
   if (g_initialized)
     return;
@@ -3328,6 +3368,25 @@ void ScriptInterpreterPythonImpl::InitializePrivate() {
                      "lldb.embedded_interpreter; from "
                      "lldb.embedded_interpreter import run_python_interpreter; "
                      "from lldb.embedded_interpreter import run_one_line");
+
+#if (PY_MAJOR_VERSION == 3 && PY_MINOR_VERSION >= 2) || (PY_MAJOR_VERSION > 3)
+  // Python will not just overwrite its internal SIGINT handler but also the
+  // one from the process. Backup the current SIGINT handler to prevent that
+  // Python deletes it.
+  RestoreSignalHandlerScope save_sigint(SIGINT);
+
+  // Setup a default SIGINT signal handler that works the same way as the
+  // normal Python REPL signal handler which raises a KeyboardInterrupt.
+  // Also make sure to not pollute the user's REPL with the signal module nor
+  // our utility function.
+  PyRun_SimpleString("def lldb_setup_sigint_handler():\n"
+                     "  import signal;\n"
+                     "  def signal_handler(sig, frame):\n"
+                     "    raise KeyboardInterrupt()\n"
+                     "  signal.signal(signal.SIGINT, signal_handler);\n"
+                     "lldb_setup_sigint_handler();\n"
+                     "del lldb_setup_sigint_handler\n");
+#endif
 }
 
 void ScriptInterpreterPythonImpl::AddToSysPath(AddLocation location,

diff  --git a/lldb/test/API/iohandler/sigint/TestIOHandlerPythonREPLSigint.py b/lldb/test/API/iohandler/sigint/TestIOHandlerPythonREPLSigint.py
new file mode 100644
index 0000000000000..7b01919d39526
--- /dev/null
+++ b/lldb/test/API/iohandler/sigint/TestIOHandlerPythonREPLSigint.py
@@ -0,0 +1,73 @@
+"""
+Test sending SIGINT to the embedded Python REPL.
+"""
+
+import os
+
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test.lldbpexpect import PExpectTest
+
+class TestCase(PExpectTest):
+
+    mydir = TestBase.compute_mydir(__file__)
+
+    def start_python_repl(self):
+        """ Starts up the embedded Python REPL."""
+        self.launch()
+        # Start the embedded Python REPL via the 'script' command.
+        self.child.send("script -l python --\n")
+        # Wait for the Python REPL prompt.
+        self.child.expect(">>>")
+
+    # PExpect uses many timeouts internally and doesn't play well
+    # under ASAN on a loaded machine..
+    @skipIfAsan
+    def test_while_evaluating_code(self):
+        """ Tests SIGINT handling while Python code is being evaluated."""
+        self.start_python_repl()
+
+        # Start a long-running command that we try to abort with SIGINT.
+        # Note that we dont actually wait 10000s in this code as pexpect or
+        # lit will kill the test way before that.
+        self.child.send("import time; print('running' + 'now'); time.sleep(10000);\n")
+
+        # Make sure the command is actually being evaluated at the moment by
+        # looking at the string that the command is printing.
+        # Don't check for a needle that also occurs in the program itself to
+        # prevent that echoing will make this check pass unintentionally.
+        self.child.expect("runningnow")
+
+        # Send SIGINT to the LLDB process.
+        self.child.sendintr()
+
+        # This should get transformed to a KeyboardInterrupt which is the same
+        # behaviour as the standalone Python REPL. It should also interrupt
+        # the evaluation of our sleep statement.
+        self.child.expect("KeyboardInterrupt")
+        # Send EOF to quit the Python REPL.
+        self.child.sendeof()
+
+        self.quit()
+
+    # PExpect uses many timeouts internally and doesn't play well
+    # under ASAN on a loaded machine..
+    @skipIfAsan
+    # FIXME: On Linux the Python code that reads from stdin seems to block until
+    # it has finished reading a line before handling any queued signals.
+    @skipIfLinux
+    def test_while_waiting_on_input(self):
+        """ Tests SIGINT handling while the REPL is waiting on input from
+        stdin."""
+        self.start_python_repl()
+
+        # Send SIGINT to the LLDB process.
+        self.child.sendintr()
+        # This should get transformed to a KeyboardInterrupt which is the same
+        # behaviour as the standalone Python REPL.
+        self.child.expect("KeyboardInterrupt")
+        # Send EOF to quit the Python REPL.
+        self.child.sendeof()
+
+        self.quit()


        


More information about the lldb-commits mailing list