[clang-tools-extra] 045695f - [clangd] Print current request context along with the stack trace

Sam McCall via cfe-commits cfe-commits at lists.llvm.org
Mon Oct 25 15:58:20 PDT 2021


Author: Emma Blink
Date: 2021-10-26T00:58:09+02:00
New Revision: 045695f85cb8cdf9b8373123749efe1781f337b4

URL: https://github.com/llvm/llvm-project/commit/045695f85cb8cdf9b8373123749efe1781f337b4
DIFF: https://github.com/llvm/llvm-project/commit/045695f85cb8cdf9b8373123749efe1781f337b4.diff

LOG: [clangd] Print current request context along with the stack trace

Motivation:

At the moment it is hard to attribute a clangd crash to a specific request out of all in-flight requests that might be processed concurrently. So before we can act on production clangd crashes, we have to do quite some digging through the log tables populated by our in-house VSCode extension or sometimes even directly reach out to the affected developer. Having all the details needed to reproduce a crash printed alongside its stack trace has a potential to save us quite some time, that could better be spent on fixing the actual problems.

Implementation approach:

* introduce `ThreadCrashReporter` class that allows to set a temporary signal handler for the current thread
* follow RAII pattern to simplify printing context for crashes occurring within a particular scope
* hold `std::function` as a handler to allow capturing context to print
* set local `ThreadCrashReporter` within `JSONTransport::loop()` to print request JSON for main thread crashes, and in `ASTWorker::run()` to print the file paths, arguments and contents for worker thread crashes

`ThreadCrashReporter` currently allows only one active handler per thread, but the approach can be extended to support stacked handlers printing context incrementally.

Example output for main thread crashes:

  ```
  ...
  #15 0x00007f7ddc819493 __libc_start_main (/lib64/libc.so.6+0x23493)
  #16 0x000000000249775e _start (/home/emmablink/local/llvm-project/build/bin/clangd+0x249775e)
  Signalled while processing message:
  {"jsonrpc": "2.0", "method": "textDocument/didOpen", "params": {"textDocument": {"uri": "file:///home/emmablink/test.cpp", "languageId": "cpp", "version": 1, "text": "template <typename>\nclass Bar {\n  Bar<int> *variables_to_modify;\n  foo() {\n    for (auto *c : *variables_to_modify)\n      delete c;\n  }\n};\n"}}}
  ```

Example output for AST worker crashes:

  ```
  ...
  #41 0x00007fb18304c14a start_thread pthread_create.c:0:0
  #42 0x00007fb181bfcdc3 clone (/lib64/libc.so.6+0xfcdc3)
  Signalled during AST action:
  Filename: test.cpp
  Directory: /home/emmablink
  Command Line: /usr/bin/clang -resource-dir=/data/users/emmablink/llvm-project/build/lib/clang/14.0.0 -- /home/emmablink/test.cpp
  Version: 1
  Contents:
  template <typename>
  class Bar {
    Bar<int> *variables_to_modify;
    foo() {
      for (auto *c : *variables_to_modify)
        delete c;
    }
  };
  ```

Testing:

The unit test covers the thread-localitity and nesting aspects of `ThreadCrashReporter`. There might be way to set up a lit-based integration test that would spawn clangd, send a message to it, signal it immediately and check the standard output, but this might be prone to raceconditions.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D109506

Added: 
    clang-tools-extra/clangd/support/ThreadCrashReporter.cpp
    clang-tools-extra/clangd/support/ThreadCrashReporter.h
    clang-tools-extra/clangd/test/crash.test
    clang-tools-extra/clangd/unittests/ThreadCrashReporterTests.cpp

Modified: 
    clang-tools-extra/clangd/JSONTransport.cpp
    clang-tools-extra/clangd/TUScheduler.cpp
    clang-tools-extra/clangd/support/CMakeLists.txt
    clang-tools-extra/clangd/tool/ClangdMain.cpp
    clang-tools-extra/clangd/unittests/CMakeLists.txt

Removed: 
    


################################################################################
diff  --git a/clang-tools-extra/clangd/JSONTransport.cpp b/clang-tools-extra/clangd/JSONTransport.cpp
index 49cd4e0903e3d..fd405c192591b 100644
--- a/clang-tools-extra/clangd/JSONTransport.cpp
+++ b/clang-tools-extra/clangd/JSONTransport.cpp
@@ -10,9 +10,11 @@
 #include "support/Cancellation.h"
 #include "support/Logger.h"
 #include "support/Shutdown.h"
+#include "support/ThreadCrashReporter.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/Support/Errno.h"
 #include "llvm/Support/Error.h"
+#include "llvm/Support/Threading.h"
 #include <system_error>
 
 namespace clang {
@@ -109,6 +111,11 @@ class JSONTransport : public Transport {
         return llvm::errorCodeToError(
             std::error_code(errno, std::system_category()));
       if (readRawMessage(JSON)) {
+        ThreadCrashReporter ScopedReporter([&JSON]() {
+          auto &OS = llvm::errs();
+          OS << "Signalled while processing message:\n";
+          OS << JSON << "\n";
+        });
         if (auto Doc = llvm::json::parse(JSON)) {
           vlog(Pretty ? "<<< {0:2}\n" : "<<< {0}\n", *Doc);
           if (!handleMessage(std::move(*Doc), Handler))

diff  --git a/clang-tools-extra/clangd/TUScheduler.cpp b/clang-tools-extra/clangd/TUScheduler.cpp
index bbcda5bc0dfb9..8d696b00084fe 100644
--- a/clang-tools-extra/clangd/TUScheduler.cpp
+++ b/clang-tools-extra/clangd/TUScheduler.cpp
@@ -58,6 +58,7 @@
 #include "support/Logger.h"
 #include "support/MemoryTree.h"
 #include "support/Path.h"
+#include "support/ThreadCrashReporter.h"
 #include "support/Threading.h"
 #include "support/Trace.h"
 #include "clang/Frontend/CompilerInvocation.h"
@@ -76,6 +77,7 @@
 #include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/Threading.h"
+#include "llvm/Support/raw_ostream.h"
 #include <algorithm>
 #include <atomic>
 #include <chrono>
@@ -902,6 +904,40 @@ void ASTWorker::runWithAST(
   startTask(Name, std::move(Task), /*Update=*/None, Invalidation);
 }
 
+/// To be called from ThreadCrashReporter's signal handler.
+static void crashDumpCompileCommand(llvm::raw_ostream &OS,
+                                    const tooling::CompileCommand &Command) {
+  OS << "  Filename: " << Command.Filename << "\n";
+  OS << "  Directory: " << Command.Directory << "\n";
+  OS << "  Command Line:";
+  for (auto &Arg : Command.CommandLine) {
+    OS << " " << Arg;
+  }
+  OS << "\n";
+}
+
+/// To be called from ThreadCrashReporter's signal handler.
+static void crashDumpFileContents(llvm::raw_ostream &OS,
+                                  const std::string &Contents) {
+  // Avoid flooding the terminal with source code by default, but allow clients
+  // to opt in. Use an env var to preserve backwards compatibility of the
+  // command line interface, while allowing it to be set outside the clangd
+  // launch site for more flexibility.
+  if (getenv("CLANGD_CRASH_DUMP_SOURCE")) {
+    OS << "  Contents:\n";
+    OS << Contents << "\n";
+  }
+}
+
+/// To be called from ThreadCrashReporter's signal handler.
+static void crashDumpParseInputs(llvm::raw_ostream &OS,
+                                 const ParseInputs &FileInputs) {
+  auto &Command = FileInputs.CompileCommand;
+  crashDumpCompileCommand(OS, Command);
+  OS << "  Version: " << FileInputs.Version << "\n";
+  crashDumpFileContents(OS, FileInputs.Contents);
+}
+
 void PreambleThread::build(Request Req) {
   assert(Req.CI && "Got preamble request with null compiler invocation");
   const ParseInputs &Inputs = Req.Inputs;
@@ -932,6 +968,11 @@ void PreambleThread::build(Request Req) {
          FileName, Inputs.Version, LatestBuild->Version);
   }
 
+  ThreadCrashReporter ScopedReporter([&Inputs]() {
+    llvm::errs() << "Signalled while building preamble\n";
+    crashDumpParseInputs(llvm::errs(), Inputs);
+  });
+
   LatestBuild = clang::clangd::buildPreamble(
       FileName, *Req.CI, Inputs, StoreInMemory,
       [this, Version(Inputs.Version)](ASTContext &Ctx,
@@ -1288,6 +1329,11 @@ void ASTWorker::run() {
         Status.ASTActivity.Name = CurrentRequest->Name;
       });
       WithContext WithProvidedContext(ContextProvider(FileName));
+      ThreadCrashReporter ScopedReporter([this]() {
+        const char *Name = CurrentRequest ? CurrentRequest->Name.c_str() : "";
+        llvm::errs() << "Signalled during AST worker action: " << Name << "\n";
+        crashDumpParseInputs(llvm::errs(), FileInputs);
+      });
       CurrentRequest->Action();
     }
 
@@ -1613,6 +1659,11 @@ void TUScheduler::runWithPreamble(llvm::StringRef Name, PathRef File,
                Ctx = Context::current().derive(kFileBeingProcessed,
                                                std::string(File)),
                Action = std::move(Action), this]() mutable {
+    ThreadCrashReporter ScopedReporter([&Name, &Contents, &Command]() {
+      llvm::errs() << "Signalled during preamble action: " << Name << "\n";
+      crashDumpCompileCommand(llvm::errs(), Command);
+      crashDumpFileContents(llvm::errs(), Contents);
+    });
     std::shared_ptr<const PreambleData> Preamble;
     if (Consistency == PreambleConsistency::Stale) {
       // Wait until the preamble is built for the first time, if preamble

diff  --git a/clang-tools-extra/clangd/support/CMakeLists.txt b/clang-tools-extra/clangd/support/CMakeLists.txt
index 681505586ca9d..0c8c199dd4a4c 100644
--- a/clang-tools-extra/clangd/support/CMakeLists.txt
+++ b/clang-tools-extra/clangd/support/CMakeLists.txt
@@ -24,6 +24,7 @@ add_clang_library(clangdSupport
   MemoryTree.cpp
   Path.cpp
   Shutdown.cpp
+  ThreadCrashReporter.cpp
   Threading.cpp
   ThreadsafeFS.cpp
   Trace.cpp

diff  --git a/clang-tools-extra/clangd/support/ThreadCrashReporter.cpp b/clang-tools-extra/clangd/support/ThreadCrashReporter.cpp
new file mode 100644
index 0000000000000..e67900e6916d9
--- /dev/null
+++ b/clang-tools-extra/clangd/support/ThreadCrashReporter.cpp
@@ -0,0 +1,54 @@
+//===--- ThreadCrashReporter.cpp - Thread local signal handling --*- C++-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "support/ThreadCrashReporter.h"
+#include "llvm/Support/Signals.h"
+#include "llvm/Support/ThreadLocal.h"
+
+namespace clang {
+namespace clangd {
+
+static thread_local ThreadCrashReporter *CurrentReporter = nullptr;
+
+void ThreadCrashReporter::runCrashHandlers() {
+  // No signal handling is done here on top of what AddSignalHandler() does:
+  // on Windows the signal handling is implmented via
+  // SetUnhandledExceptionFilter() which is thread-directed, and on Unix
+  // platforms the handlers are only called for KillSigs out of which only
+  // SIGQUIT seems to be process-directed and would be delivered to any thread
+  // that is not blocking it, but if the thread it gets delivered to has a
+  // ThreadCrashReporter installed during the interrupt — it seems reasonable to
+  // let it run and print the thread's context information.
+
+  // Call the reporters in LIFO order.
+  ThreadCrashReporter *Reporter = CurrentReporter;
+  while (Reporter) {
+    Reporter->Callback();
+    Reporter = Reporter->Next;
+  }
+}
+
+ThreadCrashReporter::ThreadCrashReporter(SignalCallback ThreadLocalCallback)
+    : Callback(std::move(ThreadLocalCallback)), Next(nullptr) {
+  this->Next = CurrentReporter;
+  CurrentReporter = this;
+  // Don't reorder subsequent operations: whatever comes after might crash and
+  // we want the the crash handler to see the reporter values we just set.
+  std::atomic_signal_fence(std::memory_order_seq_cst);
+}
+
+ThreadCrashReporter::~ThreadCrashReporter() {
+  assert(CurrentReporter == this);
+  CurrentReporter = this->Next;
+  // Don't reorder subsequent operations: whatever comes after might crash and
+  // we want the the crash handler to see the reporter values we just set.
+  std::atomic_signal_fence(std::memory_order_seq_cst);
+}
+
+} // namespace clangd
+} // namespace clang

diff  --git a/clang-tools-extra/clangd/support/ThreadCrashReporter.h b/clang-tools-extra/clangd/support/ThreadCrashReporter.h
new file mode 100644
index 0000000000000..eaba6ccdbf24f
--- /dev/null
+++ b/clang-tools-extra/clangd/support/ThreadCrashReporter.h
@@ -0,0 +1,58 @@
+//===--- ThreadCrashReporter.h - Thread local signal handling ----*- C++-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_SUPPORT_THREADCRASHREPORTER_H
+#define LLVM_CLANG_TOOLS_EXTRA_CLANGD_SUPPORT_THREADCRASHREPORTER_H
+
+#include "llvm/ADT/FunctionExtras.h"
+
+namespace clang {
+namespace clangd {
+
+/// Allows setting per-thread abort/kill signal callbacks, to print additional
+/// information about the crash depending on which thread got signalled.
+class ThreadCrashReporter {
+public:
+  using SignalCallback = llvm::unique_function<void(void)>;
+
+  /// Registers the callback as the first one in thread-local callback chain.
+  ///
+  /// Asserts if the current thread's callback is already set. The callback is
+  /// likely to be invoked in a signal handler. Most LLVM signal handling is not
+  /// strictly async-signal-safe. However reporters should avoid accessing data
+  /// structures likely to be in a bad state on crash.
+  ThreadCrashReporter(SignalCallback ThreadLocalCallback);
+  /// Resets the current thread's callback to nullptr.
+  ~ThreadCrashReporter();
+
+  /// Moves are disabled to ease nesting and escaping considerations.
+  ThreadCrashReporter(ThreadCrashReporter &&RHS) = delete;
+  ThreadCrashReporter(const ThreadCrashReporter &) = delete;
+  ThreadCrashReporter &operator=(ThreadCrashReporter &&) = delete;
+  ThreadCrashReporter &operator=(const ThreadCrashReporter &) = delete;
+
+  /// Calls all currently-active ThreadCrashReporters for the current thread.
+  ///
+  /// To be called from sys::AddSignalHandler() callback. Any signal filtering
+  /// is the responsibility of the caller. While this function is intended to be
+  /// called from signal handlers, it is not strictly async-signal-safe - see
+  /// constructor comment.
+  ///
+  /// When several reporters are nested, the callbacks are called in LIFO order.
+  static void runCrashHandlers();
+
+private:
+  SignalCallback Callback;
+  /// Points to the next reporter up the stack.
+  ThreadCrashReporter *Next;
+};
+
+} // namespace clangd
+} // namespace clang
+
+#endif

diff  --git a/clang-tools-extra/clangd/test/crash.test b/clang-tools-extra/clangd/test/crash.test
new file mode 100644
index 0000000000000..1197e1ab07c37
--- /dev/null
+++ b/clang-tools-extra/clangd/test/crash.test
@@ -0,0 +1,5 @@
+# Overflow the recursive json parser, prevent `yes` error due to broken pipe and `clangd` SIGSEGV from being treated as a failure.
+# RUN: (yes '[' || :) | head -n 50000 | (clangd --input-style=delimited 2>&1 || :) | FileCheck %s
+#      CHECK: Signalled while processing message:
+# CHECK-NEXT: [
+# CHECK-NEXT: [

diff  --git a/clang-tools-extra/clangd/tool/ClangdMain.cpp b/clang-tools-extra/clangd/tool/ClangdMain.cpp
index 103966dd08620..2e48bfb433f0c 100644
--- a/clang-tools-extra/clangd/tool/ClangdMain.cpp
+++ b/clang-tools-extra/clangd/tool/ClangdMain.cpp
@@ -24,6 +24,7 @@
 #include "refactor/Rename.h"
 #include "support/Path.h"
 #include "support/Shutdown.h"
+#include "support/ThreadCrashReporter.h"
 #include "support/ThreadsafeFS.h"
 #include "support/Trace.h"
 #include "clang/Format/Format.h"
@@ -679,6 +680,8 @@ int main(int argc, char *argv[]) {
 
   llvm::InitializeAllTargetInfos();
   llvm::sys::PrintStackTraceOnErrorSignal(argv[0]);
+  llvm::sys::AddSignalHandler(
+      [](void *) { ThreadCrashReporter::runCrashHandlers(); }, nullptr);
   llvm::sys::SetInterruptFunction(&requestShutdown);
   llvm::cl::SetVersionPrinter([](llvm::raw_ostream &OS) {
     OS << versionString() << "\n"

diff  --git a/clang-tools-extra/clangd/unittests/CMakeLists.txt b/clang-tools-extra/clangd/unittests/CMakeLists.txt
index 8fc4cf414bdb1..3c17bcdbc17a4 100644
--- a/clang-tools-extra/clangd/unittests/CMakeLists.txt
+++ b/clang-tools-extra/clangd/unittests/CMakeLists.txt
@@ -88,6 +88,7 @@ add_unittest(ClangdUnitTests ClangdTests
   TestIndex.cpp
   TestTU.cpp
   TestWorkspace.cpp
+  ThreadCrashReporterTests.cpp
   TidyProviderTests.cpp
   TypeHierarchyTests.cpp
   URITests.cpp

diff  --git a/clang-tools-extra/clangd/unittests/ThreadCrashReporterTests.cpp b/clang-tools-extra/clangd/unittests/ThreadCrashReporterTests.cpp
new file mode 100644
index 0000000000000..0eb8d472c6702
--- /dev/null
+++ b/clang-tools-extra/clangd/unittests/ThreadCrashReporterTests.cpp
@@ -0,0 +1,78 @@
+///===- ThreadCrashReporterTests.cpp - Thread local signal handling tests -===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "support/ThreadCrashReporter.h"
+#include "support/Threading.h"
+#include "llvm/Support/Signals.h"
+#include "gtest/gtest.h"
+#include <csignal>
+#include <string>
+
+namespace clang {
+namespace clangd {
+
+namespace {
+
+static void infoSignalHandler() { ThreadCrashReporter::runCrashHandlers(); }
+
+TEST(ThreadCrashReporterTest, All) {
+#if defined(_WIN32)
+  // Simulate signals on Windows for unit testing purposes.
+  // The `crash.test` lit test checks the end-to-end integration.
+  auto SignalCurrentThread = []() { infoSignalHandler(); };
+#else
+  llvm::sys::SetInfoSignalFunction(&infoSignalHandler);
+  auto SignalCurrentThread = []() { raise(SIGUSR1); };
+#endif
+
+  AsyncTaskRunner Runner;
+  auto SignalAnotherThread = [&]() {
+    Runner.runAsync("signal another thread", SignalCurrentThread);
+    Runner.wait();
+  };
+
+  bool Called;
+  {
+    ThreadCrashReporter ScopedReporter([&Called]() { Called = true; });
+    // Check handler gets called when a signal gets delivered to the current
+    // thread.
+    Called = false;
+    SignalCurrentThread();
+    EXPECT_TRUE(Called);
+
+    // Check handler does not get called when another thread gets signalled.
+    Called = false;
+    SignalAnotherThread();
+    EXPECT_FALSE(Called);
+  }
+  // Check handler does not get called when the reporter object goes out of
+  // scope.
+  Called = false;
+  SignalCurrentThread();
+  EXPECT_FALSE(Called);
+
+  std::string Order = "";
+  {
+    ThreadCrashReporter ScopedReporter([&Order] { Order.push_back('a'); });
+    {
+      ThreadCrashReporter ScopedReporter([&Order] { Order.push_back('b'); });
+      SignalCurrentThread();
+    }
+    // Check that handlers are called in LIFO order.
+    EXPECT_EQ(Order, "ba");
+
+    // Check that current handler is the only one after the nested scope is
+    // over.
+    SignalCurrentThread();
+    EXPECT_EQ(Order, "baa");
+  }
+}
+
+} // namespace
+} // namespace clangd
+} // namespace clang


        


More information about the cfe-commits mailing list