[PATCH] D109506: [RFC] Print current request context along with the stack trance in clangd

Fri Sep 10 07:45:18 PDT 2021

0x1eaf added a comment.

Thank you for such a prompt and thorough review!

Please consider all the comments I haven't replied to acknowledged and I'm going to resolve the issues with the next update.

================
Comment at: clang-tools-extra/clangd/JSONTransport.cpp:117
+          OS << "Signalled while processing message:\n";
+          OS << JSON << "\r\n";
+        });
----------------
sammccall wrote:
> why \r?
the message itself is terminated with `\r\n` in the protocol and I just mimicked it for no good reason — will correct

================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:1381
+  auto &OS = llvm::errs();
+  OS << "Signalled during AST action:\n";
+  auto &Command = FileInputs.CompileCommand;
----------------
sammccall wrote:
> the name of the action is available as CurrentRequest->Action, which I think is safe to either pass into this function or read directly here.
I've noticed the the action name is always hardcoded as “Build AST ({version})” and just logged the version directly below, but I guess it's not forward compatible, so I'm going to log both…

================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:1392
+  OS << "Contents:\n";
+  OS << FileInputs.Contents << "\n";
+}
----------------
sammccall wrote:
> sammccall wrote:
> > hmm, this is potentially really useful (could gather full reproducers) but also is going to dominate the output and make it annoying to see the stack traces/other details. I'm worried this may make this feature net-negative for a majority of users, who won't find the right parts of the output to report...
> > 
> > How critical do you think this is for debugging crashes in practice? (I suspect fairly important!)
> > Since we're taking liberties with async-signal-safety anyway, it might be possible to get this sent to a tempfile, which might be more useful. In any case, we might want full file dumps to be off by default outside of environments where anyone is likely to actually try to get the reproducers.
> > 
> > I'd probably suggest leaving this out of the initial patch entirely so we can focus on getting the rest of the mechanism right.
> another thought along the lines of reproducers (but nothing for this patch)...
> 
> If we're working on a file and have a preamble, then the headers that make up the preamble are relevant. In practice it's hard to reproduce bug reports we get because we need all their headers. If we're using a preamble then just dumping all the filenames it includes would make it possible for some tool to take the crash report and zip up a reproducer.
> 
> (This probably needs the ability to run all of the handlers for the current thread, not just the top of the stack, since we acquire the relevant preamble later)
> How critical do you think this is for debugging crashes in practice?

For us it would likely be of utmost importance as our VSCode extension routes requests to [Glean](https://glean.software) until local clangd finishes the compilation, so clangd ends up mostly being used for modified files, and the crashes might be unreproducible on the original source…

> I'd probably suggest leaving this out of the initial patch entirely so we can focus on getting the rest of the mechanism right.

Alright, will remove the file contents with the next update.

> In any case, we might want full file dumps to be off by default outside of environments where anyone is likely to actually try to get the reproducers.

What would be the best way to re-add it in a subsequent revision: make it configurable via a command line option, or via an environment variable? And if it's just an env var, maybe we could include it conditionally in this revision?

> If we're using a preamble then just dumping all the filenames it includes would make it possible for some tool to take the crash report and zip up a reproducer.

We had discussions within the team about potentially implementing a similar tool, but the rough idea was to rely on VCS to collect the changes as the delta would normally be smaller than bundling all the headers that the TU includes. But if the VCS-agnostic approach would be simpler and more portable — I'm all for it :)

================
Comment at: clang-tools-extra/clangd/support/ThreadSignalHandler.cpp:15
+
+static llvm::sys::ThreadLocal<ThreadSignalHandlerCallback> CurrentCallback;
+
----------------
sammccall wrote:
> sammccall wrote:
> > sammccall wrote:
> > > we assume `thread_local` in clangd (see support/Context.ccp)
> > > 
> > > It's possible that llvm::sys::ThreadLocal happens to play nicer with signal handlers in practice (none of these are completely safe!), but unless we've seen some concrete evidence I'd rather just have `static thread_local ThreadSignalHandlerCallback*` here
> > we can quite easily support a stack of handlers, such that we dump *all* living handlers on the crashing thread.
> > 
> > just give each ThreadSignalHandler a `ThreadSignalHandler* Next = nullptr` member, and both the constructor and destructor `swap(Next, CurrentCallback)`. Then CurrentCallback points to the head of a linked list of handlers.
> > 
> > Not sure if you want to have that in this patch, but it's pretty simple so I wouldn't object.
> Sorry, just realized you already suggested the stacking in the patch description!
Having `Next` pointer chains makes me worried about out of order destruction, but if we are removing the move constructor and making the crash handler object scope-bound always, that could actually work. But if we want to keep the possibility of storing the handler in an object to preserve it across method calls, I guess we could just make this a doubly-linked list and cut out the link gracefully.

================
Comment at: clang-tools-extra/clangd/support/ThreadSignalHandler.cpp:36
+  registerSignalHandlerIfNeeded();
+  CurrentCallback.set(&Callback);
+}
----------------
sammccall wrote:
> we could consider `atomic_signal_fence` around these mutations, or making CurrentCallback atomic.
> 
> It doesn't get us out of all possible write-reordering problems (we could see inconsistent states of the data referred to by the handler) but may avoid some catastrophic failures
sure, will add a `seq_cst` one at the end, same as in `setCrashLogMessage()`

================
Comment at: clang-tools-extra/clangd/support/ThreadSignalHandler.h:16
+
+class ThreadSignalHandler {
+public:
----------------
sammccall wrote:
> I understand framing this as general signal handler machinery, but I think it might be easier to reason about if we give it a name more specific to its purpose. Like maybe `ThreadCrashReporter`?
> 
> The main reason is that in the general case signal handling calls to mind a bunch of concerns like async safety that we're able to be a bit cavalier about in the specific case where the signal is (probably) handled synchronously, we're crashing anyway, etc. We also don't really want to bother exposing info like *which* signal is being handled, but a "signal handler" interface looks strange without it.
makes total sense, will rename it

================
Comment at: clang-tools-extra/clangd/support/ThreadSignalHandler.h:21
+
+  ThreadSignalHandler(ThreadSignalHandler &&);
+  ThreadSignalHandler(const ThreadSignalHandler &) = delete;
----------------
sammccall wrote:
> I'm not quite clear on why we need to support move-construction (and the move-constructor's interaction with static state!)
I just parroted the `scope_exit` implementation — can remove it

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109506/new/

https://reviews.llvm.org/D109506