[PATCH] D42573: [wip] The new threading implementation

Sam McCall via Phabricator via cfe-commits <cfe-commits at lists.llvm.org>
Fri Jan 26 11:49:55 PST 2018


sammccall added a comment.

OK, here's a braindump, probably nothing surprising after our offline discussion.

TL;DR: the only design I *know* is extensible in ways we care about is the one that basically shards Scheduler into a bunch of per-TU schedulers.
I think that's probably the way to go, but if you can show how to cleanly extend one of the other options, we should compare those on the merits.

I think if we do adopt and stick with the sharding design, then a public API that reflects it is slightly nicer overall.
But it's not that important to me, tastes may vary, and above all I don't want to block the API patch on coming to a decision on the question above.
So I think we should agree on a name and then land the API patch with the interface it has now.



================
Comment at: clangd/ClangdServer.cpp:192
+
 Scheduler::Scheduler(unsigned AsyncThreadsCount, bool StorePreamblesInMemory,
                      ASTParsedCallback ASTCallback)
----------------
So... scheduler and the queue abstraction.
This works pretty well for the functionality that's in this patch, which is at least as good as what's in the live version.
However, I think we will want to e.g. debounce diagnostic requests, and likely other timing-related behavior; indexing, for instance, is a nice candidate for debouncing *and* preemption.

(Current diagnostics behavior: we compute and send diagnostics for the syntax error introduced by the first keystroke. This is wasted load, and increases latency for diagnostics on the *last* keystroke, since we have to wait for the previous one to finish)

So we should consider some other options...


================
Comment at: clangd/ClangdServer.cpp:193
 Scheduler::Scheduler(unsigned AsyncThreadsCount, bool StorePreamblesInMemory,
                      ASTParsedCallback ASTCallback)
+    : Data{StorePreamblesInMemory, std::make_shared<PCHContainerOperations>(),
----------------
Option 1: Put all timing-related logic in scheduler.

I think this probably means Scheduler itself gets a thread whose job it is to wake up at certain times and schedule or cancel tasks on the queues.
`update` would set an alarm to add a task on the queue, which would be invalidated by any subsequent update. The easiest way to implement this is probably to track a version number that increments on update calls and is captured by the alarm.

This gets complicated by reads that arrive after updates that haven't finished yet - you need to force the update to happen now, so you have to schedule the task immediately and prevent the alarm from scheduling it again.
In general, it seems like doing nontrivial things this way ends up with a bunch of interacting actors and shared data structures that are hard to reason about and verify.
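To make the version-number trick concrete, here's a minimal sketch (names are illustrative, not the patch's API): each `update()` bumps a counter, the alarm captures the version it was scheduled for, and the task only runs if no newer update arrived in the meantime.

```cpp
#include <atomic>
#include <functional>

// Illustrative sketch of the version-number debounce described above.
class DebouncedUpdates {
public:
  // Called on each update(); returns the version token the alarm captures.
  unsigned scheduleUpdate() { return ++Version; }

  // Called when the alarm for CapturedVersion fires. The task runs only if
  // no later update() superseded it. Returns whether the task ran.
  bool onAlarm(unsigned CapturedVersion, std::function<void()> Task) {
    if (CapturedVersion != Version.load())
      return false; // A newer update arrived; drop this stale alarm.
    Task();
    return true;
  }

private:
  std::atomic<unsigned> Version{0};
};
```

With two rapid updates, the first alarm sees a stale version and is dropped; only the alarm for the latest update actually runs its task.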


================
Comment at: clangd/ClangdServer.cpp:194
                      ASTParsedCallback ASTCallback)
-    : Units(StorePreamblesInMemory, std::make_shared<PCHContainerOperations>(),
-            std::move(ASTCallback)),
+    : Data{StorePreamblesInMemory, std::make_shared<PCHContainerOperations>(),
+           std::move(ASTCallback)},
----------------
Option 2: Share timing-related additions between scheduler and queue, but keep the tasks opaque to the queue.
This means extending the queue abstraction: e.g. instead of bare tasks, it would understand (delay, priority, task) tuples.

This isn't all that different from option 1, but moving some of the pieces out of scheduler might reduce the ball of hair to a manageable size, and add some conceptual clarity.
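A rough sketch of what the extended Queue interface might look like (names and ordering policy are my invention, just to make the tuple idea concrete): entries become ready after their delay expires, and among ready entries the earliest ready-time wins, with priority breaking ties.

```cpp
#include <chrono>
#include <functional>
#include <queue>

// Illustrative sketch of a queue that understands (delay, priority, task)
// tuples rather than bare tasks.
class TimedQueue {
  using Clock = std::chrono::steady_clock;
  struct Entry {
    Clock::time_point ReadyAt;
    int Priority; // Higher runs first among entries ready at the same time.
    std::function<void()> Task;
    bool operator<(const Entry &O) const {
      // std::priority_queue pops the largest element, so invert: earlier
      // ReadyAt is "larger"; on ties, higher Priority is "larger".
      if (ReadyAt != O.ReadyAt)
        return ReadyAt > O.ReadyAt;
      return Priority < O.Priority;
    }
  };
  std::priority_queue<Entry> Entries;

public:
  void add(Clock::duration Delay, int Priority, std::function<void()> Task) {
    Entries.push({Clock::now() + Delay, Priority, std::move(Task)});
  }

  // Runs the next entry whose delay has expired, if any.
  bool runOneReady() {
    if (Entries.empty() || Entries.top().ReadyAt > Clock::now())
      return false;
    auto Task = Entries.top().Task; // Copy; fine for a sketch.
    Entries.pop();
    Task();
    return true;
  }
};
```

A real version would need a worker loop that sleeps until the next ReadyAt and can be woken by `add()`, but this shows the shape of the data the queue would have to understand.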


================
Comment at: clangd/ClangdServer.cpp:195
+    : Data{StorePreamblesInMemory, std::make_shared<PCHContainerOperations>(),
+           std::move(ASTCallback)},
       Executor(AsyncThreadsCount) {}
----------------
Option 3: extend the responsibilities of Queue so this is naturally its concern.
It needs to understand the timing, dependencies between tasks, and it probably makes sense to know when they become redundant etc too.

The easiest way to express this is that Queue owns a thread and implements the run-loop for actions on a single TU, and Scheduler's operations basically delegate to Queue. 

FWIW, this can be transformed into a version without dedicated threads. First we write the thread version in this form:

```
void runLoop() {
  while (!Shutdown) {
    // decideWhatToDo() yields either an action to run now, or a sleep duration.
    auto NextThing = decideWhatToDo();
    if (NextThing.Action)
      NextThing.Action();
    else
      Interrupt.wait(NextThing.SleepDuration); // any update(), withAST() etc. interrupts
  }
}
```

I don't think it'd be hard to write a threadpool-based executor that runs "workloads" defined by their decideWhatToDo() function and their interactions with the `interrupt` condition variable. (I don't know exactly what a nice API would look like, but details...)

That said, I definitely think the thread/runloop based one is an easier place to start, and suspect it would be good enough.


================
Comment at: clangd/ClangdServer.cpp:196
+           std::move(ASTCallback)},
       Executor(AsyncThreadsCount) {}
 
----------------
My conclusion (feel free to disagree!)
We should not add timing functionality in the first patch, but we should strongly prefer a design that we know can accommodate it.

 - Option 1 looks pretty messy/complex to me. If you want to go this route, I'd really like to see a prototype with timing stuff.
 - Option 2 may be workable; I can't say. If you can describe the right extension to Queue, this might work. I worry about baking in an abstraction that's either too general to have a simple implementation, or too specific and in need of extension later, but let's discuss.
 - Option 3 seems to be obviously implementable with the timing extensions. It also seems pretty flexible to modify in the future, without high abstraction barriers between policy and implementation.

My preference/advice would be to take option 3 now: I think it's at least as good as the others and doesn't need additional research. Of course I'm not opposed if you want to dig into options 1/2 and show that they can work.

(Separately, I'd prefer to initially use std::thread rather than a threadpool abstraction for the reasons laid out in the previous comment. Especially for option 3, where the API for the threadpool equivalent isn't obvious or standard. But *mostly* these questions are orthogonal)


================
Comment at: clangd/threading/Cancellation.h:33
+
+  bool isCancelled() const {
+    assert(WasCancelled && "the object was moved");
----------------
This is a nice little class!

I think callsites will look more natural if this is spelled `operator bool()`: you can call the variable `Cancelled` and write `if (Cancelled)` and `Cancelled.set()`.

Maybe there's a case for spelling `set` as `operator=` too, and asserting that the arg is true, but I'm less sure of that one!
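A quick sketch of the suggested spelling (my reconstruction, not the patch's exact class; the real one would presumably use an atomic and keep the move-detecting assert):

```cpp
#include <cassert>
#include <memory>

// Illustrative sketch of the operator bool() spelling suggested above.
class CancellationFlag {
  // shared_ptr so copies of the flag observe the same state; a null
  // pointer indicates a moved-from object, mirroring the patch's assert.
  std::shared_ptr<bool> WasCancelled = std::make_shared<bool>(false);

public:
  void set() {
    assert(WasCancelled && "the object was moved");
    *WasCancelled = true;
  }
  explicit operator bool() const {
    assert(WasCancelled && "the object was moved");
    return *WasCancelled;
  }
};
```

Then a variable named `Cancelled` reads naturally at call sites: `if (Cancelled) return;` alongside `Cancelled.set()`.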


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D42573




