[PATCH] D42174: [clangd] Refactored threading in ClangdServer

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jan 22 09:13:20 PST 2018


sammccall added a comment.
Herald added a subscriber: ioeric.

So, I simultaneously think this is basically ready to land, and I want substantial changes :-)

This is much better already than what we have, and where I think we can further improve the design, this is a natural point on the way.

My preferred next steps would be:

- we discuss a little bit the directions I've outlined below, without the assumption that they belong in this patch.
- to the extent you agree, you make the less-invasive changes here (e.g. adding a VersionConstraint API, but only actually supporting the cases you've implemented)
- once we're on the same page for the other stuff, I'm happy to pick up any amount of it myself - whatever's not going to step on your toes



================
Comment at: clangd/ClangdServer.cpp:37
+template <class Ret, class Func>
+Ret waitForASTAction(Scheduler &S, PathRef File, Func &&F) {
+  std::packaged_task<Ret(llvm::Expected<ParsedAST &>)> Task(
----------------
Would be nice to have parallel names to the Scheduler methods, e.g. blockingASTRead() and blockingPreambleRead()


================
Comment at: clangd/ClangdServer.cpp:222
+  }
+  // We currently do all the reads of the AST on a running thread to avoid
+  // inconsistent states coming from subsequent updates.
----------------
Having trouble with this one.
is this the same as "We currently block the calling thread until the AST is available, to avoid..."?


================
Comment at: clangd/ClangdServer.h:164
 
+/// Handles running tasks for ClangdServer and managing the resources (e.g.,
+/// preambles and ASTs) for opened files.
----------------
This is a nice abstraction, so much better than dealing with Cppfile! A couple of observations:

1) all methods refer to a single file, neither the contracts nor the implementation have any interactions between files. An API that reflects this seems more natural. e.g. it naturally provides operations like "are we tracking this file", and makes it natural to be able to e.g. lock at the per-file level. e.g.

  class WorkingSet {
     shared_ptr<TranslationUnit> get(Path);
     shared_ptr<TranslationUnit> getOrCreate(Path)
   }
   class TranslationUnit {
     void update(ParseInputs);
     void read(Callback);
   }

This seems like it would make it easier to explain what the semantics of scheduleRemove are, too.

2) The callbacks from individual methods seem more powerful than needed, and encourage deeper coupling. Basically, the inputs (changes) and outputs (diagnostics, reads) don't seem to want to interact with the same code. This suggests decoupling them: the changes are a sequence of input events, diagnostics are a sequence of output events, reads look much as they do now.

One benefit here is that the versioning relationship between inputs and outputs no longer show up in the function signatures (by coupling an input to a matching output). Expressing them as data makes it easier to tweak them.

3) It's not spelled out how this interacts with drafts: whether text<->version is maintained here or externally, and what the contracts around versions are. There are no options offered, so I would guess that scheduleUpdate delivers new-version-or-nothing, and scheduleASTRead delivers... current-or-newer? current-or-nothing?

I think it would *probably* be clearer to have versions minted by the external DraftStore, that way we can decouple "we know about the contents of this file" from "we're building this file". e.g. we probably want wildly different policies for discarding resources of old versions, when "resources" = source code vs "resources" = ASTs and preambles.

4) Scheduler (or anything it decomposes into) is important and isolated enough that it deserves its own header.


================
Comment at: clangd/ClangdServer.h:169
+  Scheduler(unsigned AsyncThreadsCount, bool StorePreamblesInMemory,
+            std::shared_ptr<PCHContainerOperations> PCHs,
+            ASTParsedCallback ASTCallback);
----------------
this is a performance optimization, right?
I think the scheduler does enough of the compiling that giving it one of its own isn't too wasteful.


================
Comment at: clangd/ClangdServer.h:173
+  /// Get the CompileCommand passed at the last scheduleUpdate call for \p File.
+  llvm::Optional<tooling::CompileCommand> getLatestCommand(PathRef File) const;
+
----------------
This seems like a smell - the command is sometimes an input from callsites that want builds, and sometimes an output to callsites that want builds. Also this class is all about threading, scheduling, and versions, and this method bypasses all of that.

(If this is a hack to keep code complete working while we refactor, that's fine - but the docs should say that and we should know what next steps look like!)

ISTM the issue here is coupling updates to the source with updates to the compile command.
Ultimately we should indeed be compiling a given version with a fixed CC, but that doesn't seem like the clearest interface for callers.

One option is:
  - Scheduler has a reference to the CDB (or a std::function wrapping it), and needs to come up with commands itself
  - it caches commands whenever it can, and has an "invalidateCommand(Path)" to drop its cache. `with_file_lock { invalidateCommand(), scheduleUpdate() }` is the equivalent to `forceRebuild`.
 - it provides a `scheduleSourceRead` which is like scheduleASTRead but provides source, compile command, and the latest preamble available without blocking. This would be used for operations like codecomplete that can't use scheduleASTRead.


================
Comment at: clangd/ClangdServer.h:184
+  /// resources. \p Action will be called when resources are freed.
+  /// If an error occurs during processing, it is forwarded to the \p Action
+  /// callback.
----------------
The callback here is a bit confusing and YAGNI (both clangd and our known embedders discard the returned future).
It seems enough to return synchronously and know that subsequent reads that ask for = or >= version are going to get nothing back.


================
Comment at: clangd/ClangdServer.h:200
+  /// callback.
+  void schedulePreambleRead(
+      PathRef File,
----------------
can we pass in a consistency policy here? even if we only support a subset of {ast, preamble} x policy for now, a richer API would enable experiments/tradeoffs later. e.g.

  enum class VersionConstraint {
    ANY,     // returns the first data that's available, typically immediately
    EQUAL, // returns the data for the previous update
    GREATER_EQUAL, // Returns the data for the previous update, or some subsequent update.
  }

(I guess the semantics you've described for AST read are EQUAL, and ANY for preamble read)


================
Comment at: clangd/ClangdServer.h:206
+  CppFileCollection Units;
+  SimpleThreadPool Executor;
+};
----------------
these two names seem a bit confusing - might be easier as just CppFiles, Threads?

The names seem to be ~synonyms for the types, which I don't think is better than echoing the types. (Often there's a completely separate good name, but I don't think so here)


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D42174





More information about the cfe-commits mailing list