[PATCH] D80900: [clangd] Use different FS in PreambleThread

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jun 1 07:29:22 PDT 2020


sammccall added a comment.

TL;DR: I think there are three viable paths that we should consider:
A) minimal change: put the FSProvider in ParseInputs
B) this patch, but with more documentation and safety
C) fight to clean up VFS multithreading semantics

Assuming we don't want to block on C, I'm not sure whether A or B is conceptually better. But A is a smaller change and brings side benefits, so wondering if you see advantages in B.

---

You're right about the race condition of course and I'm sure this patch is correct.
But this rubs me the wrong way a bit - the API change doesn't seem clearly motivated either from a high level (we conceptually *want* a single FS to build a single version of code) or from a low level (what are we actually doing in parallel that isn't threadsafe?).

Looking at the low level perspective first. VFS has three classes threadsafety issues:

- it contains the "working directory" as mutable state, which isn't multi-threading friendly. We do set working directory concurrently with other work, but always to the same thing. We could factor our way around this by setting the working directory prior to any concurrency (to CDB.Directory) and ensuring we never set it again.
- the RealFilesystem in real-working-directory mode is a singleton so its use is thread-hostile. This isn't really relevant to us because we use emulated-working-directory mode and have distinct instances. But it probably informed VFS's lack of useful threading semantics.
- read operations aren't `const` or documented as threadsafe even though the canonical `RealFilesystem` is. In-tree implementations seem to be threadsafe, some out-of-tree ones are not :-( This definitely bites us here.

I think it would be fundamentally possible to use a single VFS here. We need to modify the VFS interface to make concurrent reads safe (and const, while here) and fix any implementations. This isn't trivial. I think it's worthwhile, but experience tells me different projects want different things from the VFS interface...

The high-level view: IMO FSProvider is basically a hack around VFS's thread-safety issues, rather than a fundamental part of the application - it's a filesystem that doesn't have an attached working directory and whose read operations (e.g. `getFileSystem()->status()`) are threadsafe. We could rename FSProvider -> ThreadsafeFS and getFileSystem() -> newCursor(WD) or something, and maybe we should. 
>From this perspective, this patch is changing two things that have little to do with each other: it's replacing FS -> ThreadsafeFS and it's moving the FS ownership from one revision of the file to the TUScheduler itself.

I'm not really sure whether the latter is a good or a bad idea. If we want to do it, the use of ParseInputs as the interface to TUScheduler is a bit confusing - we should at least document that FS isn't used (as we do for CompileCommand) and probably assert that it's null, too.

But maybe simpler would be to "just" change the VFS member in ParseInputs to a `const FSProvider*`. This keeps the design approximately what it is now, but simplifies the ParseInputs concept: it's now truly readonly data, and functions that consume it won't have effects that escape.



================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:629
     std::vector<std::string> CC1Args;
     std::unique_ptr<CompilerInvocation> Invocation = buildCompilerInvocation(
         Inputs, CompilerInvocationDiagConsumer, &CC1Args);
----------------
Yikes, ISTM we're getting the FS from Inputs here, and from this->FSProvider below. We should use one or the other (i.e. FSProvider only, following this patch) unless we're trying to do something really subtle...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80900/new/

https://reviews.llvm.org/D80900





More information about the cfe-commits mailing list