[PATCH] D92663: [clangd] Add hot-reload of compile_commands.json and compile_flags.txt

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Dec 9 09:47:04 PST 2020


sammccall marked 7 inline comments as done.
sammccall added a comment.

Thanks for the review!

> my biggest question is, have we considered updating CompilationDatabasePlugin interface to provide a loadFromBuffer and relativeFileName virtual methods. I think loading logic could've been a lot simpler and more uniform that way.

I thought about it, not sure if we discussed it. TL;DR: I feel like we can do a good job of the concrete cases, but don't feel ready to generalize.

It's a larger, complicated design problem and the costs of getting it wrong (either making it too limiting *or* too flexible) are much higher than the benefits we'd get here. Apart from JSON/Fixed CDBs, we know there are downstream CDBs and this is a key extension point. What assumptions should we bake in?

- CDB state is derived from a single file with known path?
- CDBs should be thrown away and reloaded, rather than updated, when data changes?
- CDBs are not invalidated by any other events?

I like the idea of extending the interface at some point. But I'm not sure the model from this patch is the right one to codify.

Meanwhile, most clang tools are batch, so the audience for this is... clangd and ccls? I don't think it's a pressing need.



================
Comment at: clang-tools-extra/clangd/GlobalCompilationDatabase.cpp:205
+  auto Stat = FS.status(Path);
+  if (!Stat || !Stat->isRegularFile()) {
+    Size = NoFileCached;
----------------
kadircet wrote:
> what about symlinks ? i think users usually have `cc.json -> obscure_build_dir/cc.json` as a symlink.
`RealFileSystem::status()` is `stat()` rather than `lstat()` (it uses the default `follow=true` for `sys::fs::status`) so it will follow symlinks and report "regular file".


================
Comment at: clang-tools-extra/clangd/GlobalCompilationDatabase.cpp:322
+    auto Plugin = Entry.instantiate();
+    if (auto CDB = Plugin->loadFromDirectory(Path, Error)) {
+      log("Loaded compilation database from {0} with plugin {1}", Path,
----------------
kadircet wrote:
> it is not crucial but we used to try to load these from `Path + "/build"` as well
Yeah, this is on purpose, I think it's better only to do this for compile_commands.json. Now that we're periodically stat'ing we should think a little about how many files are necessary.

The build/ heuristic is basically targetting the "standard cmake setup" which is always compile_commands.json.

I think it's unlikely this is a significant regression for anyone. In a standard build it only affects build/compile_flags.txt which AFAIK is always hand-authored, and I'd expect authors to put it in the project root (we never documented the build/ behavior, and other clang tools don't have it).

FWIW, the old implementation looked in `build/` for all plugins only because I wanted to stick to the high-level API rather than piercing the abstraction.

---

As an aside, I am slightly worried this hot-reload feature will make us wary of supporting more build systems that require more IO, because both more periodic stats and having some CDBs only scanned once would be unpalatable options.

Still, given how dominant compile_commands.json is today, we can't let the extensibility argument get in the way of an important feature like this.


================
Comment at: clang-tools-extra/clangd/GlobalCompilationDatabase.cpp:456
     CDBLookupResult Result) const {
+  vlog("Broadcasting compilation database from {0}", Result.PI.SourceRoot);
   assert(Result.CDB && "Trying to broadcast an invalid CDB!");
----------------
kadircet wrote:
> why not `log` instead of `vlog`? this shouldn't be too noisy anyway.
Basically because "broacdasting" per se isn't meaningful to users.
This log line is going to be sandwiched between "Loaded compilation database from {0}" and "Enqueueing {0} commands for indexing", which tell them what they need to know.

I'm not sure it adds much, and maybe we should just remove it entirely? (But as a developer I do find it a nice reminder of what the flow is)


================
Comment at: clang-tools-extra/clangd/unittests/GlobalCompilationDatabaseTests.cpp:393
+
+auto HasFlag(llvm::StringRef Flag) { return HasFlag(Flag, "dummy.cc"); }
+
----------------
kadircet wrote:
> nit: maybe a `MATCHER_P` instead ?
Well, I find functions easier to reason about than the MATCHER_P factories...

Here it's pretty ugly: if we want to delegate to the two-arg version from the body, we have to pass through the result_listener, so something like
```
MATCHER_P(HasFlag, Flag, "") {
  return HasFlag(Flag, "dummy.cc").MatchAndExplain(arg, result_listener);
}
```

however the MATCHER_P macros produce *polymorphic* matchers (which is why we don't need to provide the arg type), so we need something more like

```
return Matcher<decltype(arg)>(HasFlag(Flag, "dummy.cc")).MatchAndExplain(arg, result_listener);
```

(Not sure if that's quite right)
I don't think that does a great job of communicating "this is the same HasFlag matcher, but the Path arg is dummy.cc".



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92663/new/

https://reviews.llvm.org/D92663



More information about the cfe-commits mailing list