[PATCH] D92663: [clangd] Add hot-reload of compile_commands.json and compile_flags.txt

Wed Dec 16 06:19:44 PST 2020

sammccall added inline comments.

================
Comment at: clang-tools-extra/clangd/GlobalCompilationDatabase.cpp:206
+  auto Stat = FS.status(Path);
+  if (!Stat || !Stat->isRegularFile()) {
+    Size = NoFileCached;
----------------
adamcz wrote:
> Here's a fun thought: what if this is a network or otherwise magical filesystem (like a FUSE) and may be temporarily unavailable (due to expiring credentials, network issues, maybe your machine just woke up from suspend and needs to reconnect). 
> 
> Do you think it makes sense to check for things like ENOACCES, ENOKEY, ESTALE and treat them as transient?
Maybe in some cases, it depends a bit on the details...

The decision we're making here is really "retry on next access" vs "retry in 30 seconds".
The biggest point of the delay is that when loading background index shards (each corresponding to a source file), we check for each of them whether the stored compile commands are still valid (if not, we still load the shard, but schedule a rebuild). If we stat each possible cdb+config+etc each time, it significantly increases time before the index becomes available.

With this in mind, I don't think forcing ENOACCESS and ENOKEY to try every time is a good idea - the FS might have come back but probably not (user action is probably required). ESTALE sounds more plausible, but I don't know the details there. I'd probably avoid the complexity of this until we have concrete use cases where this is causing problems (e.g. ESTALE isn't portably available through std::error_code AFAIK).

In practice if the CDB is on an unavailable network FS then the sources/shards probably are too so we never really get here, though there can be exceptions

================
Comment at: clang-tools-extra/clangd/GlobalCompilationDatabase.cpp:217
+  if (!Buf || (*Buf)->getBufferSize() != Stat->getSize()) {
+    Size = NoFileCached;
+    elog("Failed to read {0}: {1}", Path,
----------------
adamcz wrote:
> Why throw away the information we have? Better keep the cached metadata around, transient error here could mean the build system is re-creating the file, but it may end up being identical to what we already read, it's just that someone re-run cmake or something.
Good point, done.

Of course this relies on us getting "lucky" and seeing two different sizes for the file between the calls. If it's being written then we could e.g. see it with size 0 on both stat and read, and then we'll throw away the content.

The way around this would be to e.g. call an unparseable file a transient error (reuse cached value), which requires some structural changes here I think. What do you think about that? (I think this would help with compile_commands.json but not compile_flags.txt)
Another way would be to handle a size of 0 as a special case (presumed invalid), which may catch the practical cases.

================
Comment at: clang-tools-extra/clangd/unittests/GlobalCompilationDatabaseTests.cpp:424
+    "directory": "{0}/foo",
+  }])json",
+                                                        testRoot());
----------------
adamcz wrote:
> Can you add a couple of calls to test the modtime and size comparison? Like, maybe change DBAZ to DFOO and see that it's picked up despite size being the same if mtime changed?
Done - I'd forgotten we had mtime support in TestFS. Also added some more comments to the tests as the list gets longer.

(While writing the original test, I had overwritten -DFOO with -DBAR and spent a while debugging why it didn't reload - oops!)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92663/new/

https://reviews.llvm.org/D92663