[PATCH] D82964: [clangd] Config: loading and caching config from disk.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jul 2 09:42:31 PDT 2020


sammccall marked 15 inline comments as done.
sammccall added inline comments.


================
Comment at: clang-tools-extra/clangd/ConfigProvider.cpp:39
+            std::vector<CompiledFragment> &Out) {
+    assert(llvm::sys::path::is_absolute(Path));
+    auto FS = TFS.view(/*CWD=*/llvm::None);
----------------
kadircet wrote:
> I believe this function is going to be hot, I would suggest putting a span here to track the latencies
Tracing isn't trivially cheap: that usually involves taking a mutex, writing to a buffer, and I think flushing it to disk.

So we shouldn't trace things that are both hot and small, which I think this is.

I added a tracer to Provider::getConfig - OK to revisit if we need something more fine-grained?


================
Comment at: clang-tools-extra/clangd/ConfigProvider.cpp:41
+    auto FS = TFS.view(/*CWD=*/llvm::None);
+    auto Stat = FS->status(Path);
+    if (!Stat || !Stat->isRegularFile()) {
----------------
kadircet wrote:
> i suppose doing this IO on most of our features are OK, as they tend to search for .clang-format at least (even the ancestor traversal), but I wonder whether these will also be triggered for code completion and signature help, as they currently don't do any IO (apart from preamble deserialization, and maybe one patched include file).
> 
> Not a concern for this patch, but something to keep in mind when integrating with the rest.
Yeah, this is a good point.
One possible design would be to have a "latency-sensitivity" setting, probably on `Params`, and not bothering to stat the file if we're highly sensitive. (i.e. always reuse the cached value unless the cache is uninitialized).

I've added this idea to the class comment.

Related: I think an abstraction like this, with some thought put into invalidation policy, should be eventually used for `.clang-format`, `.clang-tidy`, `compile_commands.json`, as we should be able to get fresh results with a fairly tiny amount of IO.


================
Comment at: clang-tools-extra/clangd/ConfigProvider.cpp:57
+        // For simplicity, don't cache the value in this case (use a bad key).
+        if (Buf->get()->getBufferSize() != Size) {
+          Size = -1;
----------------
kadircet wrote:
> is this really necessary ? we are setting the cache key to old stat values anyway, so the next read shouldn't match neither the Size nor MTime.
I considered doing nothing here, or setting Size = Buf->size.

But there are *possible* scenarios where you go back to the old state (think switching git branches) and then end up with a poisoned cache.

And this is cheap (simple, and also doesn't actually cause you to miss the cache when you want to hit it)


================
Comment at: clang-tools-extra/clangd/ConfigProvider.cpp:67
+    }
+    llvm::copy(CachedValue, std::back_inserter(Out));
+  }
----------------
kadircet wrote:
> nit: I've found the nestedness a bit overwhelming, wdyt about putting this into a scope_exit function and using early returns above ?
scope_exit is a neat idea, but when I tried it I found the code hard to reason about.

I extracted a function instead.

Unnesting this caused me to catch a case i'd missed: if stat succeeds but read fails (once), I'd prefer to retry the read before caching the failure forever... (again, the branch switching workflows)


================
Comment at: clang-tools-extra/clangd/ConfigProvider.cpp:120
+           I != E; ++I) {
+        // Avoid weird non-substring cases like phantom "." components.
+        if (I->end() < Parent.begin() && I->end() > Parent.end())
----------------
kadircet wrote:
> ummm, what ? is this a bug in path iterator that should be fixed rather than worked around?
No, it's documented that Component isn't necessarily part of the path.
The only case I could find of this actually happening is that "/foo/bar/" iterates as "foo", "bar", ".", at least in some cases.

Extended the comment a bit.


================
Comment at: clang-tools-extra/clangd/ConfigProvider.cpp:181
+  for (const auto &Fragment : getFragments(P, DC))
+    Fragment(P, C);
+  return C;
----------------
kadircet wrote:
> nit: I think this looks a bit ... surprising
the fact that this is Fragment instead of Fragment(P, C)?

Yeah, this is an argument for adding an interface rather than using std::function.
But I don't think it's a big deal or really actionable in this patch.


================
Comment at: clang-tools-extra/clangd/ConfigProvider.h:85
+  /// When parsing/compiling, the DiagnosticCallback is used to report errors.
+  /// Usual thread-safety guarantees apply: this function must be threadsafe.
+  virtual std::vector<CompiledFragment>
----------------
kadircet wrote:
> first part of this sentence doesn't say much, maybe drop it?
I think just saying "this function must be threadsafe" is a bit confusing because it's a const method, it's like saying "this function does not throw exceptions" in llvm code.

But I shoud have made my point more explicit. Replaced with "Despite the need for caching..."


================
Comment at: clang-tools-extra/clangd/ConfigProvider.h:91
+/// A provider that reads config from a YAML file in ~/.config/clangd.
+std::unique_ptr<Provider> createFileConfigProvider(const ThreadsafeFS &);
+
----------------
kadircet wrote:
> doesn't seem to be implemented anywhere? I suppose it is just a convenience wrapper around `Provider::fromYAMLFile`, is it worth it?
Whoops, this was copy/pasted from a prototype.
I decided `Provider::fromYAMLFile` was a better name.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82964/new/

https://reviews.llvm.org/D82964





More information about the cfe-commits mailing list