[PATCH] D55224: [clangd] Introduce loading of shards within auto-index

Kadir Cetinkaya via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jan 8 11:38:35 PST 2019


kadircet added inline comments.


================
Comment at: clangd/index/Background.cpp:259
+      // if this thread sees the older version but finishes later. This should
+      // be rare in practice.
+      DigestIt.first->second = Hash;
----------------
ilya-biryukov wrote:
> kadircet wrote:
> > ilya-biryukov wrote:
> > > > "should be rare in practice"
> > > And yet, can we avoid this altogether?
> > > 
> > > Also, I believe it won't be rare. When processing multiple different TUs, we can easily run into the same header multiple times, possibly with the different contents.
> > > E.g. imagine the index for the whole of clang is being built and the user is editing `Sema.h` at the same time. We'll definitely be running into this case over and over again.
> > Well I am open to ideas, but currently there is no way of knowing whether this is the newer version for the file. Because only information we have is the digest is different than what we currently have and this doesn't yield any chronological information.
> Do we know which request is "newer" when scheduling it?
> If so, we could keep the map of the **latest** hashes of the files we've seen (in addition to the `IndexedFileDigests`, which correspond to the digest for which we built the index IIUC).
> 
> This would give us a simple invariant of the final state we want to bring the index to: `IndexedFileDigests` should correspond to the latest hashes seen so far. If not, we have to rebuild the index for the corresponding files. That, in turn, gives us a way to resolve conflicts: we should never replace with symbols built for the latest version (hash) of the file we've seen.
> 
> Would that work?
I am not sure if it would work for non-main files. We update the Hash for the main files at each indexing action, so we can safely keep track of the latest versions. But we collect hashes for dependencies while performing the indexing which happens in parallel. Therefore an indexing action triggered earlier might get an up-to-date version of a dependency than an action triggered later-on.


================
Comment at: clangd/index/Background.cpp:468
+    if (!Shard || !Shard->Sources) {
+      vlog("Failed to load shard: {0}", CurDependencyPath);
+      continue;
----------------
ilya-biryukov wrote:
> Should we mark the file as requiring re-indexing at this point?
> Not sure how that might happen in practice, but still...
All files are marked as requiring re-indexing by default `Dependencies.push_back({AbsolutePath, true})`. The second element is always true, and we only mark it as false if we are sure file is up-to-date.


================
Comment at: clangd/index/Background.cpp:475
+        if (auto AbsolutePath = URI::resolve(*U, CurDependencyPath)) {
+          if (*AbsolutePath != CurDependencyPath) {
+            if (InQueue.try_emplace(*AbsolutePath).second)
----------------
ilya-biryukov wrote:
> Why do we need an extra check for self-edges here? Shouldn't `InQueue.try_emplace` handle this too?
It is not looking for a self-edge. It is trying to find source information related to current file(which contains symbols, refs and hash). The nesting seems to be confusing it was just to prevent one redundant try_emplace seems like making the code less readable taking it out.


================
Comment at: clangd/index/Background.h:108
   BackgroundIndexStorage::Factory IndexStorageFactory;
+  // Loads the shards for all the sources and headers of the Cmd. Returns the
+  // list of dependencies for this Cmd and whether they need to be re-indexed.
----------------
ilya-biryukov wrote:
> Could you elaborate what are the "sources and headers of the Cmd"? The compile command is typically created for a **single** source file, so this comment looks a bit confusing.
It seems like comment become out-dated marking as done.


================
Comment at: clangd/index/Background.h:121
+      std::vector<std::pair<tooling::CompileCommand, BackgroundIndexStorage *>>
+          &NeedsReIndexing);
+  void enqueue(tooling::CompileCommand Cmd, BackgroundIndexStorage *Storage);
----------------
ilya-biryukov wrote:
> Maybe return `Expected<vector<..>>` instead of using an out parameter?
Actually this function can no longer fail, therefore directly returning the vector


Repository:
  rCTE Clang Tools Extra

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55224/new/

https://reviews.llvm.org/D55224





More information about the cfe-commits mailing list