[PATCH] D55224: [clangd] Introduce loading of shards within auto-index

Ilya Biryukov via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jan 9 02:38:30 PST 2019


ilya-biryukov added inline comments.


================
Comment at: clangd/index/Background.cpp:259
+      // if this thread sees the older version but finishes later. This should
+      // be rare in practice.
+      DigestIt.first->second = Hash;
----------------
kadircet wrote:
> ilya-biryukov wrote:
> > kadircet wrote:
> > > ilya-biryukov wrote:
> > > > > "should be rare in practice"
> > > > And yet, can we avoid this altogether?
> > > > 
> > > > Also, I believe it won't be rare. When processing multiple different TUs, we can easily run into the same header multiple times, possibly with the different contents.
> > > > E.g. imagine the index for the whole of clang is being built and the user is editing `Sema.h` at the same time. We'll definitely be running into this case over and over again.
> > > Well I am open to ideas, but currently there is no way of knowing whether this is the newer version for the file. Because only information we have is the digest is different than what we currently have and this doesn't yield any chronological information.
> > Do we know which request is "newer" when scheduling it?
> > If so, we could keep the map of the **latest** hashes of the files we've seen (in addition to the `IndexedFileDigests`, which correspond to the digest for which we built the index IIUC).
> > 
> > This would give us a simple invariant of the final state we want to bring the index to: `IndexedFileDigests` should correspond to the latest hashes seen so far. If not, we have to rebuild the index for the corresponding files. That, in turn, gives us a way to resolve conflicts: we should never replace with symbols built for the latest version (hash) of the file we've seen.
> > 
> > Would that work?
> I am not sure if it would work for non-main files. We update the Hash for the main files at each indexing action, so we can safely keep track of the latest versions. But we collect hashes for dependencies while performing the indexing which happens in parallel. Therefore an indexing action triggered earlier might get an up-to-date version of a dependency than an action triggered later-on.
If updates for each TU are atomic (i.e. all files included in the TU are updated in a single go) we would get the up-to-date index eventually, assuming we rebuild the index when TU dependencies change (we don't schedule rebuilds on changes to included header now, but we're planning to do so at some point).


================
Comment at: clangd/index/Background.cpp:468
+    if (!Shard || !Shard->Sources) {
+      vlog("Failed to load shard: {0}", CurDependencyPath);
+      continue;
----------------
kadircet wrote:
> ilya-biryukov wrote:
> > Should we mark the file as requiring re-indexing at this point?
> > Not sure how that might happen in practice, but still...
> All files are marked as requiring re-indexing by default `Dependencies.push_back({AbsolutePath, true})`. The second element is always true, and we only mark it as false if we are sure file is up-to-date.
My confusion is coming from the fact that I'm constantly forgetting that we return the queue we're processing afterwards.
Could you put a small comment that file will be returned to the caller as requiring a reindex here?


Repository:
  rCTE Clang Tools Extra

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55224/new/

https://reviews.llvm.org/D55224





More information about the cfe-commits mailing list