[PATCH] D54269: Introduce shard storage to auto-index.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Nov 12 01:21:53 PST 2018


sammccall added inline comments.


================
Comment at: clangd/index/Background.cpp:252
+
+    auto Hash = FilesToUpdate.lookup(Path);
+    // Put shards into storage for subsequent use.
----------------
nit: i'd suggest doing the writes *after* updating the index, as the latter is user-facing


================
Comment at: clangd/index/Background.cpp:326
       return Error::success();
+    } else if (IndexShardStorage) { // Check if shard storage has the index.
+      auto Shard = IndexShardStorage->retrieveShard(AbsolutePath, Hash);
----------------
I don't think this is the right place to be reading shards, vs on startup. It means we're doing extra IO whenever a file is reindexed, and more importantly we don't make the index contents available on startup (if any file needs reindexing, reads may get stuck behind it).

(as discussed, we can also leave reading out of this patch entirely)


================
Comment at: clangd/index/Background.h:31
 
+// Base class for Shard Storage operations. See DiskShardStorage for more info.
+class ShardStorage {
----------------
nit: prefer the other way around: doc on the interface, the disk implementation can defer to the interface when there's nothing special to say.
That way callers don't have to be confused about what the semantics are in the general case.


================
Comment at: clangd/index/Background.h:35
+  using FileDigest = decltype(llvm::SHA1::hash({}));
+  virtual bool storeShard(llvm::StringRef ShardIdentifier,
+                          IndexFileOut Shard) const = 0;
----------------
nit: we probably don't need "shard" both in the interface name and the method names.

Consider either BackgroundIndexStorage/storeShard or ShardStorage/store


================
Comment at: clangd/index/Background.h:38
+  virtual llvm::Expected<IndexFileIn>
+  retrieveShard(llvm::StringRef ShardIdentifier, FileDigest Hash) const = 0;
+  virtual bool initialize(llvm::StringRef Directory) = 0;
----------------
this signature looks surprising to me, expected something like `Expected<pair<IndexFileIn, FileDigest>> retrieveShard(ShardID).

e.g. if we have a stale shard for a file when initializing, maybe we want to use it until we can reindex the file? Current interface doesn't allow this.

As discussed offline, it'd also be fine to leave retrieve out of the initial patch.


================
Comment at: clangd/index/Background.h:39
+  retrieveShard(llvm::StringRef ShardIdentifier, FileDigest Hash) const = 0;
+  virtual bool initialize(llvm::StringRef Directory) = 0;
+};
----------------
Why not use the constructor? what does "directory" mean in the general case?


================
Comment at: clangd/index/Background.h:51
                   const FileSystemProvider &, ArrayRef<std::string> URISchemes,
+                  std::unique_ptr<ShardStorage> IndexShardStorage = nullptr,
                   size_t ThreadPoolSize = llvm::hardware_concurrency());
----------------
So we have several possible designs on a collision course:
1. One instance of `BackgroundIndex` for clangd, which has a ShardStorage, which is used for every CDB
2. One instance of `BackgroundIndex`, which has many ShardStorages, one-per-CDB
3. Many instances of `BackgroundIndex`, one per CDB, each with a ShardStorage
4. One `BackgroundIndex` for everything, and one ShardStorage for everything (storage in $HOME/.clangd-index)

What are we aiming for here?

The design of ShardStorage in this patch precludes 1, taking unique_ptr here precludes 2. 3 requires us to revisit some aspects of BackgroundIndex (which is possible). With 4 it isn't obvious how we do multi-config, or how users can manage the storage.

(I'd been assuming 1 or 2, but I don't think we ever discussed it)


================
Comment at: clangd/index/Background.h:107
+// thread-safe.
+class DiskShardStorage : public ShardStorage {
+  mutable std::mutex DiskShardRootMu;
----------------
This class can be hidden in the cpp file.


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D54269





More information about the cfe-commits mailing list