[PATCH] D54269: Introduce shard storage to auto-index.

Kadir Cetinkaya via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Nov 12 02:33:17 PST 2018


kadircet added inline comments.


================
Comment at: clangd/index/Background.cpp:252
+
+    auto Hash = FilesToUpdate.lookup(Path);
+    // Put shards into storage for subsequent use.
----------------
sammccall wrote:
> nit: i'd suggest doing the writes *after* updating the index, as the latter is user-facing
but updating index consumes slabs. Is it worth making a copy?


================
Comment at: clangd/index/Background.cpp:326
       return Error::success();
+    } else if (IndexShardStorage) { // Check if shard storage has the index.
+      auto Shard = IndexShardStorage->retrieveShard(AbsolutePath, Hash);
----------------
sammccall wrote:
> I don't think this is the right place to be reading shards, vs on startup. It means we're doing extra IO whenever a file is reindexed, and more importantly we don't make the index contents available on startup (if any file needs reindexing, reads may get stuck behind it).
> 
> (as discussed, we can also leave reading out of this patch entirely)
What exactly you mean by startup exactly from BackgroundIndex's perspective. Is it enqueueAll ?


================
Comment at: clangd/index/Background.h:38
+  virtual llvm::Expected<IndexFileIn>
+  retrieveShard(llvm::StringRef ShardIdentifier, FileDigest Hash) const = 0;
+  virtual bool initialize(llvm::StringRef Directory) = 0;
----------------
sammccall wrote:
> this signature looks surprising to me, expected something like `Expected<pair<IndexFileIn, FileDigest>> retrieveShard(ShardID).
> 
> e.g. if we have a stale shard for a file when initializing, maybe we want to use it until we can reindex the file? Current interface doesn't allow this.
> 
> As discussed offline, it'd also be fine to leave retrieve out of the initial patch.
I thought it would be better to get rid of disk io if the file was out-of-date, but using the stale file and deferring indexing also seems like a good idea. Moving with it.


================
Comment at: clangd/index/Background.h:39
+  retrieveShard(llvm::StringRef ShardIdentifier, FileDigest Hash) const = 0;
+  virtual bool initialize(llvm::StringRef Directory) = 0;
+};
----------------
sammccall wrote:
> Why not use the constructor? what does "directory" mean in the general case?
Directory refers to the one specified in CompilationDatabase(which is usually the build directory?), sorry for the inconvenience.
I wasn't sure about where we plan to instantiate BackgroundIndex. If you plan to do that initialization at a point in which we already know the build directory we can move that to constructor, especially only to the constructor of DiskBackedIndexStorage.


================
Comment at: clangd/index/Background.h:51
                   const FileSystemProvider &, ArrayRef<std::string> URISchemes,
+                  std::unique_ptr<ShardStorage> IndexShardStorage = nullptr,
                   size_t ThreadPoolSize = llvm::hardware_concurrency());
----------------
sammccall wrote:
> So we have several possible designs on a collision course:
> 1. One instance of `BackgroundIndex` for clangd, which has a ShardStorage, which is used for every CDB
> 2. One instance of `BackgroundIndex`, which has many ShardStorages, one-per-CDB
> 3. Many instances of `BackgroundIndex`, one per CDB, each with a ShardStorage
> 4. One `BackgroundIndex` for everything, and one ShardStorage for everything (storage in $HOME/.clangd-index)
> 
> What are we aiming for here?
> 
> The design of ShardStorage in this patch precludes 1, taking unique_ptr here precludes 2. 3 requires us to revisit some aspects of BackgroundIndex (which is possible). With 4 it isn't obvious how we do multi-config, or how users can manage the storage.
> 
> (I'd been assuming 1 or 2, but I don't think we ever discussed it)
The aim was more like 2. But I forgot the fact that we could be working with multiple CDBs :D

So it might be better to just pass a factory to BackgroundIndex and let it create one ShardStorage for each CDB(which is distinguished by the build directory?) WDYT?


================
Comment at: clangd/index/Background.h:107
+// thread-safe.
+class DiskShardStorage : public ShardStorage {
+  mutable std::mutex DiskShardRootMu;
----------------
sammccall wrote:
> This class can be hidden in the cpp file.
But it needs to be passed into the constructor of BackgroundIndex, possibly by the ClangdServer, do you suggest hiding inside ClangdServer?


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D54269





More information about the cfe-commits mailing list