[PATCH] D54269: Introduce shard storage to auto-index.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Nov 14 03:00:31 PST 2018


sammccall added a comment.

Thanks, mostly interface tweaks now.



================
Comment at: clangd/index/Background.cpp:187
+  BackgroundIndexStorage *IndexStorage = getIndexStorage(Directory);
+  if (!IndexStorage)
+    elog("No index storage for: {0}", Directory);
----------------
I think it would be simpler to just disallow the storage factory from returning null, and instead have it return a no-op implementation that logs.

Again, this moves the complexity behind an abstraction, where the rest of the code is simpler.


================
Comment at: clangd/index/Background.h:36
+  // retrieved later on with the same identifier.
+  virtual llvm::Error storeShard(llvm::StringRef ShardIdentifier,
+                                 IndexFileOut Shard) const = 0;
----------------
I like that the implementations treat the shard scheme as opaque, but it'd be good for humans to know that this is going to be a filename in practice.

I'd suggest either changing the param name or adding to the interface comment: "Shards of the index are stored and retrieved independently, keyed by shard identifier - in practice this is a source file name"


================
Comment at: clangd/index/Background.h:39
+
+  virtual llvm::Expected<IndexFileIn>
+  loadShard(llvm::StringRef ShardIdentifier) const = 0;
----------------
docs


================
Comment at: clangd/index/Background.h:39
+
+  virtual llvm::Expected<IndexFileIn>
+  loadShard(llvm::StringRef ShardIdentifier) const = 0;
----------------
sammccall wrote:
> docs
Hmm, we're going to attempt to load the shard corresponding to every file that we have in the CDB, even if the index isn't built yet. So "file doesn't exist" is an expected case (where we don't want to log etc), vs e.g. "file was corrupted" is unexpected and should definitely be logged. How do we distinguish these with this API?

One defensible option would be to have implementations handle errors: return Optional<IndexFileIn>, and have the disk version log errors appropriately and return None.

Another would be Expected<Optional<IndexFileIn>> or so, which is a little messy.


================
Comment at: clangd/index/Background.h:43
+  static std::unique_ptr<BackgroundIndexStorage>
+  createDiskStorage(llvm::StringRef CDBDirectory);
+};
----------------
docs


================
Comment at: clangd/index/Background.h:48
 // all commands in a compilation database. Indexing happens in the background.
+// Takes a factory function to create IndexStorage units for each compilation
+// database. Those databases are identified by directory they are found.
----------------
nit: I'd move the description of IndexStorageFactory semantics to above the Using declaration - generally splitting docs to the most specific decl should make them easier to match up to the code.


================
Comment at: clangd/index/Background.h:49
+// Takes a factory function to create IndexStorage units for each compilation
+// database. Those databases are identified by directory they are found.
 // FIXME: it should also persist its state on disk for fast start.
----------------
by the directory they are found in?


================
Comment at: clangd/index/Background.h:103
+  // Maps CDB Directory to index storage.
+  llvm::StringMap<std::unique_ptr<BackgroundIndexStorage>> IndexStorageMap;
+  IndexStorageFactory IndexStorageCreator;
----------------
I do think having the factory own and cache the storages would be simpler:
- it moves more work out of this complicated class with a lot of responsibilities, and puts it somewhere simpler
- it would mean we can just use a single storage object in tests that don't use multiple CDBs.


================
Comment at: clangd/index/Background.h:103
+  // Maps CDB Directory to index storage.
+  llvm::StringMap<std::unique_ptr<BackgroundIndexStorage>> IndexStorageMap;
+  IndexStorageFactory IndexStorageCreator;
----------------
sammccall wrote:
> I do think having the factory own and cache the storages would be simpler:
> - it moves more work out of this complicated class with a lot of responsibilities, and puts it somewhere simpler
> - it would mean we can just use a single storage object in tests that don't use multiple CDBs.
don't accesses to this map need to be locked?


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D54269





More information about the cfe-commits mailing list