[PATCH] D51279: [clangd] Implement findOccurrences interface in dynamic index.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Aug 28 02:36:31 PDT 2018


sammccall added a comment.

In https://reviews.llvm.org/D51279#1214268, @ilya-biryukov wrote:

> Just noticed I'm not on the reviewers list, sorry for chiming in without invitation. Was really excited about the change :-)


fixed :-)



================
Comment at: clangd/index/FileIndex.cpp:58
 
-  return Collector.takeSymbols();
+  vlog("index for AST: ");
+  auto Syms = Collector.takeSymbols();
----------------
combine into one log statement


================
Comment at: clangd/index/FileIndex.h:50
 
+  std::vector<std::shared_ptr<SymbolOccurrenceSlab>> allOccurrenceSlabs() const;
+
----------------
This return type seems inconsistent with the design of this class.
The purpose of the `shared_ptr<vector<symbol_slab>>` in `allSymbols` is to avoid exposing the concrete storage.
The analogue would I guess be `shared_ptr<DenseMap<SymbolID, vector<SymbolOccurrence*>>>`.

The cost of creating that does seem a little gross though.


================
Comment at: clangd/index/FileIndex.h:50
 
+  std::vector<std::shared_ptr<SymbolOccurrenceSlab>> allOccurrenceSlabs() const;
+
----------------
sammccall wrote:
> This return type seems inconsistent with the design of this class.
> The purpose of the `shared_ptr<vector<symbol_slab>>` in `allSymbols` is to avoid exposing the concrete storage.
> The analogue would I guess be `shared_ptr<DenseMap<SymbolID, vector<SymbolOccurrence*>>>`.
> 
> The cost of creating that does seem a little gross though.
allOccurrences
(name should reflect the semantics, not the type)


================
Comment at: clangd/index/FileIndex.h:57
   llvm::StringMap<std::shared_ptr<SymbolSlab>> FileToSlabs;
+  /// \breif Stores the latest occurrence slabs for all active files.
+  llvm::StringMap<std::shared_ptr<SymbolOccurrenceSlab>> FileToOccurrenceSlabs;
----------------
nit: no need for brief unless the "first sentence" heuristic fails. (also, it's misspelled!)




================
Comment at: clangd/index/FileIndex.h:95
 
 /// Retrieves namespace and class level symbols in \p AST.
 /// Exposed to assist in unit tests.
----------------
update doc


================
Comment at: clangd/index/MemIndex.cpp:36
   auto Idx = llvm::make_unique<MemIndex>();
-  Idx->build(getSymbolsFromSlab(std::move(Slab)));
+  Idx->build(getSymbolsFromSlab(std::move(Slab)), {});
   return std::move(Idx);
----------------
we should find a way to avoid having functions implicitly zero out parts of the index like this!


================
Comment at: clangd/index/MemIndex.cpp:86
     const OccurrencesRequest &Req,
     llvm::function_ref<void(const SymbolOccurrence &)> Callback) const {
+  for (const auto &Slab : OccurrenceSlabs) {
----------------
This assumes there is no duplication across occurrences, but gives no mechanism through which deduplication could occur (because we're coupled to the underlying storage).

If FileIndex provides the DenseMap<SymbolID, vector<SymbolOccurrence*>> then it could just refrain from providing duplicate pointers.


================
Comment at: clangd/index/MemIndex.h:27
+  build(std::shared_ptr<std::vector<const Symbol *>> Symbols,
+        std::vector<std::shared_ptr<SymbolOccurrenceSlab>> OccurrenceSlabs = {});
 
----------------
ilya-biryukov wrote:
> Maybe remove the default parameter? Making the callers specify the occurrences explicitly could make their code more straight-forward to follow.
(type and name: same comment as on FileIndex)


================
Comment at: clangd/index/Merge.cpp:84
                            Callback) const override {
-    log("findOccurrences is not implemented.");
+    // Store all seen files by dynamic index.
+    llvm::DenseSet<llvm::StringRef> SeenFiles;
----------------
I'm tempted to say remove this heuristic for now (as you say, we should solve this thoroughly), but the artifacts will be severe, so it's probably correct to include it.

But this comment doesn't explain the heuristic, it just echoes the code.
I'd suggest something like:
```// We don't want duplicate occurrences from the static/dynamic indexes,
// and we can't reliably deduplicate because offsets may differ slightly.
// We consider the dynamic index authoritative and report all its occurrences,
// and only report static index occurrences from other files.
// FIXME: this heuristic fails if the dynamic index contains a file, but all
// occurrences were removed (we will report stale ones from static).
// Ultimately we should explicitly check which index has the file instead.```




================
Comment at: clangd/index/Merge.cpp:85
+    // Store all seen files by dynamic index.
+    llvm::DenseSet<llvm::StringRef> SeenFiles;
+    // Emit all dynamic occurrences, and static occurrences that are not
----------------
SeenFiles -> DynamicIndexFiles?


================
Comment at: clangd/index/Merge.cpp:88
+    // in seen files.
+    // FIXME: If a file has been updated, and there are no occurrences indexed
+    // in dynamic index, stale results are still returned (from static index).
----------------
hokein wrote:
> ilya-biryukov wrote:
> > Do we want to fix it before checking this in? As discussed offline, this seems to be a limitation that could affect the design (FileOccurrences vs SymbolOccurrence in the symbol interface). In case we want to avoid refactoring the interfaces later.
> Yes, I re-thought it afterwards. We could solve this issue for `findOccurrences` by redefining the interface.
> 
> Unfortunately, we have the same issue for `fuzzyFind`, `lookup` too -- we still return the stale symbols from static index if the symbol was deleted in dirty files. I think we should address this issue thoroughly and separately -- we probably want to add a method in dynamic index for query whether a file is in.
Yes. Please document this limitation on the MergedIndex class too since it applies to everything. It should still be documented here because it's a limitation of the particular heuristic we're using here.


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D51279





More information about the cfe-commits mailing list