[PATCH] D59481: [clangd] Count number of references while merging RefSlabs inside FileIndex

Eric Liu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Mar 20 06:21:28 PDT 2019


ioeric added inline comments.


================
Comment at: clang-tools-extra/clangd/index/FileIndex.cpp:33
+// If MergedSyms is provided, increments a symbols Reference count once for
+// every file it was mentioned in.
+void mergeRefs(llvm::ArrayRef<std::shared_ptr<RefSlab>> RefSlabs,
----------------
nit: s/every file/every slab/?

There is no "file" in this context.


================
Comment at: clang-tools-extra/clangd/index/FileIndex.cpp:46
+        auto It = MergedSyms->find(Sym.first);
+        assert(It != MergedSyms->end() && "Reference to unknown symbol!");
+        // Note that we increment References per each file mentioning the
----------------
I think we can trigger this assertion with an incomplete background index. For example, we've seen references of a symbol in some files but we've not parsed the file that contains the decls.


================
Comment at: clang-tools-extra/clangd/index/FileIndex.cpp:50
+        // FileIndex keeps only oneslab per file.
+        // This contradicts with behavior inside clangd-indexer, it only counts
+        // translation units since it processes each header multiple times.
----------------
kadircet wrote:
> ioeric wrote:
> > this is a bit of a code smell. FileIndex is not supposed to know anything about clangd-indexer etc.
> I've actually just left the comment for review so that we can decide what to do about discrepancy(with my reasoning). It is not like FileIndex is somehow tied to clangd-indexer now?
the comment is useful. I just think it's in the wrong place. If we define the reference counting behavior for `FileSymbols`, `FileSymbols`wouldn't need to know about clangd-indexer or background-index. This comment can then live in background index instead.


================
Comment at: clang-tools-extra/clangd/index/FileIndex.cpp:170
           I.first->second = mergeSymbol(I.first->second, Sym);
+        // We re-count number of references while merging refs from scratch.
+        I.first->getSecond().References = 0;
----------------
kadircet wrote:
> ioeric wrote:
> > kadircet wrote:
> > > ioeric wrote:
> > > > note that references might not always exist. SymbolCollector can be set up to not collect references.
> > > I thought we wouldn't make any promises about `References` in such a case, do we?
> > Is this documented somewhere?
> > 
> > `References` and `RefSlab` have been two independent things. `References` existed before `RefSlab`. Now might be a good time to unify them, but I think we should think about how this is reflected in their definitions (documentation etc) and other producers and consumers of the structures.
> Both of them are only produced by SymbolCollector, which has an option `CountReferences` with a comment saying `// Populate the Symbol.References field.`.
> Unfortunately that's not what it does, instead it always sets `Symol.References` field to `1` for every symbol collected during indexing of a single TU.
> Then it is up to the consumers to merge those `1`s. So even though we say:
> ```
> /// The number of translation units that reference this symbol from their main
> /// file. This number is only meaningful if aggregated in an index.```
> in the comments of `Symbol::References` it is totally up-to-the consumer who performs the merging.
> 
> The option controls whether Symbol Occurences should be collected or not.
> 
> I see two possible options:
>  - Un-tie SymbolCollector from Symbol::References and rather define it as:
> ```
> /// This field is for the convenience of an aggregated symbol index
> ```
>  - Rather change Index structure's(consumers of this information) to store(and build) this information internally if they plan to make use of it ?
> 
> WDYT?
> Unfortunately that's not what it does, instead it always sets Symol.References field to 1 for every symbol collected during indexing of a single TU.
Looking at the code, I think `Symol.References` is indeed only populated (i.e. set to 1) when `CountReferences` is enabled. Honestly, we should probably get rid of `CountReferences`; I don't see a reason not to count references. 

But the statement `count one per TU` still stands in SymbolCollector, as it always only handles one TU at a time. 

Actually, all I'm asking in the initial comment is to define the `References` counting behavior when there is no reference ;) I think a reasonable solution is to keep the `references` untouched when there is no reference, so that `References` aggregation via `mergeSymbol` can still work (for one-TU-per-Slab use cases). 

And we should probably also document the new behavior e.g. `DuplicateHandling` in FileIndex.h seems to be a good place.


================
Comment at: clang-tools-extra/clangd/index/FileIndex.cpp:180
     }
     // FIXME: aggregate symbol reference count based on references.
     break;
----------------
this FIXME can be removed?


================
Comment at: clang-tools-extra/clangd/index/FileIndex.cpp:189
           AllSymbols.push_back(&Sym);
+    mergeRefs(RefSlabs, RefsStorage, AllRefs);
     break;
----------------
any reason not to count references for `PickOne`?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59481/new/

https://reviews.llvm.org/D59481





More information about the cfe-commits mailing list