[PATCH] D59481: [clangd] Count number of references while merging RefSlabs inside FileIndex

Ilya Biryukov via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Mar 20 07:45:26 PDT 2019


ilya-biryukov added inline comments.


================
Comment at: clang-tools-extra/clangd/index/FileIndex.cpp:170
           I.first->second = mergeSymbol(I.first->second, Sym);
+        // We re-count number of references while merging refs from scratch.
+        I.first->getSecond().References = 0;
----------------
ioeric wrote:
> kadircet wrote:
> > ioeric wrote:
> > > kadircet wrote:
> > > > ioeric wrote:
> > > > > note that references might not always exist. SymbolCollector can be set up to not collect references.
> > > > I thought we wouldn't make any promises about `References` in such a case, do we?
> > > Is this documented somewhere?
> > > 
> > > `References` and `RefSlab` have been two independent things. `References` existed before `RefSlab`. Now might be a good time to unify them, but I think we should think about how this is reflected in their definitions (documentation etc) and other producers and consumers of the structures.
> > Both of them are only produced by SymbolCollector, which has an option `CountReferences` with a comment saying `// Populate the Symbol.References field.`.
> > Unfortunately that's not what it does, instead it always sets `Symol.References` field to `1` for every symbol collected during indexing of a single TU.
> > Then it is up to the consumers to merge those `1`s. So even though we say:
> > ```
> > /// The number of translation units that reference this symbol from their main
> > /// file. This number is only meaningful if aggregated in an index.```
> > in the comments of `Symbol::References` it is totally up-to-the consumer who performs the merging.
> > 
> > The option controls whether Symbol Occurences should be collected or not.
> > 
> > I see two possible options:
> >  - Un-tie SymbolCollector from Symbol::References and rather define it as:
> > ```
> > /// This field is for the convenience of an aggregated symbol index
> > ```
> >  - Rather change Index structure's(consumers of this information) to store(and build) this information internally if they plan to make use of it ?
> > 
> > WDYT?
> > Unfortunately that's not what it does, instead it always sets Symol.References field to 1 for every symbol collected during indexing of a single TU.
> Looking at the code, I think `Symol.References` is indeed only populated (i.e. set to 1) when `CountReferences` is enabled. Honestly, we should probably get rid of `CountReferences`; I don't see a reason not to count references. 
> 
> But the statement `count one per TU` still stands in SymbolCollector, as it always only handles one TU at a time. 
> 
> Actually, all I'm asking in the initial comment is to define the `References` counting behavior when there is no reference ;) I think a reasonable solution is to keep the `references` untouched when there is no reference, so that `References` aggregation via `mergeSymbol` can still work (for one-TU-per-Slab use cases). 
> 
> And we should probably also document the new behavior e.g. `DuplicateHandling` in FileIndex.h seems to be a good place.
> Honestly, we should probably get rid of CountReferences; I don't see a reason not to count references
We introduced it to avoid getting partial counts in dynamic index. In the absence of the static index, they would skew ranking towards symbols used inside the files you have open, which is not really a good signal in the first place.

I'm not sure if that actually hurts much in practice, though. It could very possibly be the case that the added complexity isn't worth it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59481/new/

https://reviews.llvm.org/D59481





More information about the cfe-commits mailing list