[PATCH] D50385: [clangd] Collect symbol occurrences from AST.

Thu Aug 9 03:36:41 PDT 2018

ioeric added a comment.

In https://reviews.llvm.org/D50385#1193545, @hokein wrote:

> In https://reviews.llvm.org/D50385#1191914, @ioeric wrote:
>
> > 2 high-level questions:
> >
> > 1. What's the reason for having a separate `SymbolOccurrenceSlab`? Could store occurrences as extra payload of `Symbol`?
>
>
> Storing occurrences in `Symbol` structure is easy to misuse by users IMO -- if we go through this way, we will end up having a `getOccurrences`-like method in `Symbol` structure. Once users get the `Symbol` instance, it is natural for them to call `getOccurrences` to get all occurrences of the symbol. However this `getOccurrences` method doesn't do what users expected (just returning an incomplete set of results or empty). To query the symbol occurrences, we should always use index interface.
>
> Therefore, I think we should try to avoid these confusions in the design.

Hmm, I think this is the same for other symbol payload e.g. definition can be missing for a symbol. And it seems to me that the concern is on the SymbolSlab level: if a slab is for a single TU, users should expect missing information; if a slab is merged from all TUs, then users can expect "complete" information. I think it's reasonable to assume that users of SymbolSlab are aware of this. I think it's probably not worth the overhead of maintaining and using two separate slabs.

>> 2. Could we merge `SymbolOccurrenceCollector` into the existing `SymbolCollector`? They look a lot alike. Having another index data consumer seems like more overhead on the user side.
> 
> The `SymbolOccurrenceCollector` has many responsibilities (collecting declaration, definition, code completion information etc), and the code is growing complex now. Merging the `SymbolOccurrenceCollector` to it will make it more

Although the existing `SymbolCollector` supports different options, I think it still has a pretty well-defined responsibility: gather information about symbols. IMO, cross-reference is one of the property of symbol, and I don't see strong reasons to keep them separated.

> complicated -- we will introduce more option flags like `collect-symbol-only`, `collect-occurrence-only` to configure it for our different use cases (we need to the implementation detail clearly in order to make a correct option for `SymbolCollector`).

I think these options are reasonable if they turn out to be necessary. And making the SymbolCollector more complicated doesn't seem to be a problem if we are indeed doing more complicated work, but I don't think this would turn into a big problem as logic of xrefs seems pretty isolated.  Conversely, I think implementing xrefs in a separate class would likely to cause more duplicate and maintenance, e.g. two sets of options, two sets of initializations or life-time tracking of collectors (they look a lot alike), the same boilerplate factory code in tests, passing around two collectors in user code.

> And I can foresee these two collectors might be run at different point (runWithPreamble vs runWithAST) in dynamic index.

With some options, this should be a problem I think?

Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D50385