[PATCH] D40548: [clangd] Symbol index interfaces and index-based code completion.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Dec 7 09:33:13 PST 2017


sammccall added inline comments.


================
Comment at: clangd/ClangdIndex.h:24
+/// (re-)scoring symbols from different indexes.
+class CombinedSymbolIndex : public SymbolIndex {
+public:
----------------
This seems a little premature to me - it's hard to know if this idea makes sense without actually having multiple implementations to mix together. My gut feeling is that sorting by (index weight, symbol score) puts too much emphasis on the weight.

Can we just have a single optional index for now?


================
Comment at: clangd/ClangdIndex.h:36
+
+  void addSymbolIndex(llvm::StringRef Label, WeightedIndex Index);
+
----------------
If you do keep this class, move the label inside the struct? Avoids a lot of std::pair


================
Comment at: clangd/ClangdIndex.h:50
+// relatively small symbol table built offline.
+class InMemoryIndexSourcer {
+public:
----------------
As discussed offline - the lifetime of the contained symbols and the operations on the index derived from it need to be carefully considered.

Involving inheritance here seems likely to make these tricky interactions even trickier.

Would suggest:
 - a concrete class (`SymbolCache`? `FileSymbols`?) to hold a collection of symbols from multiple files, with the ability to iterate over them and replace all the symbols from one file at once
 - a concrete class `MemIndex` that can be constructed from a sequence of symbols and implements `Index`

You probably want to make them both immutable with snapshot semantics, or have a reader-writer lock that spans both.


================
Comment at: clangd/ClangdIndex.h:72
+   /// collected.
+   void update(PathRef Path, ASTContext &Context,
+               std::shared_ptr<Preprocessor> PP,
----------------
The AST -> symbol conversion doesn't seem to have much to do with the rest of the class - we can move this to a free function I think, which would give the class a narrower responsibility.


================
Comment at: clangd/ClangdIndex.h:1
+//===--- ClangdIndex.h - Symbol indexes for clangd.---------------*- C++-*-===//
+//
----------------
nit: `Clangd` prefix doesn't do much.
I'd suggest `Index.h` for the interface (including Symbol), and `MemIndex.h` for the in-memory implementations?


================
Comment at: clangd/ClangdLSPServer.h:37
+      llvm::Optional<Path> CompileCommandsDir,
+      std::vector<
+          std::pair<llvm::StringRef, CombinedSymbolIndex::WeightedIndex>>
----------------
ilya-biryukov wrote:
> Maybe pass a parameter of type `SymbolIndex&` instead of a vector, which is used to create `CombinedSymbolIndex` later?
> It seems that `ClangdServer` does not really care about the specific index implementation that's used (nor should it!)
> 
> We could have helper methods to conveniently create `CombinedSymbolIndex` from a list of indexes, or even create the default index for clangd. 
I think the issue is that ClangdServer is going to create and maintain the dynamic index, which should then be merged as a peer with a set of other indexes.

Can we just pass in a single SymbolIndex* StaticIdx, and hard-code the assumption about static index + dynamic index being what we're using? That way we can also hard-code the heuristics for merging them together, instead of data-driving everything.


================
Comment at: clangd/ClangdLSPServer.h:38
+      llvm::Optional<Path> CompileCommandsDir,
+      bool EnableIndexBasedCodeCompletion,
+      std::vector<
----------------
this is kind of two options in one - "should we build the index" and "should we query the index".

That seems OK, but I'd consider renaming this to BuildDynamicSymbolIndex or something to make it clear that building is optional, querying it isn't provided as an option


================
Comment at: clangd/CodeComplete.cpp:378
+        // FIXME: output a warning to logger if there are results from sema.
+        return qualifiedIdCompletionWithIndex(*Index, S, **OptSS, &Items);
+      }
----------------
I don't like the idea of doing index lookups from the middle of a sema callback!

Can we just record the CCContext  in the Collector instead, and do this work afterwards?


================
Comment at: clangd/CodeComplete.h:71
 /// Get code completions at a specified \p Pos in \p FileName.
+/// If `Index` is not nullptr, this will use the index for global code
+/// completion; otherwise, use sema code completion by default.
----------------
I'm not sure you want to spell out this behavior - merging vs replacing is more of an implementation detail


================
Comment at: clangd/SymbolIndex.h:45
+
+class SymbolIndex {
+public:
----------------
Interfaces need doc :-)


================
Comment at: clangd/SymbolIndex.h:49
+
+  virtual llvm::Expected<CompletionResult>
+  complete(const CompletionRequest &Req) const = 0;
----------------
As discussed, we may want a callback-based API that makes it easier for an implementation to manage the lifetime of symbols without copying them.

Also is there anything to do with errors other than log and treat as no results? If not, just do that instead of returning Expected.


================
Comment at: clangd/SymbolIndex.h:50
+  virtual llvm::Expected<CompletionResult>
+  complete(const CompletionRequest &Req) const = 0;
+
----------------
This is finding symbols that fuzzy-match a string, right?
We shouldn't conflate this with code completion - code completion is context-sensitive, and this query operation will be used for other operations like navigate-to-symbol.

Suggest `fuzzyFind` or similar.
(Similarly with CompletionRequest, CompletionSymbol)


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D40548





More information about the cfe-commits mailing list