[PATCH] D40548: [clangd] Symbol index interfaces and index-based code completion.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Dec 13 04:27:49 PST 2017


sammccall added a comment.

I think this patch is too big, and there are several separate pieces here:

- the index interface, and the implementation of the dynamic symbol index (these probably belong together). We're missing unit tests for the implementation.
- building and updating the dynamic symbol index in clangd
- code completion based on a symbol index

As a consequence I'm finding it pretty intimidating to review, I've tried to leave the main high-level comments.



================
Comment at: clangd/ClangdIndex.h:24
+
+/// \brief This manages symbols from source files that can be updated or
+/// removed. Symbols are used by a symbol index which builds indexes on symbols
----------------
nit: maybe call this out as a container? that tells a lot about what it does/doesn't do.

e.g. `\brief A container of Symbols from several source files. It can be updated at source-file granularity, replacing all symbols from one file with a new set.` ...


================
Comment at: clangd/ClangdUnit.cpp:854
                          NewAST->getDiagnostics().end());
+      if (That->FileSyms) {
+        auto Symbols =
----------------
Having clangdunit know how the index is maintained feels like a layering violation.

Can we instead have clangdunit take a callback (`UniqueFunction<void(ParsedAST&)>` or so) that gets invoked after each successful build? That can be the "hook" where we do indexing.

The common higher layer (ClangdServer) then just has to have the glue code tying the ClangdUnit API to the Index functionality.


================
Comment at: clangd/ClangdUnitStore.h:29
 public:
+  explicit CppFileCollection(FileSymbols *FileSyms) : FileSyms(FileSyms) {}
+
----------------
Similarly here, this can take a post-build callback of void(Filename, ParsedAST&) that is translated into per-cppfile callbacks. Then it doesn't need to know about FileSymbols.

(Or ParsedAST*, with nullptr meaning this file is gone)


================
Comment at: clangd/CodeComplete.h:50
 
+  /// Add symbols in namespace context (including global namespace) to code
+  /// completion.
----------------
I find the config changes pretty confusing.
CodeComplete options is for options we want to offer users. This change doesn't seem motivated by a user requirement, but rather by implementation details.
If we're going to do that, we should be clear about it, rather than just expose every option of clang::CodeCompletionOptions we might want to set ourselves

If we do want to ensure that we can do clangd::CodeCompleteOptions -> clang::CodeCompleteOptions without extra args:

    CodeCompleteOptions {
      ...
      const Index* Index = nullptr; // Populated internally by clangd, do not set
    }

then we can use the presence of index to determine whether to restrict what we ask Sema for.
(We could also let users override the index used for code completion this way if we want, but I don't see a use case)


================
Comment at: clangd/index/FileIndex.h:62
+/// whose symbols can be easily managed in memory.
+class FileSymbolsMemIndex : public SymbolIndex {
+public:
----------------
I think it'd be possible and nice to decouple `MemIndex` from FileSymbols, such that we can use it for the on-disk index (which is one big slab).

At the moment it seems like an unhappy mix - we don't actually get any incrementality, but we're coupled to the incremental storage.

I think it's worth addressing now as MemIndex should be named differently and move to another file if this is the case.

One way this could work:
  class MemIndex {
    shared_ptr<vector<const Symbol*>> Symbols;

    // Symbols must remain accessible as long as S is kept alive.
    build(shared_ptr<vector<const Symbol*>> S) {
      // TODO: build smarter index structures later
      lock_guard
      Symbols = move(S); // releases old symbols
      // TODO: move smart index structures
    }
  }

  class FileSymbols {
    // The shared_ptr keeps the symbols alive
    shared_ptr<vector<const Symbol *>> allSymbols() {
      struct Snapshot {
        vector<const Symbol*> Pointers;
        vector<shared_ptr<SymbolSlab>> KeepAlive;
      };
      auto Snap = make_shared<Snapshot>();
      for (Slab in FileToSymbols) {
        Snap->KeepAlive.push_back(Slab);
        for (Sym in *Slab)
          Snap.Pointers.push_back(&Sym);
      }
      return shared_ptr<vector<const Symbol *>>(move(Snap), &Snap->Pointers);
    }
  }

  // In ClangdServer, when getting new symbols for a file
  FileSymbols.update(Path, Collector.takeSymbols());
  DynamicIndex.build(FileSymbols.allSymbols());

This seems pretty nice, but vector<shared_ptr<SymbolSlab>> would work in practice for the interface too, if you don't like aliasing shared_ptr tricks.


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D40548





More information about the cfe-commits mailing list