[PATCH] D123289: [clangd][SymbolCollector] Introduce a cache for SymbolID generation and some cleanups

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Apr 7 10:54:53 PDT 2022

sammccall accepted this revision.
sammccall added a comment.
This revision is now accepted and ready to land.


I think you've broken the FilesToTokenCache by moving it into the loop, so I'd expect further wins from fixing that.
(If you don't see any, something seems wrong: syntax::tokenize should be called a fair bit and should be pretty expensive!)

Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:589
+    if (Opts.RefsInHeaders || FID == SM.getMainFileID()) {
+      // FIXME: It's better to use TokenBuffer by passing spelled tokens from
+      // the caller of SymbolCollector.
there's a big block of code here that's checking if the reference was spelled or not, pull out a function?

Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:591
+      // the caller of SymbolCollector.
+      llvm::DenseMap<FileID, std::vector<syntax::Token>> FilesToTokensCache;
+      if (!FilesToTokensCache.count(FID))
Um, is this cache meant to be a member? It's pretty well guaranteed to be empty on the next line :-)

Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:710
     // FIXME: Populate container information for macro references.
-    MacroRefs[ID].push_back({Loc, Roles, /*Container=*/nullptr});
+    // FIXME: All macro references are marked as Spelled now, but this should be
+    // checked.
What does a non-spelled macro reference look like?

Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:810
   // At the end of the TU, add 1 to the refcount of all referenced symbols.
   auto IncRef = [this](const SymbolID &ID) {
     if (const auto *S = Symbols.find(ID)) {
no need for this to be a lambda anymore, a plain loop seems fine

Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:814
if you have timing set up, try making this `const_cast<Symbol*>(S)->References++`.

Reinserting into a symbol slab is pretty expensive I think, and this is just an awkward gap in the API.
We could fix the API or live with const_cast if it matters.

Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:964
+SymbolID SymbolCollector::getSymbolIDCached(const Decl *D) {
+  auto It = DeclToIDCache.try_emplace(D, SymbolID{});
+  if (It.second)
nit: just try_emplace(D). The rest of the arguments get forwarded to the V constructor, so you're calling an unneccesary move constructor here.

(Shouldn't matter here because SymbolID is trivial, but it can)

Comment at: clang-tools-extra/clangd/index/SymbolCollector.h:174
   // Symbols referenced from the current TU, flushed on finish().
-  llvm::DenseSet<const NamedDecl *> ReferencedDecls;
-  llvm::DenseSet<const IdentifierInfo *> ReferencedMacros;
-  llvm::DenseMap<const NamedDecl *, std::vector<SymbolRef>> DeclRefs;
-  llvm::DenseMap<SymbolID, std::vector<SymbolRef>> MacroRefs;
+  llvm::DenseSet<SymbolID> ReferencedSymbols;
   // Maps canonical declaration provided by clang to canonical declaration for
hash_code(SymbolID) is not defined in the header, but the hash function is trivial and could be inlined everywhere. Might be worth exposing.

(not really related to this patch, but you have some setup for benchmarking at the moment...)

  rG LLVM Github Monorepo



More information about the cfe-commits mailing list