[PATCH] D77732: [clangd] Shard preamble symbols in dynamic index

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Apr 9 14:43:27 PDT 2020


sammccall added a comment.

Can't believe we didn't do this before. Nice catch.



================
Comment at: clang-tools-extra/clangd/index/FileIndex.cpp:97
+/// paths used by \p FileSyms.
+void shardSlabToFiles(const SymbolSlab &Syms, const RelationSlab &Rels,
+                      FileSymbols &FileSyms, llvm::StringRef HintPath) {
----------------
This screams out to be shared with BackgroundIndex::update(). Is it very hard to do?

Things that are different:
 - no duplication of symbol into defining file (why?)
 - refs not gathered here (but I guess empty input -> empty output?)
 - include graph not gathered here (same)
 - this holds 3 copies of all the data at peak, other impl holds 2

I imagine factoring out a function that returns a StringMap<File> like in BackgroundIndex::Update, but File has a build() function that returns an IndexFileIn or so.
(Ooh, is depending on IndexFileIn a layering violation? Maybe we should move that into another header. Duplicating the struct is probably OK for now)


================
Comment at: clang-tools-extra/clangd/index/FileIndex.cpp:112
+      auto URIPath = URI::resolve(FileURI, HintPath);
+      assert(URIPath);
+      I.first->second = *URIPath;
----------------
um... I guess that works, because Expected is unchecked in NDEBUG?
I think cantFail is clearer though.


================
Comment at: clang-tools-extra/clangd/index/FileIndex.cpp:331
       MainFileSymbols.buildIndex(IndexType::Light, DuplicateHandling::Merge));
+  vlog("Build dynamic index for mainfile symbols with estimated memory usage "
+       "of {0} bytes",
----------------
nit: main-file


================
Comment at: clang-tools-extra/clangd/index/FileIndex.h:119
   //
-  // FIXME: Because the preambles for different TUs have large overlap and
-  // FileIndex doesn't deduplicate, this uses lots of extra RAM.
-  // The biggest obstacle in fixing this: the obvious approach of partitioning
-  // by declaring file (rather than main file) fails if headers provide
-  // different symbols based on preprocessor state.
+  // FIXME: We ignore symbol resulting from different PP states while parsing a
+  // file and store only the last one.
----------------
This seems like a caveat rather than a FIXME - we're not really proposing a fix are we?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77732/new/

https://reviews.llvm.org/D77732





More information about the cfe-commits mailing list