[PATCH] D85426: [clangd] Implement FileFilter for the indexer

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Aug 7 02:55:13 PDT 2020


sammccall added a comment.

In D85426#2199501 <https://reviews.llvm.org/D85426#2199501>, @kbobyrev wrote:

> @sammccall This is the change I was talking about during the standup: the code looks legit, but `BackgroundIndexTests.IndexTwoFiles` fails because with the following code path in `Background.cpp`'s `FileFilter`:

What location is invalid, and why?



================
Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:643
   assert(Loc.isValid() && "Invalid source location for NamedDecl");
-  // FIXME: use the result to filter out symbols.
-  shouldIndexFile(SM.getFileID(Loc));
+  if (!shouldIndexFile(SM.getFileID(Loc)))
+    return nullptr;
----------------
kbobyrev wrote:
> hokein wrote:
> > A drive-by comment from D84811: the file granularity vs symbol granularity is tricky here.
> > 
> > Note that a *full* symbol (with declaration, definition, etc) may be formed from different files (.h, .cc), thinking of a following case:
> > 
> > ```
> > // foo.h
> > void func();
> > 
> > // user.cc
> > #include "foo.h"
> > 
> > // foo.cc
> > #include "foo.h"
> > void func() {}
> > ```
> > 
> > if our indexer indexes `user.cc` first, then `foo.h` is considered indexed, later when indexing `foo.cc`, we will skip the `func` symbol. so the symbol `foo` will not have definition.
> >  
> > 
> Oh, you're right, good catch! That's why the post-filtering would probably work but maybe won't be as fancy :(
> 
> Thank you for mentioning it!
This is indeed happening in the wrong place, I think, but postfiltering isn't the only solution.

The FileFilter refers to what files you want to index, not which symbols. So I think you want to make the decision in handle*Occurrence, based on the location of the occurrence, not the location of the declaration it refers to.

By the time you get to addDefinition/addDeclaration it really is too late, the caller has already decided that we want to index this symbol and which declaration should be used. (BTW, weakening addDeclaration to return nullptr sometimes seems wrong for the same reason).

This is all very confusing, and we should rewrite the indexer to make it more explicit how data is combined, and less stateful. (More like a mapreduce)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85426/new/

https://reviews.llvm.org/D85426



More information about the cfe-commits mailing list