[PATCH] D48441: [clangd] Incorporate transitive #includes into code complete proximity scoring.
Eric Liu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Jun 22 03:32:24 PDT 2018
ioeric added a comment.
Looks great! Thanks for doing this!
The algorithm looks pretty efficient. It might still make sense to collect some performance numbers for real files with potentially large #include tree (we have plenty of those in our internal codebase :)
================
Comment at: clangd/CodeComplete.cpp:908
+ llvm::Optional<IncludeInserter> Inserter; // Available during runWithSema.
+ llvm::Optional<URIDistance> FileProximity; // Initialized once Sema runs.
----------------
It might make sense to briefly explain somewhere how computational cost is distributed across the workflow.
================
Comment at: clangd/CodeComplete.cpp:938
+ SemaCCInput.FileName, SemaCCInput.Contents,
+ format::getLLVMStyle(), // XXX
+ SemaCCInput.Command.Directory,
----------------
nit: `Style`?
================
Comment at: clangd/FileDistance.cpp:72
+ // Fill in all ancestors between the query and the known.
+ for (unsigned I = 0; I < Ancestors.size(); ++I) {
+ if (Cost != kUnreachable)
----------------
nit: this loop would probably be easier to understand if iterated reversely.
================
Comment at: clangd/FileDistance.cpp:86
+ return R.first->getSecond();
+ if (auto U = clangd::URI::parse(URI)) {
+ LLVM_DEBUG(dbgs() << "distance(" << URI << ") = distance(" << U->body()
----------------
This is done for every index symbol, so I wonder if we could avoid parsing URIs by assuming some URI structures.
================
Comment at: clangd/FileDistance.h:56
+
+ FileDistance(llvm::StringMap<unsigned> Roots,
+ const FileDistanceOptions &Opts = {});
----------------
nit: It's unclear what the values of Roots are (it can be confused with tree roots or directory roots).
================
Comment at: clangd/Headers.cpp:28
+ : SM(SM), Out(Out) {
+ SM.getMainFileID();
+ }
----------------
What does this do?
================
Comment at: clangd/Headers.h:62
+ std::vector<Inclusion> MainFileIncludes;
+ llvm::StringMap<unsigned> includeDepth(llvm::StringRef MainFileName) const;
+
----------------
Does this return all transitive includes in the entire TU or just those from `MainFileName`? And what's `MainFileName` here? Is it the same as the main file in a TU?
================
Comment at: clangd/Headers.h:67
+ llvm::StringRef IncludedName,
+ llvm::StringRef IncludedRealName);
+
----------------
The real name can be empty. How do we handle empty path?
================
Comment at: clangd/Headers.h:75
+ // The paths we want to expose are the RealPathName, so store those too.
+ unsigned fileNumber(llvm::StringRef Name);
+ llvm::StringMap<unsigned> NameToIndex; // Values are file numbers.
----------------
Maybe `fileIndex`?
================
Comment at: clangd/Quality.cpp:233
+std::pair<float, unsigned> proximityScore(llvm::StringRef SymbolURI,
+ URIDistance *D) {
----------------
nit: `static`
================
Comment at: unittests/clangd/FileDistanceTests.cpp:36
+ // Ancestor (up+up+up+up)
+ EXPECT_EQ(D.distance("/"), 22u);
+ // Sibling (up+down)
----------------
I think this is an interesting case. IIUC, 22u is contributed by 4 ups (4*5) and 1 include (1*2, via `llvm/ADT/StringRef.h`).
Suppose `a/b/c/d/e/f.cc` includes `base/x.h`, then the shortest path into directory `other/` is likely to go via `base/x.h`. This might become a problem if `base/x.h` is very commonly used. One solution is probably to penalize edits that hit the root.
Repository:
rCTE Clang Tools Extra
https://reviews.llvm.org/D48441
More information about the cfe-commits
mailing list