[PATCH] D48441: [clangd] Incorporate transitive #includes into code complete proximity scoring.
Eric Liu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Jun 28 04:53:09 PDT 2018
ioeric added inline comments.
================
Comment at: clangd/FileDistance.cpp:9
+//===----------------------------------------------------------------------===//
+//
+//
----------------
Is this intentionally reserved?
================
Comment at: clangd/FileDistance.cpp:51
+ for (StringRef Rest = Canonical; !Rest.empty();) {
+ Rest = parent_path(Rest, sys::path::Style::posix);
+ auto NextHash = hash_value(Rest);
----------------
nit: this probably deserve a comment or a better formatting. It looks like part of the loop incremental.
================
Comment at: clangd/FileDistance.cpp:86
+ auto Canonical = canonicalize(Path);
+ unsigned Cost = kUnreachable;
+ SmallVector<hash_code, 16> Ancestors;
----------------
Should we also check the cache here?
================
Comment at: unittests/clangd/FileDistanceTests.cpp:36
+ // Ancestor (up+up+up+up)
+ EXPECT_EQ(D.distance("/"), 22u);
+ // Sibling (up+down)
----------------
sammccall wrote:
> ioeric wrote:
> > I think this is an interesting case. IIUC, 22u is contributed by 4 ups (4*5) and 1 include (1*2, via `llvm/ADT/StringRef.h`).
> >
> > Suppose `a/b/c/d/e/f.cc` includes `base/x.h`, then the shortest path into directory `other/` is likely to go via `base/x.h`. This might become a problem if `base/x.h` is very commonly used. One solution is probably to penalize edits that hit the root.
> Yeah, this could be a weakness. I don't think this is specific to #includes, or to root directories.
> I think the generalization is that there exist directories (like /usr/include) whose siblings are unrelated. Down-traversals in these directories should be more expensive. I've added a Caveats section to the FileDistance documentation, but I would like to avoid adding special cases until we can see how much difference they make.
As chatted offline, this could be still concerning as it's not uncommon for files to include headers from top-level directories. I think the idea you proposed - restricting up traversals from included files, could address the problem.
Repository:
rCTE Clang Tools Extra
https://reviews.llvm.org/D48441
More information about the cfe-commits
mailing list