[PATCH] D51481: [clangd] Implement proximity path boosting for Dex
Eric Liu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Sep 3 05:32:39 PDT 2018
ioeric added inline comments.
================
Comment at: clang-tools-extra/;:1
+//===--- Token.cpp - Symbol Search primitive --------------------*- C++ -*-===//
+// The LLVM Compiler Infrastructure
----------------
Unintended change?
================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:137
+ BoostingIterators.push_back(
+ createBoost(create(It->second), PARENTS_TO_BOOST + 1 - P.second));
+ }
----------------
ioeric wrote:
> `PARENTS_TO_BOOST + 1 - P.second` seems to be too much boosting. We should align the boosting function with the one we use in clangd/Quality.cpp, e.g. proximity score should be would be in the range [0, 1], and we can boost by `1+proximity`.
The `FIXME` is now outdated?
================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:143
+ float BoostFactor =
+ std::exp(Distance * 0.4f / FileDistanceOptions().UpCost);
+ BoostingIterators.push_back(
----------------
`x` in `std::exp(x)` should be negated so that the range is (0, 1]. And `BoostFactor` should be `1+(0,1]`.
I'm not sure if dividing by UpCost is correct here (and similarly in Quality.cpp), because you would want lower score for larger UpCost?
================
Comment at: clang-tools-extra/clangd/index/dex/Token.cpp:36
+ size_t Limit = 5;
+ while (!Path.empty() && Limit--) {
+ // FIXME(kbobyrev): Parsing and encoding path to URIs is not necessary and
----------------
For the original URI, we could simply add it to the result outside of the loop (and `--Limit`) to save one iteration?
================
Comment at: clang-tools-extra/clangd/index/dex/Token.cpp:54
+ // should be just skipped.
+ if (!static_cast<bool>(URI)) {
+ // Ignore the error.
----------------
nit: just `if (!URI)`
================
Comment at: clang-tools-extra/clangd/index/dex/Token.h:64
+ /// Example: "file:///path/to/clang-tools-extra/clangd/index/SymbolIndex.h"
+ /// and all its parents.
+ PathURI,
----------------
nit: not necessarily *all* parents right?
================
Comment at: clang-tools-extra/clangd/index/dex/Token.h:65
+ /// and all its parents.
+ PathURI,
/// Internal Token type for invalid/special tokens, e.g. empty tokens for
----------------
Maybe just `URI`? Or `ProximityURI`?
================
Comment at: clang-tools-extra/clangd/index/dex/Token.h:96
+/// Should be used within the index build process.
+std::vector<Token> generateProximityPathURIs(llvm::StringRef Path);
+
----------------
nit: s/Path/URIStr/ for the argument.
nit: Just `generateProximityURIs`? Same below.
================
Comment at: clang-tools-extra/unittests/clangd/DexIndexTests.cpp:443
+TEST(DexSearchTokens, QueryProximityDistances) {
+ EXPECT_THAT(generateQueryProximityPathURIs(testRoot() + "/a/b/c/d/e/f/g.h",
+ URISchemes),
----------------
This doesn't seem to work on windows?
================
Comment at: clang-tools-extra/unittests/clangd/DexIndexTests.cpp:620
+TEST(DexIndexTest, ProximityPathsBoosting) {
+ DexIndex I(URISchemes);
----------------
It's unclear what this is testing. Intuitively, you would want a smoke test that checks a symbol with better proximity is ranked higher (when all other signals are the same).
================
Comment at: clang-tools-extra/unittests/clangd/DexIndexTests.cpp:653
+ // FuzzyFind request comes from the file which is far from the root.
+ Req.ProximityPaths.push_back(testRoot() + "/a/b/c/d/e/f/file.h");
+
----------------
again, watch out for windows when hard-coding paths :)
================
Comment at: clang-tools-extra/unittests/clangd/TestFS.cpp:67
-const char *testRoot() {
+std::string testRoot() {
#ifdef _WIN32
----------------
Why this change?
https://reviews.llvm.org/D51481
More information about the cfe-commits
mailing list