[PATCH] D51481: [clangd] Implement proximity path boosting for Dex

Eric Liu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Sep 5 07:26:36 PDT 2018


ioeric requested changes to this revision.
ioeric added a comment.
This revision now requires changes to proceed.

This should be the last round! ;)



================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:45
+std::vector<std::unique_ptr<Iterator>>
+createPathIterators(llvm::ArrayRef<std::string> ProximityPaths,
+                    llvm::ArrayRef<std::string> URISchemes,
----------------
To be clearer, maybe `createFileProximityIterators`?


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:50
+  // Deduplicate parent URIs extracted from the FuzzyFindRequest.
+  llvm::StringSet<llvm::MallocAllocator> UniqueURIs;
+  // Store all root URIs converted from the original FuzzyFindRequest absolute
----------------
nit: just `llvm::StringSet<>`?

`s/UniqueURIs/ParentURIs/`? As this is a set, we don't need to be explicit about uniqueness here. 

`s/FuzzyFindRequest/ProximityPaths` (in comment). FuzzyFindRequest shouldn't be relevant in this function, and we should avoid mentioning it here. same below.


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:51
+  llvm::StringSet<llvm::MallocAllocator> UniqueURIs;
+  // Store all root URIs converted from the original FuzzyFindRequest absolute
+  // paths within ProximityPath vector.
----------------
nit: maybe `// Use ProximityPaths as proximity sources`?


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:55
+  for (const auto &Path : ProximityPaths) {
+    const auto PathProximityURIs = generateQueryProximityURIs(Path, URISchemes);
+    // Use default distance parameters for the first URI in ProximityURIs, which
----------------
I'd double check `PathProximityURIs` is not empty.


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:62
+  }
+  // Use SymbolRelevanceSignals for symbol quality evaluation: use defaults for
+  // all parameters except for Proximity Path distance signal.
----------------
nit: this is `symbol *relevance* evaluation`


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:77
+      PathProximitySignals.SymbolURI = URI;
+      float BoostFactor = 1 + PathProximitySignals.evaluate();
+      BoostingIterators.push_back(createBoost(create(It->second), BoostFactor));
----------------
`evaluate()` should return an appropriate boosting score (<1 for down-boosting and >1 for up-boosting), so there is no need to plus 1 here.


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:182
+
+  auto Compare = [](IDAndScore LHS, IDAndScore RHS) {
+    return LHS.second > RHS.second;
----------------
`const IDAndScore&`?


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:244
+         "URI.");
+  const auto Scheme = ParsedURI->scheme();
+  const auto Authority = ParsedURI->authority();
----------------
nit: `Scheme` and `Authority` are both used once. I'd just inline them.


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:246
+  const auto Authority = ParsedURI->authority();
+  StringRef Path = ParsedURI->body();
+  // FIXME(kbobyrev): Currently, this is a heuristic which defines the maximum
----------------
nit: I'd call this `Body` for clarity.


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.h:98
+/// Should be used within the index build process.
+std::vector<std::string> generateProximityURIs(llvm::StringRef Path);
+
----------------
nit: s/Path/URIPath/


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.h:98
+/// Should be used within the index build process.
+std::vector<std::string> generateProximityURIs(llvm::StringRef Path);
+
----------------
ioeric wrote:
> nit: s/Path/URIPath/
nit: maybe mention that these functions are exposed for testing only?


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.h:104
+std::vector<std::string>
+generateQueryProximityURIs(llvm::StringRef AbsolutePath,
+                           llvm::ArrayRef<std::string> URISchemes);
----------------
How about changing this one to just convert `AbsolutePath` to a `URI` of the preferred scheme and having the users call `generateProximityURIs`? This would simply the contract. And it can probably be a library function (e.g. `makePreferredURI(AbsPath, URISchemes)`) in URI.h.


================
Comment at: clang-tools-extra/unittests/clangd/DexIndexTests.cpp:430
+  EXPECT_THAT(generateProximityURIs(
+                  "unittest:///clang-tools-extra/clangd/index/Token.h"),
+              ElementsAre("unittest:///clang-tools-extra/clangd/index/Token.h",
----------------
Could you add a case where `generateProximityURIs` generates fewer than 5 proximity URIs? 


================
Comment at: clang-tools-extra/unittests/clangd/DexIndexTests.cpp:601
+
+  Symbol RootSymbol;
+  RootSymbol.Name = Name;
----------------
Could you add a helper (e.g. lambda) for creating the symbol? The code looks almost the same.

Alternatively, this can be:
```
auto Sym1 = symbol("na::X");
Sym1.CanonicalDeclaration.FileURI = "unittest:///...";
auto Sym2 = symbol("nb::X");
Sym1.CanonicalDeclaration.FileURI = "unittest:///..."
```

You can simply spell out the URI paths because they are platform agnostic.


================
Comment at: clang-tools-extra/unittests/clangd/DexIndexTests.cpp:623
+
+  auto I = DexIndex::build(std::move(Builder).build(), URISchemes);
+
----------------
We could use the constructor that doesn't take ownership e.g. `DexIndex({Sym1, Sym2}, URISchemes)`


================
Comment at: clang-tools-extra/unittests/clangd/DexIndexTests.cpp:627
+  Req.Query = "abc";
+  // Return only one element, because the order of callback calls is not
+  // specified: items with lower score can be matched first. Therefore, to check
----------------
nit: or `// The best candidate can change depending on the proximity paths.`?


================
Comment at: clang-tools-extra/unittests/clangd/DexIndexTests.cpp:636
+  llvm::SmallString<30> RequestPath(testRoot());
+  llvm::sys::path::append(RequestPath, "/a/b/c/d/e/f/file.h");
+  Req.ProximityPaths = {RequestPath.str()};
----------------
This is not going to work on windows. You would need something like:
```
append(RequestPath, "a")
append(RequestPath, "b")
```

And you probably don't need a very deep path.


================
Comment at: clang-tools-extra/unittests/clangd/DexIndexTests.cpp:638
+  Req.ProximityPaths = {RequestPath.str()};
+  EXPECT_THAT(matchDeclURIs(*I, Req), ElementsAre(CloseFileURI->toString()));
+
----------------
I think `matchDeclURIs` is not necessary as we could use different qualified names?


================
Comment at: clang-tools-extra/unittests/clangd/TestIndex.h:45
+// otherwise. Returns filename URIs of matched symbols canonical declarations.
+std::vector<std::string> matchDeclURIs(const SymbolIndex &I,
+                                       const FuzzyFindRequest &Req,
----------------
(and this shouldn't be necessary)


https://reviews.llvm.org/D51481





More information about the cfe-commits mailing list