[PATCH] D46943: [clangd] Boost scores for decls from current file in completion

Ilya Biryukov via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed May 23 10:03:59 PDT 2018

ilya-biryukov added a comment.

In https://reviews.llvm.org/D46943#1109199, @ioeric wrote:

> > - Symbols store their paths as URIs ⇒ we need to parse them in order to apply heuristics. We could avoid that by writing a version of header-matching that also works on URIs, but that would mean more complexity.
> Yeah, this is a bit unfortunate. It's probably OK to parse URIs; they are not that expensive after all (we might want to do some measurement though).

Yeah. I really wish we had microbenchmarks for things like completion.

>> - Merging quality signals from headers now requires an extra paramater: name of the source file. I wonder if it's worth extracting builders for symbol qualities into separate classes to keep the current nice signatures, i.e. `merge(Symbol& IndexResult)`.
> I'm not very familiar with `SymbolQualitySignals`. But if we decide to use main file name as a signal, it might make sense to pass it in through the constructor?

That's possible, but that essentially turns `SymbolQualitySignals` from a data class to a stateful builder.

>> - How should we match the header with the main file?  Our options are:
>>   - (proposed by Eric) use main file regex from clang-format for that. I'm not entirely sure it suits us well, since the regex is designed to work on paths inside #include directive, but we're getting ours from the Symbols and Sema AST Decls. Moreover, it means we're gonna read .clang-format files to get that style.
> I think the ".clang-format problem" is not specific to the header matching here. We would eventually need proper format style support in clangd anyway, as clangd provides formatting features (e.g. reformat and include insertion).

Yeah, `IncludeMainRegex` does look like a useful setting from clang-format. And maybe using clang-format settings is a good idea there. I'm just a bit weary of adding this logic in this change along with the completion changes.
So I'd go with a simple heuristic for starters to solve a problem at hand. Happy to improve it to use clang-format regex with a follow-up change if everyone agrees that's a good idea. I mostly feel uneasy about the added complexity to the current change.

Comment at: clangd/MatchingHeader.cpp:44
+  auto HeaderNoExt = llvm::sys::path::filename(Header);
+  return SourceNoExt.equals_lower(HeaderNoExt);
ioeric wrote:
> Why `equals_lower`?
A former-windows-developer habbit. I don't think it hurts in any way if we do `equals_lower` here, we'll merely work in those strange cases where the extensions are not lower-case.
This is also consistent with isHeaderFile/isSourceFile (moved from ClangdServer)

Comment at: clangd/MatchingHeader.h:1
+//===--- MatchingHeader.h - Match source and header files --------*- C++-*-===//
ioeric wrote:
> I wonder if we could merge this into Headers.h
Thanks, I totally forgot we have `Headers.h`.  Will move the helpers there.

Comment at: clangd/MatchingHeader.h:24
+/// header for a \p Source.
+bool isMainHeaderFor(PathRef Header, PathRef Source);
ioeric wrote:
> We might want to briefly explain what a matching header is and what the heuristics are.
My idea was to not elaborate before we agree on what those heuristics should be.
Given the heuristics are so simple and there are suggestions to change them, documenting the current behavior seems like a bad idea. I'd rather keep them a black box for now.
Does that make sense? Maybe add a comment that the heuristics are likely to change, so the users shouldn't rely on them too much?

Comment at: clangd/Quality.cpp:28-29
+struct DeclFiles {
+  bool AllDeclsInMainFile = false;
+  bool HasDeclsInMatchingHeader = false;
ioeric wrote:
> Could we merge these two flags into something like `IsDeclaredInMainHeaderOrFile`?
The two flags are built differently.
**Any** decl in the matching header gives a boost. Otherwise, **all** decls should be in the main file to get a boost.
The second one is built differently for the reasons outlined in the previous comments on why it might not be the best idea to boost completion items that have one of the inside the current file:
- It gives inconsistent ranking for different completion points (before/after declaration)
- The fact that a function has definition in the current file does not necessarily mean I want to call it more often than the others.

Comment at: clangd/Quality.cpp:40
+  assert(MainFile);
+  for (auto *Redecl : D->redecls()) {
+    auto Loc = SM.getSpellingLoc(Redecl->getLocation());
ioeric wrote:
> I wonder if it's still necessary to check all `redecls`. Would it be sufficient to check `D`, if `D` is the decl we referencing to?
I don't think it's sufficient. How could we compute the flags that this function returns by looking at just a single decl?

  rCTE Clang Tools Extra


More information about the cfe-commits mailing list