[PATCH] D59300: [clangd] Tune the fuzzy-matching algorithm
Ilya Biryukov via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Mar 15 04:13:04 PDT 2019
ilya-biryukov added inline comments.
Comment at: clang-tools-extra/clangd/FuzzyMatch.cpp:285
+ if (Pat[P] == Word[W] ||
+ (WordRole[W] == Head && (IsPatSingleCase || PatRole[P] == Head)))
> ilya-biryukov wrote:
> > ioeric wrote:
> > > could you explain the intention of this change? Is it relevant to other changes in the patch? The original looks reasonable to me.
> > The motivation is keeping a bonus for matching the head of a segment even in case of single-case patterns that lack segmentation signals.
> > Previously, `p` from `up` would not get this bonus when matching `unique_[p]tr` even though intuitively it should as we do want to encourage those matches.
> so, iiuc, when `IsPatSingleCase` is true, it means that any character in the pattern can potentially be a head? could you add comment for this?
> We also had a stricter condition for `Pat[P] == Word[W]`, but now `ut` would get a bonus for `[u]nique_p[t]r`. Is this intended?
> so, iiuc, when IsPatSingleCase is true, it means that any character in the pattern can potentially be a head? could you add comment for this?
Exactly, we don't have any segmentation signals in a pattern and we don't want small drops in scores on each match a segment head, e.g. there's no reason matching `c` should have smaller bonus than `r` when matching `My[StrC]at` with `strc`.
Added a comment.
> We also had a stricter condition for Pat[P] == Word[W], but now ut would get a bonus for [u]nique_p[t]r. Is this intended?
This one is tricky and I'm less sure of it. I think the important bit is to give this bonus to `t` in case of `[u]nique_[pt]r`.
Otherwise we're making a match of the non-head character worse for no reason. I think the real problem in the previous implementation was its preference of prefix matches (the `P == W` condition), which gave an unfair advantage to `[up]per_bound` over `[u]nique_[p]tr`.
Matches in the middle of a segment are harshly penalized in the next line, so this small bonus does not seem to matter in practice for the case you mentioned. Maybe there's a better way to do this.
Comment at: clang-tools-extra/clangd/FuzzyMatch.cpp:273
if (WordRole[W] == Head) // Skipping a segment.
- S += 1;
- if (Last == Match) // Non-consecutive match.
- S += 2; // We'd rather skip a segment than split our match.
- return S;
+ return 1; // We want to keep it lower than the bonus for a consecutive
+ // match.
> nit: move this comment into into own line now that it's taking two?
Shrinked it to fit into a single line.
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
More information about the cfe-commits