[PATCH] D44720: [clangd] Simplify fuzzy matcher (sequence alignment) by removing some condition checks.
Fangrui Song via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Mar 29 11:30:37 PDT 2018
MaskRay added inline comments.
================
Comment at: clangd/FuzzyMatch.cpp:340
+ A[WordN] = Miss;
+ for (int I = WordN, P = PatN; I > 0; I--) {
+ if (I == W)
----------------
sammccall wrote:
> MaskRay wrote:
> > sammccall wrote:
> > > sammccall wrote:
> > > > sammccall wrote:
> > > > > W is the right name in this file for a variable iterating over word indices, please don't change this.
> > > > > The new variable above could be EndW or so?
> > > > As far as I can see, this loop is setting `A[W+1:...] = Miss` and populating `A[0...W]` with the exsting logic.
> > > > I think this would be clearer as two loops, currently there's a lot of conditionals around Last that obscure what's actually happening.
> > > You've shifted P (and the old W, new I) by 1. This does reduce the number of +1 and -1 in this function, but it's inconsistent with how these are used elsewhere: P should be the index into Pat of the character that we're considering.
> > I don't understand the rationale not to use the shifted indices. The code actually use `Scores[P][W][*]` to mean the optimal match of the first `P` characters of the pattern with the first `W` characters of the word, not the position of the character.
> >
> > On the other hand, C++ reverse iterators use the shifted one for `for (I = rend(); I != rbegin(); ++I)`. The shifted one makes ending condition check easier.
> > I don't understand the rationale not to use the shifted indices
> The rationale is entirely consistency with the surrounding code. The consistency helps avoid off-by-one errors when similar loops have different conventions.
>
> In this file, when looping over word or pattern dimensions, P and W respectively are used for loop variables, and can be interpreted as indices into Pat/Word.
> Here the interpretation would be "did we match or miss character Word[W]"
`Scores[P][W][*]` is interpreted as how good it is if we align the first `P` characters of the pattern with the first `W` characters of the word.
Note the code uses `number of characters` instead of the position.
Here the new interpretation would be "what we should do for the last character of the first W characters"
Repository:
rCTE Clang Tools Extra
https://reviews.llvm.org/D44720
More information about the cfe-commits
mailing list