[PATCH] D44720: [clangd] Simplify fuzzy matcher (sequence alignment) by removing some condition checks.
Fangrui Song via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Mar 29 11:25:21 PDT 2018
MaskRay added inline comments.
================
Comment at: clangd/FuzzyMatch.cpp:230
void FuzzyMatcher::buildGraph() {
+ Scores[0][0][Miss] = Scores[0][0][Match] = {0, Miss};
for (int W = 0; W < WordN; ++W) {
----------------
MaskRay wrote:
> sammccall wrote:
> > MaskRay wrote:
> > > sammccall wrote:
> > > > why this change?
> > > > this has also been moved from the cheaper constructor to the more expensive per-match call. (also the diagonal assignment added in the next loop)
> > > >
> > > > Also, shouldn't [0][0][Match] be AwfulScore?
> > > >
> > > "The more expensive per-match call" is just two value assignments.
> > >
> > > I have removed the expensive table initialization in the constructor.
> > >
> > > [0][0][*] can be any value.
> > > "The more expensive per-match call" is just two value assignments.
> > Oops, sorry - by "more expensive" I mean "called thousands of times more often".
> >
> > > I have removed the expensive table initialization in the constructor.
> > I don't want to be rude, but I asked why you changed this, and you didn't answer. Unless there's a strong reason, I'd prefer to revert this change, as I find this harder to reason about.
> > (Roughly: in the old version of the code, any data that didn't need to change for the life of the object was initialized in the constructor. That way I didn't need to worry what was performance-critical and what wasn't - match() only did what was strictly necessary).
> >
> > > [0][0][*] can be any value.
> > Can you please explain why?
> > Oops, sorry - by "more expensive" I mean "called thousands of times more often".
>
> It is true that `Scores[0][0][Miss] = Scores[0][0][Match] = {0, Miss};` is a cost incurred for each word.
>
> But **it is not a full table initialization**, it is just two variable assignments. We also assign the other values of the first row, `Scores[0][*][*]`, in the following loop. The old code scatters the table construction across **two places**: the constructor and this dynamic programming site.
> > [0][0][*] can be any value.
> Can you please explain why?
`Scores[0][0][*]` is the initial value, which is propagated to all other values in the table.
The difference between any two values in the table is constant, whatever initial value is chosen.
If you ignore the max clamp you use later, the initial value does not matter.
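To illustrate the argument, here is a minimal sketch of a subsequence-alignment table. It is not the code in FuzzyMatch.cpp: `buildTable`, `differsByConstant`, `checkShiftInvariance`, `MatchBonus` and `SkipPenalty` are hypothetical names and values chosen only for this example, and the `Miss`/`Match` dimension is left out. Every reachable cell is derived from `T[0][0]` by adding constants and taking a max, so changing the seed should shift all reachable cells by the same amount and leave every comparison between cells unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical scoring constants; clangd's real heuristics are more involved.
constexpr int MatchBonus = 3;   // reward for matching a pattern character
constexpr int SkipPenalty = -1; // cost of skipping a word character

using Table = std::vector<std::vector<std::optional<int>>>;

// Cell [P][W] holds the best score for matching the first P pattern characters
// as a subsequence of the first W word characters. Unreachable cells stay
// std::nullopt. Init is the seed stored in cell [0][0].
Table buildTable(const std::string &Pat, const std::string &Word, int Init) {
  Table T(Pat.size() + 1,
          std::vector<std::optional<int>>(Word.size() + 1));
  T[0][0] = Init; // the only seed; every other cell is derived from it
  for (std::size_t P = 0; P <= Pat.size(); ++P) {
    for (std::size_t W = 1; W <= Word.size(); ++W) {
      std::optional<int> Best;
      // Option 1: skip Word[W-1].
      if (T[P][W - 1])
        Best = *T[P][W - 1] + SkipPenalty;
      // Option 2: match Pat[P-1] against Word[W-1].
      if (P > 0 && Pat[P - 1] == Word[W - 1] && T[P - 1][W - 1]) {
        int Cand = *T[P - 1][W - 1] + MatchBonus;
        if (!Best || Cand > *Best)
          Best = Cand;
      }
      T[P][W] = Best;
    }
  }
  return T;
}

// True if A and B agree on which cells are reachable and every reachable cell
// of B equals the corresponding cell of A plus Shift.
bool differsByConstant(const Table &A, const Table &B, int Shift) {
  for (std::size_t P = 0; P < A.size(); ++P)
    for (std::size_t W = 0; W < A[P].size(); ++W) {
      if (A[P][W].has_value() != B[P][W].has_value())
        return false;
      if (A[P][W] && *B[P][W] != *A[P][W] + Shift)
        return false;
    }
  return true;
}

// Builds the same table from two different seeds and checks the shift.
void checkShiftInvariance() {
  Table A = buildTable("fb", "foobar", 0);
  Table B = buildTable("fb", "foobar", 100);
  assert(differsByConstant(A, B, 100));
}
```

Calling `checkShiftInvariance()` exercises the claim on one pattern/word pair. The sketch also shows where the caveat about the clamp comes from: a step such as `std::max(Score, 0)` compares against a fixed value rather than against another cell, so it would not be preserved under a shifted seed.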
Repository:
rCTE Clang Tools Extra
https://reviews.llvm.org/D44720