[PATCH] D50517: [clangd] Generate incomplete trigrams for the Dex index

Kirill Bobyrev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Aug 10 03:17:34 PDT 2018


kbobyrev added a comment.

In https://reviews.llvm.org/D50517#1194990, @ioeric wrote:

> In https://reviews.llvm.org/D50517#1194976, @kbobyrev wrote:
>
> > As discussed offline with @ilya-biryukov, the better approach would be to prefix match the first symbol of each distinct identifier piece instead of prefix matching the whole identifier (i.e. looking only at the first letter of the identifier).
> >
> > Example:
> >
> > - Query: `"u"`
> > - Symbols: `"unique_ptr"`, `"user"`, `"super_user"`
> >
> >   The current implementation would match `"unique_ptr"` and `"user"` only. The proposed implementation would match all three symbols, because the second piece of `"super_user"` starts with `u`.
>
>
> And in the case where users want to match `super_user`, I think it's reasonable to have users type two more characters and match it with `use`.


That would probably yield lower code completion quality for identifiers like `GtkWhatever`, which might be very common in pure C projects and elsewhere. Also, Ilya mentioned that the fuzzy matching filter would significantly increase the score of symbols which can be prefix matched, and hence they would end up at the top if the quality is actually good. Another thing we can do is to boost prefix-matched symbols if your concern is about them being removed after the initial filtering.

I'm personally leaning towards having unigrams for all segment-starting symbols, but if you believe that it's certainly bad I can change that; it will be rather trivial to switch back in the future if we decide to. What do you think?
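
To make the difference concrete, here is a minimal sketch of the segment-start unigram idea. It is not the actual Dex code: the helper names `segmentStartUnigrams` and `matchesUnigramQuery` are hypothetical, and the segmentation rule (a segment starts at the beginning of the identifier, after an underscore, or at a lowercase-to-uppercase transition) is a simplifying assumption.

```cpp
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper (not part of clangd): collects the lowercased first
// character of every identifier segment. Assumed segmentation rule: a segment
// starts at the beginning of the identifier, after '_', or where an uppercase
// letter follows a lowercase one.
std::set<char> segmentStartUnigrams(const std::string &Identifier) {
  std::set<char> Result;
  bool AtSegmentStart = true;
  char Prev = '\0';
  for (char C : Identifier) {
    if (C == '_') {
      // Underscores only separate segments; they are never indexed.
      AtSegmentStart = true;
      Prev = C;
      continue;
    }
    const unsigned char UC = static_cast<unsigned char>(C);
    if (std::isupper(UC) && std::islower(static_cast<unsigned char>(Prev)))
      AtSegmentStart = true;
    if (AtSegmentStart) {
      Result.insert(static_cast<char>(std::tolower(UC)));
      AtSegmentStart = false;
    }
    Prev = C;
  }
  return Result;
}

// Hypothetical helper: a one-character query matches an identifier if it
// equals the first character of any segment (case-insensitively).
bool matchesUnigramQuery(const std::string &Identifier, char Query) {
  const char Lowered = static_cast<char>(
      std::tolower(static_cast<unsigned char>(Query)));
  return segmentStartUnigrams(Identifier).count(Lowered) > 0;
}
```

Under these assumptions, the query `u` matches `unique_ptr`, `user` and `super_user`, and the query `w` matches `GtkWhatever`. Matching against the first letter of the whole identifier would miss the last two cases.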


https://reviews.llvm.org/D50517




