[PATCH] D50700: [clangd] Generate better incomplete bigrams for the Dex index

Eric Liu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Aug 16 06:50:08 PDT 2018


ioeric added inline comments.


================
Comment at: clang-tools-extra/unittests/clangd/DexIndexTests.cpp:324
 
+  EXPECT_THAT(generateQueryTrigrams("u_p"), trigramsAre({"up$"}));
+  EXPECT_THAT(generateQueryTrigrams("_u_p"), trigramsAre({"_u_"}));
----------------
kbobyrev wrote:
> ioeric wrote:
> > I'm not sure if this is correct. If users have explicitly typed `_`, they are likely to want a `_` there. You mentioned in the patch summary that users might want to match two heads with this. Could you provide an example?
> The particular example I had on my mind was `unique_ptr`. Effectively, if the query is `SC` then `StrCat` would be matched (because the incomplete trigram would be `sc$` and two heads from the symbol identifier would also yield `sc$`), but for `u_p`, `unique_ptr` is currently not matched.
> 
> On the other hand, if there's something like `m_field` and user types `mf` or `m_f` it will be matched in both cases, because `m_field` yields `mf$` in the index build stage, so this change doesn't decrease code completion quality for such cases.
The problem is that `u_p` can now match `upXXX` with `up$`, which might be a bit surprising for users. 

It seems to me that another way way to handle this is to generate `u_p` trigram for `unique_ptr`. Have we considered including separators like `_` in trigrams? 


https://reviews.llvm.org/D50700





More information about the cfe-commits mailing list