[PATCH] D113995: [clangd] Dex Trigrams: Improve query trigram generation

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Nov 29 15:47:50 PST 2021

sammccall added a comment.

Basically LG now, some of the changes are a bit mysterious to me though

Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:47
+  const size_t NPOS = std::numeric_limits<size_t>::max();
+  llvm::SmallVector<std::pair<size_t, size_t>> Next(LowercaseIdentifier.size());
+  size_t NextTail = NPOS, NextHead = NPOS;
why the change from array to pair and 0 to NPOS?

Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:82
+  //
+  // - First separator + first head
+  // - First head
Fist -> first
You're missing first separator only

I think examples might be clearer than a description of these.

Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:102
+                  LowercaseIdentifier[NextHead]));
+    Position = NextHead;
+  }
(if the change to NPOS is just to avoid hitting Position == 0 on the first iteration, you could check it here instead)

Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:148
+  // For queries with very few letters, emulate what generateIdentifierTrigrams
+  // outputs for the beginning of the Identifier.
Since the query trigram corresponds so closely to the query for short queries, I think it makes more sense to consider it the other way: generateIdentifierTrigrams is deciding which queries it wants to match and emulating this function.

Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:151
+  if (UniqueTrigrams.empty()) {
+    // If a bigram can't be formed, we might prepend a separator.
+    std::string Result(1, LowercaseQuery.front());
nit: by prepending a separator, we form a bigram!
If a bigram can't be formed out of letters/numbers...

  rG LLVM Github Monorepo



More information about the cfe-commits mailing list