[PATCH] D27188: Use trigrams to speed up SpecialCaseList.

Peter Collingbourne via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 29 22:54:03 PST 2016


pcc added a comment.

Nice!

Do you have before/after performance numbers for a large Chromium TU? Maybe also measure without the blacklist for a baseline.



================
Comment at: lib/Support/Regex.cpp:199
+  // Check for regex metacharacters other than '*' and '.'.
+  return Str.find_first_of(RegexAdvancedMetachars) == StringRef::npos;
+}
----------------
Does this belong in `Regex`? It feels more like an implementation detail of `TrigramIndex`.


================
Comment at: lib/Support/SpecialCaseList.cpp:34
+  std::vector<int> Counts;
+  std::unordered_map<unsigned, std::vector<size_t>> Index;
+
----------------
Maybe use a SmallVector instead of std::vector since the size is bounded?

This code also needs more comments. Please at least leave comments for TrigramIndex and its fields.


================
Comment at: lib/Support/SpecialCaseList.cpp:47
+    std::set<unsigned> Was;
+    size_t I = Counts.size();
+    unsigned Tri = 0;
----------------
Maybe fold this into the use? I found it a little confusing to use a variable named "I" for something other than a loop counter.


================
Comment at: lib/Support/SpecialCaseList.cpp:50
+    int Len = 0;
+    for (size_t J = 0; J < Regex.size(); J++) {
+      if (Regex[J] == '.' || Regex[J] == '*') {
----------------
Maybe use a range for loop here?


================
Comment at: lib/Support/SpecialCaseList.cpp:67
+      Cnt++;
+      if (!Was.count(Tri)) {
+        Index[Tri].push_back(I);
----------------
I wonder how important it is to test `Was` here. It looks like things would still work if you allow trigrams to have multiple identical indices, and as you mention in the comment above it isn't worth worrying about popular trigrams.


================
Comment at: lib/Support/SpecialCaseList.cpp:82
+  bool isDefinitelyOut(StringRef Query) const {
+    if (Defeated) {
+      return false;
----------------
Remove braces


https://reviews.llvm.org/D27188





More information about the llvm-commits mailing list