[PATCH] D153171: [SpecialCaseList] Remove TrigramIndex

Ellis Hoag via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 16 19:36:36 PDT 2023


ellis created this revision.
Herald added subscribers: arphaman, hiraditya, kristof.beyls.
Herald added a project: All.
ellis edited the summary of this revision.
ellis added reviewers: phosek, MaskRay.
ellis published this revision for review.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

`TrigramIndex` was added back in https://reviews.llvm.org/D27188 as an optimization to make `SpecialCaseList::match()` faster. I've found that `TrigramIndex` actually makes the function slower and it has no functional use, so we can remove it.

I grabbed the list of queries passed to `SpecialCaseList::match()` on a random very large file (`AArch64ISelLowering.cpp`) and measured the runtime to call `match()` on all of them with this line <https://github.com/llvm/llvm-project/blob/8e1f820bb4eadf5c0704818f6063e0db1006e32d/llvm/lib/Support/SpecialCaseList.cpp#L64> disabled and then enabled.

  $ hyperfine --warmup 3 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests' 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests'
  Benchmark 1: GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests
    Time (mean ± σ):     575.9 ms ±  20.3 ms    [User: 573.1 ms, System: 2.7 ms]
    Range (min … max):   555.5 ms … 620.0 ms    10 runs
  
  Benchmark 2: GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests
    Time (mean ± σ):     283.4 ms ±   6.7 ms    [User: 280.3 ms, System: 3.0 ms]
    Range (min … max):   277.0 ms … 294.9 ms    10 runs
  
  Summary
    'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests' ran
      2.03 ± 0.09 times faster than 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests'

Using `perf` I found that most of the runtime in `TrigramIndex::isDefinitelyOut()` comes from a division operation that seems to come from `std::unordered_map`: https://github.com/llvm/llvm-project/blob/8e1f820bb4eadf5c0704818f6063e0db1006e32d/llvm/include/llvm/Support/TrigramIndex.h#L62

Removing `TrigramIndex` will make it easier to potentially switch to using `GlobPattern` instead of a full regex for `SpecialCaseList`. See discussion in https://reviews.llvm.org/D152762 for details.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D153171

Files:
  llvm/include/llvm/Support/SpecialCaseList.h
  llvm/include/llvm/Support/TrigramIndex.h
  llvm/lib/Support/CMakeLists.txt
  llvm/lib/Support/SpecialCaseList.cpp
  llvm/lib/Support/TrigramIndex.cpp
  llvm/unittests/Support/CMakeLists.txt
  llvm/unittests/Support/TrigramIndexTest.cpp
  llvm/utils/gn/secondary/llvm/lib/Support/BUILD.gn
  llvm/utils/gn/secondary/llvm/unittests/Support/BUILD.gn

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D153171.532295.patch
Type: text/x-patch
Size: 14005 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230617/2ee66f14/attachment.bin>


More information about the llvm-commits mailing list