[all-commits] [llvm/llvm-project] 139067: [SpecialCaseList] Remove TrigramIndex
Ellis Hoag via All-commits
all-commits at lists.llvm.org
Mon Jun 19 10:42:00 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 13906755427611fe2d1b1bf915e87b97ddffd236
https://github.com/llvm/llvm-project/commit/13906755427611fe2d1b1bf915e87b97ddffd236
Author: Ellis Hoag <ellis.sparky.hoag at gmail.com>
Date: 2023-06-19 (Mon, 19 Jun 2023)
Changed paths:
M llvm/include/llvm/Support/SpecialCaseList.h
R llvm/include/llvm/Support/TrigramIndex.h
M llvm/lib/Support/CMakeLists.txt
M llvm/lib/Support/SpecialCaseList.cpp
R llvm/lib/Support/TrigramIndex.cpp
M llvm/unittests/Support/CMakeLists.txt
R llvm/unittests/Support/TrigramIndexTest.cpp
M llvm/utils/gn/secondary/llvm/lib/Support/BUILD.gn
M llvm/utils/gn/secondary/llvm/unittests/Support/BUILD.gn
Log Message:
-----------
[SpecialCaseList] Remove TrigramIndex
`TrigramIndex` was added back in https://reviews.llvm.org/D27188 as an optimization to make `SpecialCaseList::match()` faster. I've found that `TrigramIndex` actually makes the function slower and it has no functional use, so we can remove it.
I grabbed the list of queries passed to `SpecialCaseList::match()` on a random very large file (`AArch64ISelLowering.cpp`) and measured the runtime to call `match()` on all of them with [this line](https://github.com/llvm/llvm-project/blob/8e1f820bb4eadf5c0704818f6063e0db1006e32d/llvm/lib/Support/SpecialCaseList.cpp#L64) disabled and then enabled.
```
$ hyperfine --warmup 3 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests' 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests'
Benchmark 1: GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests
Time (mean ± σ): 575.9 ms ± 20.3 ms [User: 573.1 ms, System: 2.7 ms]
Range (min … max): 555.5 ms … 620.0 ms 10 runs
Benchmark 2: GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests
Time (mean ± σ): 283.4 ms ± 6.7 ms [User: 280.3 ms, System: 3.0 ms]
Range (min … max): 277.0 ms … 294.9 ms 10 runs
Summary
'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests' ran
2.03 ± 0.09 times faster than 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests'
```
Using `perf` I found that most of the runtime in `TrigramIndex::isDefinitelyOut()` comes from a division operation that seems to come from `std::unordered_map`: https://github.com/llvm/llvm-project/blob/8e1f820bb4eadf5c0704818f6063e0db1006e32d/llvm/include/llvm/Support/TrigramIndex.h#L62
Removing `TrigramIndex` will make it easier to potentially switch to using `GlobPattern` instead of a full regex for `SpecialCaseList`. See discussion in https://reviews.llvm.org/D152762 for details.
Reviewed By: MaskRay, #sanitizers, vitalybuka
Differential Revision: https://reviews.llvm.org/D153171
More information about the All-commits
mailing list