[llvm] 0c6784c - [MemDep] Optimize SortNonLocalDepInfoCache sorting strategy for large caches with few unsorted entries (#143107)
via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 25 07:45:04 PDT 2025
Author: DingdWang
Date: 2025-07-25T16:45:01+02:00
New Revision: 0c6784c9514d0ddb257bf0fd797969e0ae602882
URL: https://github.com/llvm/llvm-project/commit/0c6784c9514d0ddb257bf0fd797969e0ae602882
DIFF: https://github.com/llvm/llvm-project/commit/0c6784c9514d0ddb257bf0fd797969e0ae602882.diff
LOG: [MemDep] Optimize SortNonLocalDepInfoCache sorting strategy for large caches with few unsorted entries (#143107)
During compilation of large files with many branches, I observed that
the function `SortNonLocalDepInfoCache` in `MemoryDependenceAnalysis`
becomes a significant performance bottleneck. This is because
`Cache.size()` can be very large (around 20,000), but only a small
number of entries (approximately 5 to 8) actually need sorting. The
original implementation inserts in place only when one or two entries
are unsorted and otherwise performs a full sort, which is inefficient
for a cache of this size.
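This patch introduces a lightweight heuristic: it compares the number of
unsorted entries against `Log2_32(Cache.size())` and, when the count is
smaller, inserts each unsorted entry at the position found by binary
search instead of running a full sort.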
As a result, the GVN pass runtime on a large file is reduced from
approximately 26.3 minutes to 16.5 minutes.
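The selection logic described above can be sketched outside of LLVM as follows. This is a minimal illustration, not the committed code: it assumes a plain `std::vector<int>` in place of `NonLocalDepInfo`, and the names `floorLog2` and `sortPartiallySorted` are hypothetical (`floorLog2` stands in for `llvm::Log2_32`).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for llvm::Log2_32: floor(log2(Value)) for Value > 0.
static unsigned floorLog2(std::size_t Value) {
  unsigned Result = 0;
  while (Value >>= 1)
    ++Result;
  return Result;
}

// Sorts Cache, assuming its first NumSortedEntries elements are already
// sorted and only the tail is out of order.
static void sortPartiallySorted(std::vector<int> &Cache,
                                std::size_t NumSortedEntries) {
  assert(NumSortedEntries <= Cache.size() && "sorted prefix exceeds cache");

  // Zero or one element is trivially sorted.
  if (Cache.size() < 2)
    return;

  std::size_t NumUnsorted = Cache.size() - NumSortedEntries;
  if (NumUnsorted == 0)
    return;

  // No sorted prefix to exploit: sort everything.
  if (NumSortedEntries == 0) {
    std::sort(Cache.begin(), Cache.end());
    return;
  }

  if (NumUnsorted < floorLog2(Cache.size())) {
    // Few unsorted entries: take the last one, binary-search the sorted
    // prefix, and insert it there. The remaining unsorted entries stay at
    // the tail, shifted right by one.
    while (NumUnsorted > 0) {
      int Val = Cache.back();
      Cache.pop_back();
      auto SortedEnd = Cache.end() - (NumUnsorted - 1);
      auto Pos = std::upper_bound(Cache.begin(), SortedEnd, Val);
      Cache.insert(Pos, Val);
      --NumUnsorted;
    }
  } else {
    // Many unsorted entries: a full sort is the simpler choice.
    std::sort(Cache.begin(), Cache.end());
  }
}
```

Each insertion still shifts up to n elements, so k insertions cost O(k * n) element moves plus O(k * log n) comparisons, whereas a full sort costs O(n log n) comparisons. Requiring k to be smaller than log2(n) keeps the insertion path within that bound.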
Added:
Modified:
llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
Removed:
################################################################################
diff --git a/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp b/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
index 3aa9909df8e55..2b0f212bff01a 100644
--- a/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/MemoryDependenceAnalysis.cpp
@@ -983,33 +983,37 @@ MemDepResult MemoryDependenceResults::getNonLocalInfoForBlock(
 static void
 SortNonLocalDepInfoCache(MemoryDependenceResults::NonLocalDepInfo &Cache,
                          unsigned NumSortedEntries) {
-  switch (Cache.size() - NumSortedEntries) {
-  case 0:
-    // done, no new entries.
-    break;
-  case 2: {
-    // Two new entries, insert the last one into place.
-    NonLocalDepEntry Val = Cache.back();
-    Cache.pop_back();
-    MemoryDependenceResults::NonLocalDepInfo::iterator Entry =
-        std::upper_bound(Cache.begin(), Cache.end() - 1, Val);
-    Cache.insert(Entry, Val);
-    [[fallthrough]];
+
+  // If only one entry, don't sort.
+  if (Cache.size() < 2)
+    return;
+
+  unsigned s = Cache.size() - NumSortedEntries;
+
+  // If the cache is already sorted, don't sort it again.
+  if (s == 0)
+    return;
+
+  // If no entry is sorted, sort the whole cache.
+  if (NumSortedEntries == 0) {
+    llvm::sort(Cache);
+    return;
   }
-  case 1:
-    // One new entry, Just insert the new value at the appropriate position.
-    if (Cache.size() != 1) {
+
+  // If the number of unsorted entires is small and the cache size is big, using
+  // insertion sort is faster. Here use Log2_32 to quickly choose the sort
+  // method.
+  if (s < Log2_32(Cache.size())) {
+    while (s > 0) {
       NonLocalDepEntry Val = Cache.back();
       Cache.pop_back();
       MemoryDependenceResults::NonLocalDepInfo::iterator Entry =
-          llvm::upper_bound(Cache, Val);
+          std::upper_bound(Cache.begin(), Cache.end() - s + 1, Val);
       Cache.insert(Entry, Val);
+      s--;
     }
-    break;
-  default:
-    // Added many values, do a full scale sort.
+  } else {
     llvm::sort(Cache);
-    break;
   }
 }