[llvm] 245e607 - [LoopSink] Exit loop finding BBs to sink into early when possible (NFC) (#101115)
via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 30 13:08:59 PDT 2024
Author: Teresa Johnson
Date: 2024-07-30T13:08:56-07:00
New Revision: 245e6070daa191b1fc6ce05d8fc38a74f918159a
URL: https://github.com/llvm/llvm-project/commit/245e6070daa191b1fc6ce05d8fc38a74f918159a
DIFF: https://github.com/llvm/llvm-project/commit/245e6070daa191b1fc6ce05d8fc38a74f918159a.diff
LOG: [LoopSink] Exit loop finding BBs to sink into early when possible (NFC) (#101115)
As noted in the comments, the running time of findBBsToSinkInto is
O(UseBBs.size() * ColdLoopBBs.size()).
A very large function with a huge loop was incurring a high compile time
in this code. The size of the ColdLoopBBs set was over 14K. There is a
limit on the size of the UseBBs set, but not the ColdLoopBBs (and adding
a limit for the latter actually slowed down some later passes).
This change exits the loop early once we detect that there is no further
refinement possible for the BBsToSinkInto set. This is possible because
the ColdLoopBBs set is sorted in ascending magnitude of frequency.
This cut down the LoopSinkPass time by around 33% (78s to just over
50s).
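
The early-exit argument above can be sketched outside of LLVM. The following is a minimal, self-contained model and not the code in LoopSink.cpp: ToyCFG, findSinkBlocks, the EarlyExit flag and the Visited counter are hypothetical names introduced for illustration, blocks are plain integers, dominance is answered by walking an immediate-dominator array, and a plain frequency sum stands in for adjustedSumFreq. Under those assumptions, once a cold block's frequency is at least the summed frequency of the current sink set, no later (hotter) cold block can pass the replacement check, so the scan can stop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Toy stand-in for a CFG (hypothetical, not an LLVM type): block i has
// immediate dominator IDom[i] (-1 for the entry block) and frequency Freq[i].
struct ToyCFG {
  std::vector<int> IDom;
  std::vector<std::uint64_t> Freq;

  // True if block A dominates block B; walks B's dominator chain upward.
  bool dominates(int A, int B) const {
    for (int Cur = B; Cur != -1; Cur = IDom[Cur])
      if (Cur == A)
        return true;
    return false;
  }

  // Plain sum of frequencies; a simplification of adjustedSumFreq.
  std::uint64_t sumFreq(const std::set<int> &BBs) const {
    std::uint64_t Sum = 0;
    for (int BB : BBs)
      Sum += Freq[BB];
    return Sum;
  }
};

// Returns the blocks to sink into. *Visited receives the number of cold
// blocks examined, so runs with and without the early exit can be compared.
std::set<int> findSinkBlocks(const ToyCFG &G, const std::set<int> &UseBBs,
                             const std::vector<int> &ColdBBs, bool EarlyExit,
                             int *Visited) {
  // Precondition: ColdBBs is sorted by ascending frequency.
  assert(std::is_sorted(ColdBBs.begin(), ColdBBs.end(),
                        [&](int A, int B) { return G.Freq[A] < G.Freq[B]; }));
  std::set<int> Sink = UseBBs;
  *Visited = 0;
  for (int Coldest : ColdBBs) {
    ++*Visited;
    // Collect the current sink blocks that Coldest dominates.
    std::set<int> Dominated;
    for (int BB : Sink)
      if (G.dominates(Coldest, BB))
        Dominated.insert(BB);
    if (Dominated.empty())
      continue;
    // Replace the dominated blocks with Coldest when that is cheaper.
    if (G.sumFreq(Dominated) > G.Freq[Coldest]) {
      for (int BB : Dominated)
        Sink.erase(BB);
      Sink.insert(Coldest);
      continue;
    }
    // Later cold blocks are at least this hot and the sink set's sum never
    // grows, so the replacement check above cannot succeed again.
    if (EarlyExit && G.sumFreq(Sink) <= G.Freq[Coldest])
      break;
  }
  return Sink;
}
```

As a usage example, take a toy graph with dominator chain 0 -> 5 -> 4 -> 1 -> {2, 3} and frequencies 100, 60, 40, 10, 8, 8 for those blocks, with uses in blocks 2 and 3. With the early exit enabled the scan stops at block 4 after examining four cold blocks; without it all six are examined. Both runs choose block 1.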
Added:
Modified:
llvm/lib/Transforms/Scalar/LoopSink.cpp
Removed:
################################################################################
diff --git a/llvm/lib/Transforms/Scalar/LoopSink.cpp b/llvm/lib/Transforms/Scalar/LoopSink.cpp
index 6eedf95e7575e..5c6ed8487bbd1 100644
--- a/llvm/lib/Transforms/Scalar/LoopSink.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopSink.cpp
@@ -144,7 +144,23 @@ findBBsToSinkInto(const Loop &L, const SmallPtrSetImpl<BasicBlock *> &UseBBs,
         BBsToSinkInto.erase(DominatedBB);
       }
       BBsToSinkInto.insert(ColdestBB);
+      continue;
     }
+    // Otherwise, see if we can stop the search through the cold BBs early.
+    // Since the ColdLoopBBs list is sorted in increasing magnitude of
+    // frequency the cold BB frequencies can only get larger. The
+    // BBsToSinkInto set can only get smaller and have a smaller
+    // adjustedSumFreq, due to the earlier checking. So once we find a cold BB
+    // with a frequency at least as large as the adjustedSumFreq of the
+    // current BBsToSinkInto set, the earlier frequency check can never be
+    // true for a future iteration. Note we could check this more
+    // aggressively earlier, but in practice this ended up being more
+    // expensive overall (added checking to the critical path through the loop
+    // that often ended up continuing early due to an empty
+    // BBsDominatedByColdestBB set, and the frequency check there was false
+    // most of the time anyway).
+    if (adjustedSumFreq(BBsToSinkInto, BFI) <= BFI.getBlockFreq(ColdestBB))
+      break;
   }

   // Can't sink into blocks that have no valid insertion point.