[PATCH] D26527: Use profile info to adjust loop unroll threshold.
Michael Zolotukhin via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 16 14:25:01 PST 2016
mzolotukhin accepted this revision.
mzolotukhin added a comment.
This revision is now accepted and ready to land.
The change looks good to me, thank you! I'm assuming you and Michael will figure out which version of `getLoopEstimatedTripCount` you want to use, other than that I have mostly nitpicky comments below.
BTW, do you have performance testing results for this patch? I'd expect some improvements in code-size and compile-time with these changes.
Michael
================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:757
+ if (L->getHeader()->getParent()->getEntryCount() && TripCount == 0) {
+ auto ProfileTripCount = getLoopEstimatedTripCount(L);
----------------
Please add some comment here.
================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:758-759
+ if (L->getHeader()->getParent()->getEntryCount() && TripCount == 0) {
+ auto ProfileTripCount = getLoopEstimatedTripCount(L);
+ if (ProfileTripCount) {
+ if (*ProfileTripCount < FlatLoopTripCountThreshold)
----------------
`if (auto ProfileTripCount = getLoopEstimatedTripCount(L))` ?
================
Comment at: lib/Transforms/Utils/LoopUtils.cpp:1071
+
+Optional<unsigned> llvm::getLoopEstimatedTripCount(Loop *L) {
+ // Only support loops with a unique exiting block, and a latch.
----------------
This version and the one from D25963 should eventually become the same, right?
================
Comment at: lib/Transforms/Utils/LoopUtils.cpp:1077-1078
+ // Get the branch weights for the the loop's backedge.
+ BranchInst *LatchBR =
+ dyn_cast<BranchInst>(L->getLoopLatch()->getTerminator());
+ if (!LatchBR)
----------------
Probably we also need to check that the latch is exiting (i.e. the branch is conditional).
================
Comment at: test/Transforms/LoopUnroll/unroll-heuristics-pgo.ll:5
+
+; CHECK-LABEL: bar_prof
+; CHECK: loop.prol
----------------
Please add `@` to the name.
================
Comment at: test/Transforms/LoopUnroll/unroll-heuristics-pgo.ll:6
+; CHECK-LABEL: bar_prof
+; CHECK: loop.prol
+define i32 @bar_prof(i32* noalias nocapture readonly %src, i64 %c) !prof !1 {
----------------
Is it enough to just check presence of the prologue? Maybe explicitly check that we have several copies of some instruction?
https://reviews.llvm.org/D26527
More information about the llvm-commits
mailing list