[llvm] [clang] [clang-tools-extra] [LoopVectorize] Refine runtime memory check costs when there is an outer loop (PR #76034)
Rin Dobrescu via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 18 04:11:21 PST 2024
================
@@ -2076,16 +2081,61 @@ class GeneratedRTChecks {
         LLVM_DEBUG(dbgs() << " " << C << " for " << I << "\n");
         RTCheckCost += C;
       }
-    if (MemCheckBlock)
+    if (MemCheckBlock) {
+      InstructionCost MemCheckCost = 0;
       for (Instruction &I : *MemCheckBlock) {
         if (MemCheckBlock->getTerminator() == &I)
           continue;
         InstructionCost C =
             TTI->getInstructionCost(&I, TTI::TCK_RecipThroughput);
         LLVM_DEBUG(dbgs() << " " << C << " for " << I << "\n");
-        RTCheckCost += C;
+        MemCheckCost += C;
       }
+      // If the runtime memory checks are being created inside an outer loop
+      // we should find out if these checks are outer loop invariant. If so,
+      // the checks will likely be hoisted out and so the effective cost will
+      // reduce according to the outer loop trip count.
+      if (OuterLoop) {
+        ScalarEvolution *SE = MemCheckExp.getSE();
+        // TODO: We could refine this further by analysing every individual
+        // memory check, since there could be a mixture of loop variant and
+        // invariant checks that mean the final condition is variant. However,
+        // I think it would need further analysis to prove this is beneficial.
+        const SCEV *Cond = SE->getSCEV(MemRuntimeCheckCond);
+        if (SE->isLoopInvariant(Cond, OuterLoop)) {
+          // It seems reasonable to assume that we can reduce the effective
+          // cost of the checks even when we know nothing about the trip
+          // count. Here I've assumed that the outer loop executes at least
+          // twice.
+          unsigned BestTripCount = 2;
+
+          // If exact trip count is known use that.
+          if (unsigned SmallTC = SE->getSmallConstantTripCount(OuterLoop))
+            BestTripCount = SmallTC;
+          else if (LoopVectorizeWithBlockFrequency) {
+            // Else use profile data if available.
+            if (auto EstimatedTC = getLoopEstimatedTripCount(OuterLoop))
+              BestTripCount = *EstimatedTC;
+          }
+
+          InstructionCost NewMemCheckCost = MemCheckCost / BestTripCount;
+
+          // Let's ensure the cost is always at least 1.
+          NewMemCheckCost = std::max(*NewMemCheckCost.getValue(), (long)1);
----------------
Rin18 wrote:
There's a buildbot failure at this line. Has that been fixed? It might be worth re-triggering the build to check.
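
The failing log isn't quoted here, so the cause is a guess. One possibility is that `std::max` cannot deduce a single argument type when the value returned by `getValue()` and `(long)1` differ, which could happen on platforms where the 64-bit cost type is not `long`. The standalone sketch below shows the divide-then-clamp step with both arguments cast to the same type. `CostType` and `scaledMemCheckCost` are hypothetical names used only for illustration, and the assumption that the cost is an `int64_t` is mine, not something stated in the patch.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the cost value type. Assumed to be a 64-bit
// signed integer, which is `long` on some platforms and `long long` on others.
using CostType = std::int64_t;

// Hypothetical helper mirroring the last two steps of the hunk: divide the
// memory check cost by the outer loop trip count, then clamp to at least 1.
// Precondition (assumed): BestTripCount is non-zero.
CostType scaledMemCheckCost(CostType MemCheckCost, unsigned BestTripCount) {
  CostType Scaled = MemCheckCost / static_cast<CostType>(BestTripCount);
  // Both arguments have type CostType, so std::max deduces one type
  // whether int64_t is an alias for long or for long long.
  return std::max(Scaled, static_cast<CostType>(1));
}
```

With this shape, a cost of 10 and a trip count of 2 gives 5, while a cost of 3 and a trip count of 8 is clamped up to 1.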
https://github.com/llvm/llvm-project/pull/76034