Remove redundant checks added by loop unrolling
Benjamin Poulain
benjamin at webkit.org
Sun Dec 7 19:57:44 PST 2014
I looked into this today.
Following your advice, I made something local to loop unrolling (patch
attached). What this does is look if the previous block already checks
for a trip count of zero.
This seems a bit hackish, it works for basic for() and while() loops. Is
there an analysis pass I could use to track properties of values? I
thought ScalarEvolution could help but it seems to only track that value
inside the loop.
Benjamin
On 12/2/14, 3:42 PM, Philip Reames wrote:
> Benjamin,
>
> This doesn't really feel like the right approach to the problem.
> Adding another run of GVN after every unrolling opportunity is a
> fairly heavy weight hammer. Have you looked to see if there's a
> simple change you could make to the unroller itself to handle this
> case? I haven't looked at the code, but my suspicion is there would be.
>
> Philip
>
> On 11/29/2014 12:59 PM, Benjamin Poulain wrote:
>> Hi,
>>
>> I noticed the optimized code generated for loops often include
>> redundant checks that the size is not zero.
>>
>> For example, code like this:
>> unsigned total = 0;
>> for (unsigned i = 0; i < size; ++i) { ...
>>
>> when optimized, generate the following prologue:
>> entry:
>> %cmp4 = icmp eq i32 %size, 0
>> br i1 %cmp4, label %for.end, label %for.body.lr.ph
>>
>> for.body.lr.ph: ; preds = %entry
>> %0 = add i32 %size, -1
>> %xtraiter = and i32 %size, 3
>> %lcmp.mod = icmp ne i32 %xtraiter, 0
>> %lcmp.overflow = icmp eq i32 %size, 0
>> %lcmp.or = or i1 %lcmp.overflow, %lcmp.mod
>> br i1 %lcmp.or, label %for.body.prol, label %for.body.lr.ph.split
>>
>> Notice the redundant test for "icmp eq i32 %size, 0". When compiled
>> to target, we have one redundant check per loop that can never be true.
>>
>> The extra check for size==0 comes from LoopUnroll. It is never
>> optimized out because LoopUnrollPass is run after the passes that
>> could eliminate redundant conditions.
>>
>> I have attached a patch that fixes the problem. For every pass of
>> LoopUnroll, I added a GVN pass to remove every redundant conditions
>> and branches. My patch is without test, I need guidance on how to
>> properly test this.
>>
>> Benjamin
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141207/817a906d/attachment.html>
-------------- next part --------------
diff --git a/lib/Transforms/Utils/LoopUnrollRuntime.cpp b/lib/Transforms/Utils/LoopUnrollRuntime.cpp
index 3d91336..e543d10 100644
--- a/lib/Transforms/Utils/LoopUnrollRuntime.cpp
+++ b/lib/Transforms/Utils/LoopUnrollRuntime.cpp
@@ -127,6 +127,40 @@ static void ConnectProlog(Loop *L, Value *TripCount, unsigned Count,
InsertPt->eraseFromParent();
}
+static bool IsKnownNonZero(const Value* TestedValue, BasicBlock* Block)
+{
+ BasicBlock *Predecessor = Block->getSinglePredecessor();
+ if (!Predecessor)
+ return false;
+
+ BranchInst *PredecessorBR = cast<BranchInst>(Predecessor->getTerminator());
+ if (!PredecessorBR->isConditional())
+ return false;
+
+ if (ICmpInst *Cond = dyn_cast<ICmpInst>(PredecessorBR->getCondition())) {
+ ConstantInt *Constant = nullptr;
+ Value *CmpValue = nullptr;
+ if ((Constant = dyn_cast<ConstantInt>(Cond->getOperand(0)))) {
+ if (Constant->isZero()) {
+ CmpValue = Cond->getOperand(1);
+ }
+ } else if ((Constant = dyn_cast<ConstantInt>(Cond->getOperand(1)))) {
+ if (Constant->isZero()) {
+ CmpValue = Cond->getOperand(0);
+ }
+ }
+
+ if (CmpValue == TestedValue) {
+ ICmpInst::Predicate Pred = Cond->getPredicate();
+ if (Pred == ICmpInst::ICMP_NE && PredecessorBR->getSuccessor(0) == Block)
+ return true;
+ if (Pred == ICmpInst::ICMP_EQ && PredecessorBR->getSuccessor(1) == Block)
+ return true;
+ }
+ }
+ return false;
+}
+
/// Create a clone of the blocks in a loop and connect them together.
/// If UnrollProlog is true, loop structure will not be cloned, otherwise a new
/// loop will be created including all cloned blocks, and the iterator of it
@@ -336,9 +370,14 @@ bool llvm::UnrollRuntimeLoopProlog(Loop *L, unsigned Count, LoopInfo *LI,
// Check if for no extra iterations, then jump to cloned/unrolled loop.
// We have to check that the trip count computation didn't overflow when
// adding one to the backedge taken count.
- Value *LCmp = B.CreateIsNotNull(ModVal, "lcmp.mod");
- Value *OverflowCheck = B.CreateIsNull(TripCount, "lcmp.overflow");
- Value *BranchVal = B.CreateOr(OverflowCheck, LCmp, "lcmp.or");
+ Value *BranchVal = nullptr;
+ if (IsKnownNonZero(TripCount, PH)) {
+ BranchVal = B.CreateIsNotNull(ModVal, "lcmp.mod");
+ } else {
+ Value *LCmp = B.CreateIsNotNull(ModVal, "lcmp.mod");
+ Value *OverflowCheck = B.CreateIsNull(TripCount, "lcmp.overflow");
+ BranchVal = B.CreateOr(OverflowCheck, LCmp, "lcmp.or");
+ }
// Branch to either the extra iterations or the cloned/unrolled loop
// We will fix up the true branch label when adding loop body copies
More information about the llvm-commits
mailing list