Remove redundant checks added by loop unrolling

Benjamin Poulain benjamin at webkit.org
Sun Dec 7 19:57:44 PST 2014


I looked into this today.

Following your advice, I made something local to loop unrolling (patch 
attached). What this does is look if the previous block already checks 
for a trip count of zero.

This seems a bit hackish, it works for basic for() and while() loops. Is 
there an analysis pass I could use to track properties of values? I 
thought ScalarEvolution could help but it seems to only track that value 
inside the loop.

Benjamin

On 12/2/14, 3:42 PM, Philip Reames wrote:
> Benjamin,
>
> This doesn't really feel like the right approach to the problem. 
> Adding another run of GVN after every unrolling opportunity is a 
> fairly heavy weight hammer.  Have you looked to see if there's a 
> simple change you could make to the unroller itself to handle this 
> case?  I haven't looked at the code, but my suspicion is there would be.
>
> Philip
>
> On 11/29/2014 12:59 PM, Benjamin Poulain wrote:
>> Hi,
>>
>> I noticed the optimized code generated for loops often include 
>> redundant checks that the size is not zero.
>>
>> For example, code like this:
>>   unsigned total = 0;
>>   for (unsigned i = 0; i < size; ++i) { ...
>>
>> when optimized, generate the following prologue:
>>   entry:
>>     %cmp4 = icmp eq i32 %size, 0
>>     br i1 %cmp4, label %for.end, label %for.body.lr.ph
>>
>>   for.body.lr.ph:                                   ; preds = %entry
>>    %0 = add i32 %size, -1
>>     %xtraiter = and i32 %size, 3
>>     %lcmp.mod = icmp ne i32 %xtraiter, 0
>>     %lcmp.overflow = icmp eq i32 %size, 0
>>     %lcmp.or = or i1 %lcmp.overflow, %lcmp.mod
>>     br i1 %lcmp.or, label %for.body.prol, label %for.body.lr.ph.split
>>
>> Notice the redundant test for "icmp eq i32 %size, 0". When compiled 
>> to target, we have one redundant check per loop that can never be true.
>>
>> The extra check for size==0 comes from LoopUnroll. It is never 
>> optimized out because LoopUnrollPass is run after the passes that 
>> could eliminate redundant conditions.
>>
>> I have attached a patch that fixes the problem. For every pass of 
>> LoopUnroll, I added a GVN pass to remove every redundant conditions 
>> and branches. My patch is without test, I need guidance on how to 
>> properly test this.
>>
>> Benjamin
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141207/817a906d/attachment.html>
-------------- next part --------------
diff --git a/lib/Transforms/Utils/LoopUnrollRuntime.cpp b/lib/Transforms/Utils/LoopUnrollRuntime.cpp
index 3d91336..e543d10 100644
--- a/lib/Transforms/Utils/LoopUnrollRuntime.cpp
+++ b/lib/Transforms/Utils/LoopUnrollRuntime.cpp
@@ -127,6 +127,40 @@ static void ConnectProlog(Loop *L, Value *TripCount, unsigned Count,
   InsertPt->eraseFromParent();
 }
 
+static bool IsKnownNonZero(const Value* TestedValue, BasicBlock* Block)
+{
+  BasicBlock *Predecessor = Block->getSinglePredecessor();
+  if (!Predecessor)
+    return false;
+
+  BranchInst *PredecessorBR = cast<BranchInst>(Predecessor->getTerminator());
+  if (!PredecessorBR->isConditional())
+    return false;
+
+  if (ICmpInst *Cond = dyn_cast<ICmpInst>(PredecessorBR->getCondition())) {
+    ConstantInt *Constant = nullptr;
+    Value *CmpValue = nullptr;
+    if ((Constant = dyn_cast<ConstantInt>(Cond->getOperand(0)))) {
+      if (Constant->isZero()) {
+        CmpValue = Cond->getOperand(1);
+      }
+    } else if ((Constant = dyn_cast<ConstantInt>(Cond->getOperand(1)))) {
+      if (Constant->isZero()) {
+        CmpValue = Cond->getOperand(0);
+      }
+    }
+
+    if (CmpValue == TestedValue) {
+      ICmpInst::Predicate Pred = Cond->getPredicate();
+      if (Pred == ICmpInst::ICMP_NE && PredecessorBR->getSuccessor(0) == Block)
+        return true;
+      if (Pred == ICmpInst::ICMP_EQ && PredecessorBR->getSuccessor(1) == Block)
+        return true;
+    }
+  }
+  return false;
+}
+
 /// Create a clone of the blocks in a loop and connect them together.
 /// If UnrollProlog is true, loop structure will not be cloned, otherwise a new
 /// loop will be created including all cloned blocks, and the iterator of it
@@ -336,9 +370,14 @@ bool llvm::UnrollRuntimeLoopProlog(Loop *L, unsigned Count, LoopInfo *LI,
   // Check if for no extra iterations, then jump to cloned/unrolled loop.
   // We have to check that the trip count computation didn't overflow when
   // adding one to the backedge taken count.
-  Value *LCmp = B.CreateIsNotNull(ModVal, "lcmp.mod");
-  Value *OverflowCheck = B.CreateIsNull(TripCount, "lcmp.overflow");
-  Value *BranchVal = B.CreateOr(OverflowCheck, LCmp, "lcmp.or");
+  Value *BranchVal = nullptr;
+  if (IsKnownNonZero(TripCount, PH)) {
+    BranchVal = B.CreateIsNotNull(ModVal, "lcmp.mod");
+  } else {
+    Value *LCmp = B.CreateIsNotNull(ModVal, "lcmp.mod");
+    Value *OverflowCheck = B.CreateIsNull(TripCount, "lcmp.overflow");
+    BranchVal = B.CreateOr(OverflowCheck, LCmp, "lcmp.or");
+  }
 
   // Branch to either the extra iterations or the cloned/unrolled loop
   // We will fix up the true branch label when adding loop body copies


More information about the llvm-commits mailing list