[llvm-commits] [llvm] r61178 - /llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp

Dale Johannesen dalej at apple.com
Wed Dec 17 16:57:23 PST 2008


Author: johannes
Date: Wed Dec 17 18:57:22 2008
New Revision: 61178

URL: http://llvm.org/viewvc/llvm-project?rev=61178&view=rev
Log:
Fix the time regression I introduced in 464.h264ref with
my last patch to this file.

The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop.  This was extra bad
because register pressure later forced both base and IV into
memory.  Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this.  However,
there were side effects....

It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before.  (This patch does not handle 
all the cases where this can happen.)  And when inserting 
new code that feeds into a PHI, it's right to put such 
code at the original location rather than in the PHI's 
immediate predecessor(s) when the original location is outside 
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.

Everything above is exercised in
CodeGen/X86/lsr-negative-stride.ll (and ifcvt4 in ARM which is
the same IR).


Modified:
    llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp

Modified: llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp?rev=61178&r1=61177&r2=61178&view=diff

==============================================================================
--- llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp (original)
+++ llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp Wed Dec 17 18:57:22 2008
@@ -508,14 +508,22 @@
     if (isa<PHINode>(User) && Processed.count(User))
       continue;
 
-    // If this is an instruction defined in a nested loop, or outside this loop,
-    // don't recurse into it.
+    // Descend recursively, but not into PHI nodes outside the current loop.
+    // It's important to see the entire expression outside the loop to get
+    // choices that depend on addressing mode use right, although we won't
+    // consider references ouside the loop in all cases.
+    // If User is already in Processed, we don't want to recurse into it again,
+    // but do want to record a second reference in the same instruction.
     bool AddUserToIVUsers = false;
     if (LI->getLoopFor(User->getParent()) != L) {
-      DOUT << "FOUND USER in other loop: " << *User
-           << "   OF SCEV: " << *ISE << "\n";
-      AddUserToIVUsers = true;
-    } else if (!AddUsersIfInteresting(User, L, Processed)) {
+      if (isa<PHINode>(User) || Processed.count(User) ||
+          !AddUsersIfInteresting(User, L, Processed)) {
+        DOUT << "FOUND USER in other loop: " << *User
+             << "   OF SCEV: " << *ISE << "\n";
+        AddUserToIVUsers = true;
+      }
+    } else if (Processed.count(User) || 
+               !AddUsersIfInteresting(User, L, Processed)) {
       DOUT << "FOUND USER: " << *User
            << "   OF SCEV: " << *ISE << "\n";
       AddUserToIVUsers = true;
@@ -704,34 +712,45 @@
   PHINode *PN = cast<PHINode>(Inst);
   for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) {
     if (PN->getIncomingValue(i) == OperandValToReplace) {
-      // If this is a critical edge, split the edge so that we do not insert the
-      // code on all predecessor/successor paths.  We do this unless this is the
-      // canonical backedge for this loop, as this can make some inserted code
-      // be in an illegal position.
-      BasicBlock *PHIPred = PN->getIncomingBlock(i);
-      if (e != 1 && PHIPred->getTerminator()->getNumSuccessors() > 1 &&
-          (PN->getParent() != L->getHeader() || !L->contains(PHIPred))) {
-        
-        // First step, split the critical edge.
-        SplitCriticalEdge(PHIPred, PN->getParent(), P, false);
-            
-        // Next step: move the basic block.  In particular, if the PHI node
-        // is outside of the loop, and PredTI is in the loop, we want to
-        // move the block to be immediately before the PHI block, not
-        // immediately after PredTI.
-        if (L->contains(PHIPred) && !L->contains(PN->getParent())) {
-          BasicBlock *NewBB = PN->getIncomingBlock(i);
-          NewBB->moveBefore(PN->getParent());
+      // If the original expression is outside the loop, put the replacement
+      // code in the same place as the original expression,
+      // which need not be an immediate predecessor of this PHI.  This way we 
+      // need only one copy of it even if it is referenced multiple times in
+      // the PHI.  We don't do this when the original expression is inside the
+      // loop because multiple copies sometimes do useful sinking of code in that
+      // case(?).
+      Instruction *OldLoc = dyn_cast<Instruction>(OperandValToReplace);
+      if (L->contains(OldLoc->getParent())) {
+        // If this is a critical edge, split the edge so that we do not insert the
+        // code on all predecessor/successor paths.  We do this unless this is the
+        // canonical backedge for this loop, as this can make some inserted code
+        // be in an illegal position.
+        BasicBlock *PHIPred = PN->getIncomingBlock(i);
+        if (e != 1 && PHIPred->getTerminator()->getNumSuccessors() > 1 &&
+            (PN->getParent() != L->getHeader() || !L->contains(PHIPred))) {
+
+          // First step, split the critical edge.
+          SplitCriticalEdge(PHIPred, PN->getParent(), P, false);
+
+          // Next step: move the basic block.  In particular, if the PHI node
+          // is outside of the loop, and PredTI is in the loop, we want to
+          // move the block to be immediately before the PHI block, not
+          // immediately after PredTI.
+          if (L->contains(PHIPred) && !L->contains(PN->getParent())) {
+            BasicBlock *NewBB = PN->getIncomingBlock(i);
+            NewBB->moveBefore(PN->getParent());
+          }
+
+          // Splitting the edge can reduce the number of PHI entries we have.
+          e = PN->getNumIncomingValues();
         }
-        
-        // Splitting the edge can reduce the number of PHI entries we have.
-        e = PN->getNumIncomingValues();
       }
-
       Value *&Code = InsertedCode[PN->getIncomingBlock(i)];
       if (!Code) {
         // Insert the code into the end of the predecessor block.
-        Instruction *InsertPt = PN->getIncomingBlock(i)->getTerminator();
+        Instruction *InsertPt = (L->contains(OldLoc->getParent())) ?
+                                PN->getIncomingBlock(i)->getTerminator() :
+                                OldLoc->getParent()->getTerminator();
         Code = InsertCodeForBaseAtPosition(NewBase, Rewriter, InsertPt, L);
 
         // Adjust the type back to match the PHI. Note that we can't use
@@ -1168,6 +1187,10 @@
 /// mode scale component and optional base reg. This allows the users of
 /// this stride to be rewritten as prev iv * factor. It returns 0 if no
 /// reuse is possible.  Factors can be negative on same targets, e.g. ARM.
+///
+/// If all uses are outside the loop, we don't require that all multiplies
+/// be folded into the addressing mode; a multiply (executed once) outside
+/// the loop is better than another IV within.  Well, usually.
 int64_t LoopStrengthReduce::CheckForIVReuse(bool HasBaseReg,
                                 bool AllUsesAreAddresses,
                                 bool AllUsesAreOutsideLoop,
@@ -1205,6 +1228,30 @@
             return Scale;
           }
     }
+  } else if (AllUsesAreOutsideLoop) {
+    // Accept nonconstant strides here; it is really really right to substitute
+    // an existing IV if we can.
+    // Special case, old IV is -1*x and this one is x.  Can treat this one as
+    // -1*old.
+    for (unsigned NewStride = 0, e = StrideOrder.size(); NewStride != e;
+         ++NewStride) {
+      std::map<SCEVHandle, IVsOfOneStride>::iterator SI = 
+                IVsByStride.find(StrideOrder[NewStride]);
+      if (SI == IVsByStride.end()) 
+        continue;
+      if (SCEVMulExpr *ME = dyn_cast<SCEVMulExpr>(SI->first))
+        if (SCEVConstant *SC = dyn_cast<SCEVConstant>(ME->getOperand(0)))
+          if (Stride == ME->getOperand(1) &&
+              SC->getValue()->getSExtValue() == -1LL)
+            for (std::vector<IVExpr>::iterator II = SI->second.IVs.begin(),
+                   IE = SI->second.IVs.end(); II != IE; ++II)
+              // Accept nonzero base here.
+              // Only reuse previous IV if it would not require a type conversion.
+              if (!RequiresTypeConversion(II->Base->getType(), Ty)) {
+                IV = *II;
+                return -1;
+              }
+    }
   }
   return 0;
 }
@@ -1549,6 +1596,18 @@
             !cast<ConstantInt>(CommonBaseV)->isZero())
           RewriteExpr = SE->getAddExpr(RewriteExpr,
                                        SE->getUnknown(CommonBaseV));
+        // If we're reusing an IV with a nonzero base (currently this happens
+        // only when all reuses are outside the loop) subtract out that base here
+        // This is the reverse of the above; the base HAS been used to initialize
+        // the PHI node but we don't want it here.
+        // (If the RewriteFactor is negative, we're effectively negating the
+        // old IV in this use, so we add the base instead of subtract.)
+        if (!ReuseIV.Base->isZero()) {
+          if (RewriteFactor < 0)
+            RewriteExpr = SE->getAddExpr(RewriteExpr, ReuseIV.Base);
+          else
+            RewriteExpr = SE->getMinusSCEV(RewriteExpr, ReuseIV.Base);
+        }
       }
 
       // Now that we know what we need to do, insert code before User for the





More information about the llvm-commits mailing list