[PATCH] D16838: [LoopUnrolling] Try harder to avoid rebuilding LCSSA when possible.

Mikhail Zolotukhin via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 3 14:42:38 PST 2016


> On Feb 3, 2016, at 1:32 PM, Michael Zolotukhin via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> 
> Hi Justin,
> 
> Thanks for taking a look, please find my comments below. I’ll update the patch shortly!
Hi Justin,

I've just updated the patch, please take a look! Some comments below as well:

> 
> Michael
>> On Feb 3, 2016, at 11:56 AM, Justin Bogner <mail at justinbogner.com <mailto:mail at justinbogner.com>> wrote:
>> 
>> Michael Zolotukhin <mzolotukhin at apple.com <mailto:mzolotukhin at apple.com>> writes:
>>> mzolotukhin created this revision.
>>> mzolotukhin added reviewers: chandlerc, hfinkel, dexonsmith, bogner, joker.eph.
>>> mzolotukhin added a subscriber: llvm-commits.
>>> 
>>> In r255133 (reapplied r253126) we started to avoid redundant
>>> recomputation of LCSSA after loop-unrolling. This patch moves one step
>>> further in this direction - now we can avoid it for much wider range
>>> of loops, as we start to look at IR and try to figure out if the
>>> transformation actually breaks LCSSA phis or makes it necessary to
>>> insert new ones.
>>> 
>>> In future we might go even further and try to fix LCSSA in-place
>>> rather than rebuilding it, but I'm not quite sure yet that it's always
>>> possible (and computationally cheaper). Anyway, this patch seems to be
>>> aligned with that direction.
>>> 
>>> One of the most important use-cases that previous implementation
>>> didn't handle is loops with calls that might throw an exception. Such
>>> loops have exits from entire loop nest, but we still don't need to
>>> recompute LCSSA after unrolling, as such exits usually don't contain
>>> LCSSA phis.
>>> 
>>> The patch was tested on LNT testsuite + SPECS, neither failures nor
>>> significant compile time changes were spotted.
>>> 
>>> http://reviews.llvm.org/D16838 <http://reviews.llvm.org/D16838>
>>> 
>>> Files:
>>>  lib/Transforms/Utils/LoopUnroll.cpp
>>> 
>>> Index: lib/Transforms/Utils/LoopUnroll.cpp
>>> ===================================================================
>>> --- lib/Transforms/Utils/LoopUnroll.cpp
>>> +++ lib/Transforms/Utils/LoopUnroll.cpp
>>> @@ -218,10 +218,16 @@
>>>   bool CompletelyUnroll = Count == TripCount;
>>>   SmallVector<BasicBlock *, 4> ExitBlocks;
>>>   L->getExitBlocks(ExitBlocks);
>>> -  Loop *ParentL = L->getParentLoop();
>>> -  bool AllExitsAreInsideParentLoop = !ParentL ||
>>> -      std::all_of(ExitBlocks.begin(), ExitBlocks.end(),
>>> -                  [&](BasicBlock *BB) { return ParentL->contains(BB); });
>>> +
>>> +  // Go through all exits of L and see if there are any phi-nodes there. We just
>>> +  // conservatively assume that they're inserted to preserve LCSSA form, which
>>> +  // means that complete unrolling might break this form. We need to either fix
>>> +  // it in-place after the transformation, or entirely rebuild LCSSA. TODO: For
>>> +  // now we just recompute LCSSA for the outer loop, but it should be possible
>>> +  // to fix it in-place.
>>> +  bool NeedToFixLCSSA = CompletelyUnroll &&
>>> +      std::any_of(ExitBlocks.begin(), ExitBlocks.end(),
>>> +                  [&](BasicBlock *BB) { return isa<PHINode>(BB->begin()); });
>>> 
>>>   // We assume a run-time trip count if the compiler cannot
>>>   // figure out the loop trip count and the unroll-runtime
>>> @@ -308,6 +314,7 @@
>>>   LoopBlocksDFS::RPOIterator BlockBegin = DFS.beginRPO();
>>>   LoopBlocksDFS::RPOIterator BlockEnd = DFS.endRPO();
>>> 
>>> +  std::vector<BasicBlock*> UnrolledLoopBlocks = L->getBlocks();
>>>   for (unsigned It = 1; It != Count; ++It) {
>>>     std::vector<BasicBlock*> NewBlocks;
>>>     SmallDenseMap<const Loop *, Loop *, 4> NewLoops;
>>> @@ -387,6 +394,7 @@
>>>         Latches.push_back(New);
>>> 
>>>       NewBlocks.push_back(New);
>>> +      UnrolledLoopBlocks.push_back(New);
>>>     }
>>> 
>>>     // Remap all instructions in the most recent iteration
>>> @@ -476,8 +484,12 @@
>>>     if (Term->isUnconditional()) {
>>>       BasicBlock *Dest = Term->getSuccessor(0);
>>>       if (BasicBlock *Fold = FoldBlockIntoPredecessor(Dest, LI, SE,
>>> -                                                      ForgottenLoops))
>>> +                                                      ForgottenLoops)) {
>>>         std::replace(Latches.begin(), Latches.end(), Dest, Fold);
>>> +        UnrolledLoopBlocks.erase(std::remove(UnrolledLoopBlocks.begin(),
>>> +                                             UnrolledLoopBlocks.end(), Dest),
>>> +                                 UnrolledLoopBlocks.end());
>>> +      }
>>>     }
>>>   }
>>> 
>>> @@ -530,15 +542,38 @@
>>>   if (CompletelyUnroll)
>>>     LI->markAsRemoved(L);
>>> 
>>> +  // After complete unrolling most of the blocks should be contained in OuterL.
>>> +  // However, some of them might happen to be out of OuterL (e.g. if they
>>> +  // precede a loop exit). In this case we might need to insert PHI nodes in
>>> +  // order to preserve LCSSA form.
>>> +  // We don't need to check this if we already know that we need to fix LCSSA
>>> +  // form.
>>> +  // TODO: For now we just recompute LCSSA for the outer loop in this case, but
>>> +  // it should be possible to fix it in-place.
>>> +  if (OuterL && CompletelyUnroll && !NeedToFixLCSSA) {
>>> +    for (unsigned i = 0; i < UnrolledLoopBlocks.size(); ++i) {
>>> +      BasicBlock *BB = UnrolledLoopBlocks[i];
>>> +      if (LI->getLoopFor(BB) == OuterL)
>>> +        continue;
>>> +      for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ++I) {
>>> +        for (unsigned OpI = 0, OpE = I->getNumOperands(); OpI < OpE; ++OpI) {
>>> +          if (auto Def = dyn_cast<Instruction>(I->getOperand(OpI)))
>> 
>> I think you can use ranged for for these loops:
>> 
>>      for (Instruction &Inst : BB) {
>>        for (Use &U : BB.operands()) {
>>          if (auto Def = dyn_cast<Instruction>(U))
>>> Good point. Probably, even all_of/any_of would work here, I’ll fix this.
Fixed by factoring out this part to a separate function.

>> 
>>> +            if (LI->getLoopFor(Def->getParent()) == OuterL)
>>> +              NeedToFixLCSSA = true;
>> 
>> We should be able to break out early here, no?
> Definitely, thanks!
Ditto.

> 
>> 
>>> +        }
>>> +      }
>>> +    }
>>> +  }
>>> +
>>>   // If we have a pass and a DominatorTree we should re-simplify impacted loops
>>>   // to ensure subsequent analyses can rely on this form. We want to simplify
>>>   // at least one layer outside of the loop that was unrolled so that any
>>>   // changes to the parent loop exposed by the unrolling are considered.
>>>   if (DT) {
>>>     if (!OuterL && !CompletelyUnroll)
>>>       OuterL = L;
>>>     if (OuterL) {
>>> -      bool Simplified = simplifyLoop(OuterL, DT, LI, SE, AC, PreserveLCSSA);
>>> +      simplifyLoop(OuterL, DT, LI, SE, AC, PreserveLCSSA);
>> 
>> Can simplifyLoop return false when NeedToFixLCSSA = true? If so, do we
>> still need to fix LCSSA?
> Previously I, for some reason, thought that simplification might break LCSSA, so we have to rebuild it after it, but this time I haven’t found such a case. I’ll examine what we can do there and update the patch accordingly if needed.
I just noticed that there is a `PreserveLCSSA` argument, so we should be preserving LCSSA here (modulo bugs, which we'll fix if they are exposed by this change). I also added some code to honor `PreserveLCSSA` in my changes as well.

>> 
>>> 
>>>       // LCSSA must be performed on the outermost affected loop. The unrolled
>>>       // loop's last loop latch is guaranteed to be in the outermost loop after
>>> @@ -548,7 +583,7 @@
>>>         while (OuterL->getParentLoop() != LatchLoop)
>>>           OuterL = OuterL->getParentLoop();
>>> 
>>> -      if (CompletelyUnroll && (!AllExitsAreInsideParentLoop || Simplified))
>>> +      if (CompletelyUnroll && NeedToFixLCSSA)
>> 
>> CompletelyUnroll is kind of redundant here, since there's no path where
>> we set NeedToFixLCSSA = true where we didn't check CompletelyUnroll
>> already.
> True, will fix!
Fixed.

> 
>> 
>>>         formLCSSARecursively(*OuterL, *DT, LI, SE);
>>>       else
>>>         assert(OuterL->isLCSSAForm(*DT) &&
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160203/c392f486/attachment.html>


More information about the llvm-commits mailing list