[PATCH] D28073: Preserve domtree and loop-simplify for runtime unrolling.

Fri Jan 6 14:58:18 PST 2017

mzolotukhin added inline comments.

================
Comment at: lib/Transforms/Utils/LoopUnroll.cpp:55
+static cl::opt<bool>
+UnrollVerifyDomtree("unroll-verify-domtree", cl::init(false), cl::Hidden,
+                    cl::desc("Verify domtree after unrolling"));
----------------
efriedma wrote:
> mzolotukhin wrote:
> > mkuper wrote:
> > > efriedma wrote:
> > > > mkuper wrote:
> > > > > We want to enforce this being on in new tests too, I assume?
> > > > We probably want this on in new tests, yes, but I have no idea how we would enforce it.  
> > > In code review, I meant. :-)
> > Can we make the default value `true` for debug builds (it will match the existing behavior)?
> Okay.  (I initially didn't do this because it's O(N^2) in theory, but it turns out it doesn't really matter that much.)
Thanks! Also, if you change it to `true`, then changes in the tests are no longer needed.

================
Comment at: lib/Transforms/Utils/LoopUnroll.cpp:619
+      if (BB == LatchBlock) {
+        // The latch is special because we emit unconditional branches in
+        // some cases.  Since the latch is always at the bottom of the loop,
----------------
efriedma wrote:
> mzolotukhin wrote:
> > mkuper wrote:
> > > efriedma wrote:
> > > > mkuper wrote:
> > > > > Isn't it the case that when the latch ends with an unconditional branch, that branch is towards the header block?
> > > > > If so, we should not have any children outside the loop.
> > > > Maybe this is more clear?  "The latch is special because we can emit an unconditional branch in the unrolled loop even if the original latch block ends in a conditional branch."
> > > Ah, ok, this makes sense.
> > > This still looks a bit weird to me, though. Any chance you can give an example of when this ends up different than the nearest common dominator with Latches[0]?
> > > 
> > > (Feel free to ignore me - I'm not sure I understand this enough to LGTM it anyway. :-) )
> > > 
> > I'm also curious about an example.
> > 
> > Also, this part is not clear to me: "Since the latch is always at the bottom of the loop, new dominator must also be a latch." Why can't a mid-block from the body be a dominator of a cloned latch?
> > 
> This shows up for example, if you completely unroll a single-block loop with a constant trip count; the immediate dominator of the exit is the last iteration of the loop, not the first iteration.  (foldBlockIntoPredecessor hides this problem because the block in question gets folded away when a loop is completely unrolled.)
> 
> > Why can't a mid-block from the body be a dominator of a cloned latch?
> 
> I'm not sure what you're asking here.  ChildrenToUpdate is the set of loop exit blocks which are dominated by BB, so this code only touches the idom for blocks outside the loop.  The latch's idom isn't relevant.
I've just understood what the problem is, thanks! Yes, it all makes sense now.

================
Comment at: lib/Transforms/Utils/LoopUnroll.cpp:625-631
+        for (BasicBlock *IterLatch : Latches) {
+          TerminatorInst *Term = IterLatch->getTerminator();
+          if (isa<BranchInst>(Term) && cast<BranchInst>(Term)->isConditional()) {
+            NewIDom = IterLatch;
+            break;
+          }
+        }
----------------
I think we should be able to tell which latch ends with a conditional. It should be either the first one (if all latches end with conditional branches), or the last one. Does it make sense? If so, can we remove this loop?

================
Comment at: lib/Transforms/Utils/LoopUnroll.cpp:700-701
   if (DT) {
-    if (!OuterL && !CompletelyUnroll)
-      OuterL = L;
     if (OuterL) {
----------------
efriedma wrote:
> mzolotukhin wrote:
> > Why can we remove this?
> There are three possibilities for unrolling: we could be completely unrolling a loop, we could be runtime unrolling a loop, or we could be peeling a loop.  If we're completely unrolling, these lines are a no-op.  If we're runtime unrolling, we now correctly preserve LoopSimplify for L (and we don't break it for any loop outside of L).  If we're peeling, we explicitly recreate LoopSimplify earlier.  
But isn't it also used for LCSSA? Maybe I'm missing something, but it's still not obvious to me that we can remove it.

Repository:
  rL LLVM

https://reviews.llvm.org/D28073