[PATCH] D34200: [PM/unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching, making it no longer even remotely simple.

Wed Jul 12 22:16:46 PDT 2017

sanjoy requested changes to this revision.
sanjoy added a comment.
This revision now requires changes to proceed.

Second and last part of the review.

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1907
+
+  if (UnswitchCost < UnswitchThreshold) {
+    DEBUG(dbgs() << "  Trying to unswitch non-trivial (cost = " << UnswitchCost
----------------
chandlerc wrote:
> sanjoy wrote:
> > How about putting all of the logic up to this point into a `TerminatorInst *findNonTrivialUnswitchCandidate` function?
> I like factoring something like this... Question though, is it useful to keep the call to actually do the unswitch separate? Happy to look at the code and try some splits / arrangements to make it a bit more manageable, I'm definitely not (yet) happy w/ the factoring.
> 
> Question though, is it useful to keep the call to actually do the unswitch separate?

No strong opinion on that.

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:95
+/// FIXME: Longer term, many uses of this can be replaced by an incremental
+/// domtree update strategy that starts from a known dominating block and
+/// rebuilds that subtree.
----------------
This bit looks separable, all of the call sites to this function is in pre-existing code.

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:186
 /// dominated by some node between the unswitched one and the old preheader.
 /// All of these also need to be hoisted in the dominator tree. We also want to
 /// minimize queries to the dominator tree because each step of this
----------------
This bit looks separable to me -- no new call sites were introduced in this patch.

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1103
+      return false;
+
+    delete ChildL;
----------------
I'd add an assert here that if `ChildL` 's header is dead, so are all other blocks in `ChildL`.

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1133
+/// of the newly sibling loops following it.
+static bool rebuildLoopAfterUnswitch(Loop &L, ArrayRef<BasicBlock *> ExitBlocks,
+                                     LoopInfo &LI,
----------------
Document the return value.

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1263
+          : std::stable_partition(
+                Blocks.begin(), Blocks.end(),
+                [&](BasicBlock *BB) { return LoopBlockSet.count(BB); });
----------------
Why do you care about a stable partition here?

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1267
+  // Before we erase the list of unlooped blocks, build a set of them.
+  SmallPtrSet<BasicBlock *, 16> UnloopedSet(BlocksSplitI, Blocks.end());
+  if (LoopBlockSet.empty())
----------------
I'd s/UnloopedSet/UnloopedBlocks/ (I think the `Blocks` suffix adds more information than the `Set` suffix).

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1280
+                   [&](BasicBlock *LHS, BasicBlock *RHS) {
+                     return LI.getLoopDepth(LHS) < LI.getLoopDepth(RHS);
+                   });
----------------
Why do we care about the sort being stable?

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1284
+  // We'll build up a set for each exit loop.
+  SmallPtrSet<BasicBlock *, 16> ExitLoopSet;
+  Loop *PrevExitL = L.getParentLoop(); // The first possible exit loop.
----------------
s/ExitLoopSet/NewExitLoopBlocks/

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1285
+  SmallPtrSet<BasicBlock *, 16> ExitLoopSet;
+  Loop *PrevExitL = L.getParentLoop(); // The first possible exit loop.
+
----------------
I'd s/first/deepest/ in the comment.

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1287
+
+  auto RemoveUnloopedBlocksFromLoop = [&](Loop &L) {
+    for (auto *BB : UnloopedSet)
----------------
There already is a variable named `L` in scope, so please rename the argument to something else.

Perhaps clang should warn about this? :)

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1298
+
+    BasicBlock *ExitBB = ExitsInLoops.pop_back_val();
+    Loop &ExitL = *LI.getLoopFor(ExitBB);
----------------
Can you add a comment stating something like "get the next exit block, in decreasing loop depth order"?  You already have mentioned this in the `std::sort` call, but there are enough moving parts here that some redundancy can help.

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1303
+    // Erase all of the unlooped blocks from the loops between the previous
+    // exit loop and this exit loop.
+    for (; PrevExitL != &ExitL; PrevExitL = PrevExitL->getParentLoop())
----------------
Add a comment here that this only works because we sorted `ExitsInLoops` to be in increasing order of loop depth.

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1314
+      if (BB == PH)
+        continue;
+
----------------
When would we ever get to the preheader?

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1343
+      if (Loop *BBL = LI.getLoopFor(BB))
+        if (BBL == &L || !L.contains(BBL))
+          LI.changeLoopFor(BB, &ExitL);
----------------
I don't see when the `!L.contains(BBL)` bit will kick in -- won't all of the blocks in `ExitLoopSet` either be exit blocks of the original L, or previously interior blocks that are no longer interior?

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1357
+    if (Loop *BBL = LI.getLoopFor(BB))
+      if (BBL == &L || !L.contains(BBL))
+        LI.changeLoopFor(BB, nullptr);
----------------
Again, I can't see when `!L.contains(BBL)` will kick in.

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1589
+    // accurate, we want sub-trees within the original loop to be
+    // conservatively correct and that requires moving the merge block out of
+    // that subtree.
----------------
Define the direction of "conservative" here -- does it mean "A conservative-dominates B implies A actually-dominates B" or the converse?

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1640
+          cast_or_null<BasicBlock>(VMap.lookup(ContinueSuccBB)))
+    ClonedContinueSuccBB->removePredecessor(ClonedParentBB,
+                                            /*DontDeleteUselessPHIs*/ true);
----------------
Why is `DontDeleteUselessPHIs` false?

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1696
+
+  // We can change which blocks are exit blocks of all the cloned sibling
+  // loops, the current loop, and any parent loops which shared exit blocks
----------------
When you say "cloned sibling loop", do you include all of the loops that used to be subloops of L and have now been reparented?  I can't think of a case in which the exit block set for those will change.

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1719
+  // and we can do it in any order as they don't nest relative to each other.
+  for (Loop *UpdatedL : llvm::concat<Loop *>(NonChildClonedLoops, HoistedLoops))
+    UpdateLCSSA(*UpdatedL);
----------------
As I said above, I'm not convinced that we need this bit.

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1726
+    SmallVector<Loop *, 4> OuterLoops;
+    // The top of our stack will be the cloned loop and the current loop if
+    // they are loops. ALso, if either the cloned loop or the current loop have
----------------
Instead of commenting on what's at the top of the stack, how about phrasing the comment as on the order in which we will process the loops?

s/ALso/Also/

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1740
+    // Grab all of the enclusing loops now.
+    for (Loop *OuterL = ParentL; OuterL != OuterExitL;
+         OuterL = OuterL->getParentLoop())
----------------
s/enclusing/enclosing/

================
Comment at: lib/Transforms/Scalar/SimpleLoopUnswitch.cpp:1761
+#ifndef NDEBUG
+  // Verify the entiry loop structure to catch any incorrect updates before we
+  // progress in the pass pipeline.
----------------
s/entiry/entire/

https://reviews.llvm.org/D34200