[PATCH] D22630: Loop rotation

Wed Dec 14 15:18:47 PST 2016

mzolotukhin requested changes to this revision.
mzolotukhin added a comment.
This revision now requires changes to proceed.

Hi,

Sorry for the delay, please find some comments inline.

Many of them are about the tests - I still feel uneasy about bugpoint generated tests. It's hard to tell what they're checking and it would be almost impossible to say if they are relevant or not after a couple of changes in the code. Can you correlate the place in the code with every test (e.g. if we comment out check A, then test A starts to fail)? If so, I'm pretty sure that you can reduce corresponding tests even more (in fact, it's often better to actually write them from scratch and use the bugpoint-generated programs just as a hint of what properties the test should have). If you can not, then I think there is no value in such tests. I also think that some of the tests are actually identical. I'm might sound picky here, but if we don't write good tests now, we'll end up with not-so-good tests in future.

Thanks,
Michael

================
Comment at: llvm/include/llvm/Analysis/LoopInfo.h:444
   /// 0 is returned.
-  MDNode *getLoopID() const;
+  std::pair<MDNode *, Instruction *> getLoopIDWithInstr() const;
+  MDNode *getLoopID() const { return getLoopIDWithInstr().first; }
----------------
mzolotukhin wrote:
> This needs to be documented.
This still needs to be documented. It might be unclear what instruction we return here.

================
Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:149
+  const BasicBlock *OrigHeader = L->getHeader();
   if (!L->isLoopExiting(OrigHeader))
     return false;
----------------
hiraditya wrote:
> eli.friedman wrote:
> > Did you mean to change this check? Given the way you've implemented cloning, you should be able to do something more aggressive.
> The new implementation can rotate the loop as far as user wants. This check is here because currently we copy until we find a basic block which is not exiting. I guess if there are use cases where a very general copying is helpful, we can remove this check.
I think this check is not needed now. What we want to look at is the loop edge (and we do it just before this check) - loop rotation should make it exiting. But even if you want to keep this check, the comment needs to be updated as it was written for the limitations of the old implementation.

================
Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:194
+
+    // TODO: Once this is crossed we can copy until the prev. BB.
+    if (LoopSize + Metrics.NumInsts >= RotationMaxSize)
----------------
mzolotukhin wrote:
> Why is this 'TODO'? Also, I don't see a need in shortening the word 'previous' here.
Why is this called 'TODO'?

================
Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:403-404
+
+  BasicBlock *OrigH = L->getHeader();
+  BasicBlock *LoopLatch = L->getLoopLatch();
+
----------------
hiraditya wrote:
> mzolotukhin wrote:
> > We definitely want to have some thresholds, like we do now - maybe just rename the function somehow? `ProfitableToRotate` sounds to me like there would be a benefit right from loop-rotate, while in reality we just hope that a later pass would take advantage of it. As I see it, we want to rotate as many loops as possible, being limited only by thresholds, and not considering profitability of the rotation itself. BTW, are we in agreement on this?
> The function 'isProfitableToRotate' only checks if the code-size increase would be under a budget. So I can rename the function maybe?
> 
Yes, please rename it.

================
Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:12
+// 1. Canonicalize loop latch to have only one successor.
+// 2. Clone all the BBs which are exiting the loop.
+// 3. Adjust phi of cloned BBs
----------------
Don't we clone other blocks too? Exiting blocks might be a boundary for the region we clone, but this region can contain non-exiting blocks too. Is it correct?

Also, I think this comment does not need to cover implementation details at all. If we want to make it more verbose, then we should rather describe what loop rotation means, what loops are affected, and what properties we guarantee after the transformation for rotated loops.

================
Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:489
+bool LoopRotate::preserveLoopSimplifyForm(const SmallPtrSetBB &Exits) const {
+    bool HasIndirectBr = false;
+    // Split the edge from exiting BB to exit BB if not in loop-simplify form.
----------------
Wouldn't it be better to check if the loop has indirect branches and bail out before we begin doing anything?

================
Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:515
+/// Blocks[0] is the entry basic block of the SEME.
+/// TODO: arrange newly inserted bbs.
+void LoopRotate::rotateLoop(BasicBlock *NewH, const SmallVecBB &Blocks,
----------------
We're doing something to arrange new blocks below. Please either say what specifically is missing or remove this TODO in case it's actually done.

================
Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:519
+  bool LoopSimplifyForm = false;
+  DEBUG(LoopSimplifyForm = L->isLoopSimplifyForm());
+  BasicBlock *OrigH = L->getHeader();
----------------
Why is this under `DEBUG`?

================
Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:614
+  DEBUG(LI->verify(*DT));
+  DEBUG(DT->verifyDomTree());

----------------
Nitpick: please verify `DT` first (since `LI` uses it).

================
Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:663
+    // If the NewH is the loop-latch => Exits does not include the exiting block
+    // from loop-latch see (collectSEMEBlocks).
+    if (NewH == LoopLatch)
----------------
s/see (collectSEMEBlocks)/(see collectSEMEBlocks)/

================
Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:265
+        PHINode *PN = dyn_cast<PHINode>(&I);
+        if (!PN || PN->getNumOperands() != 2)
+          break;
----------------
if `getNumOperands` != 2, then we'll break from this loop and then create a phi-node with 2 predecessors, which will be invalid.

Can we assert that it equals 2?

================
Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:273
+          continue;
+        if (PN->getIncomingValue(NLVal) == Inst && PN->getIncomingValue(NPHVal) == VMap[Inst])
+          return PN;
----------------
This line seems too long.

================
Comment at: llvm/lib/Transforms/Scalar/LoopRotation.cpp:293
+    for (Instruction &Inst : *BB) {
+      // Skip Ins with no use e.g., branches.
+      if (Inst.use_begin() == Inst.use_end())
----------------
s/Ins/Inst/

================
Comment at: llvm/test/Analysis/ScalarEvolution/2012-03-26-LoadConstant.ll:1
-; RUN: opt < %s -basicaa -globalopt -instcombine -loop-rotate -licm -instcombine -indvars -loop-deletion -constmerge -S | FileCheck %s
+; RUN: opt < %s -basicaa -globalopt -instcombine -loop-rotate -licm -simplifycfg -S | FileCheck %s
+
----------------
hiraditya wrote:
> mzolotukhin wrote:
> > Why is this change needed?
> Now that the loop rotation does not simplify any instructions, we need simplifycfg post-licm to clean up the redundancies exposed by loop rotate.
This test is no longer relevant, it doesn't fail if the change it went in with is reverted.

Could you please remove it (in a separate commit) altogether?

================
Comment at: llvm/test/Transforms/LoopRotate/dbgvalue.ll:8-10
+; CHECK: tailrecurse.lr
+; CHECK: call void @llvm.dbg.value(metadata i32 %x
+; CHECK: tail call void @llvm.dbg.value(metadata i32 %y
----------------
mzolotukhin wrote:
> Why was this test changed?
The question is still relevant.

================
Comment at: llvm/test/Transforms/LoopRotate/loop-rotate-before-latch.ll:1
+; RUN: opt -S < %s  -loop-rotate -verify-dom-info -verify-loop-info -loop-rotate -stats | FileCheck %s
+
----------------
Why do we need `-stats` here? We never check them.
Why do we run `-loop-rotate` twice?

================
Comment at: llvm/test/Transforms/LoopRotate/multiple-exits.ll:6-12
+; PR7447: there should be two loops rotated.
+; CHECK-LABEL: @test1
+
+; Check that the outer loop is rotated.
+; CHECK: for.cond.lr:
+; CHECK: for.cond1.preheader.lr.ph:
+
----------------
Why do we need to move it?
The second function has CHECK-lines after its body, so we are inconsistent now.

Also, we have `pr7447.ll` test. Can we merge these files maybe?

================
Comment at: llvm/test/Transforms/LoopRotate/nosimplifylatch.ll:6-7
 ;CHECK: for.inc:
-;CHECK-NEXT: %incdec.ptr.i = getelementptr 
+;CHECK: %incdec.ptr.i = getelementptr
+;CHECK: br

----------------
mzolotukhin wrote:
> Why do we need this change?
The question is still relevant.

================
Comment at: llvm/test/Transforms/LoopRotate/phi-duplicate.ll:36
 ; CHECK:      for.body:
-; CHECK-NEXT:   %j.01 = phi i64
-; CHECK-NOT:  br
-; CHECK:   br i1 %cmp, label %for.body, label %for.end
-; CHECK:      for.end:
-; CHECK-NEXT:        ret void
+; CHECK:  %phi.nh = phi i64
+; CHECK:  %inc = add nsw i64 %phi.nh, 1
----------------
Can we add `CHECK-NOT` to make sure that no other phis are present?

================
Comment at: llvm/test/Transforms/LoopRotate/preserve-loop-simplify.ll:19
 ; CHECK: [[INNER_PREROTATE_PREHEADER]]:
-; CHECK-NEXT: br i1 {{[^,]*}}, label %[[INNER_PREROTATE_PREHEADER_SPLIT_RETURN:[^,]*]], label %[[INNER_ROTATED_PREHEADER:[^,]*]]
+; CHECK: br label %inner.header.lr
+
----------------
hiraditya wrote:
> mzolotukhin wrote:
> > Why did this test change? Shouldn't we be producing the same results on such cases?
> Thanks for pointing this out. I have added code to restore the loop-simplify form.
Can't we cancel the changes to the test then?

================
Comment at: llvm/test/Transforms/LoopRotate/simplifylatch.ll:48-60
+;CHECK: %arrayidx.lr = getelementptr inbounds i8, i8* %CurPtr, i64 %idxprom.lr
+;CHECK: %0 = load i8, i8* %arrayidx.lr, align 1
+;CHECK: %conv.lr = sext i8 %0 to i32
+;CHECK: %arrayidx1.lr = getelementptr inbounds i8, i8* %CurPtr, i64 0
+;CHECK: %1 = load i8, i8* %arrayidx1.lr, align 1
+;CHECK: %conv2.lr = sext i8 %1 to i32
+;CHECK: for.body:
----------------
If we really want to check the entire contents of the loop, then we should use `CHECK-NEXT`. However, I'm not a fan of it - we should only check what matters.

================
Comment at: llvm/test/Transforms/LoopVectorize/vect.omp.persistence.ll:7-11
 ; CHECK-NOT: LV: Loop hints: force=
-; In total only 1 loop should be rotated.
-; CHECK: 1 loop-rotate
+; checking the stats for the function @nonrotated
+; CHECK: 2 gvn{{.*}}Number of equalities propagated
+; With the new loop rotation both the loops should be rotated.
+; CHECK: 2 loop-rotate
----------------
mzolotukhin wrote:
> This test actually should probably be rewritten.
> 
> It tries to check that 
> 1) loop-rotate doesn't drop metadata,
> 2) no pass in -O2 pipeline before the vectorizer drops metadata.
> 
> The first check should be a separate test for loop-rotate, so I suggest adding it explicitly now (or in a separate commit). Then, one function from this test can be removed, and comments updates.
The test in the current form is hard to maintain, and with every change we risk that it becomes irrelevant. Could you please address it?

https://reviews.llvm.org/D22630