[PATCH] D114917: [LoopInterchange] Enable loop interchange with multiple inner loop indvars

Tue Dec 7 06:49:57 PST 2021

congzhe added inline comments.

================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:890
   }

   return false;
----------------
The bulky removal starting from line 891 is from patch https://reviews.llvm.org/D115238.

It would be easier to just incorporate D115238 in this patch, meaning this patch will be rebased on top of D115238.

================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:647
 bool LoopInterchangeLegality::isLoopStructureUnderstood(
-    PHINode *InnerInduction) {
-  unsigned Num = InnerInduction->getNumOperands();
+    const SmallVectorImpl<PHINode *> &Inductions) {
   BasicBlock *InnerLoopPreheader = InnerLoop->getLoopPreheader();
----------------
bmahjour wrote:
> suggestion: Inductions ->InnerInductions
Thank you for the comment, the code is now refactored and this argument is not needed.
We'll use the member variable `LoopInterchangeLegality::InnerLoopInductions` directly in `LoopInterchangeLegality::isLoopStructureUnderstood()`.

================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:696
     std::function<bool(Value *)> IsPathToIndVar;
-    IsPathToIndVar = [&InnerInduction, &IsPathToIndVar](Value *V) -> bool {
-      if (V == InnerInduction)
+    IsPathToIndVar = [&Inductions, &IsPathToIndVar](Value *V) -> bool {
+      if (std::find(Inductions.begin(), Inductions.end(), V) !=
----------------
bmahjour wrote:
> suggestion: `Value *V` -> `const Value *V`
Thanks, updated accordingly.

================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:697-698
+    IsPathToIndVar = [&Inductions, &IsPathToIndVar](Value *V) -> bool {
+      if (std::find(Inductions.begin(), Inductions.end(), V) !=
+          Inductions.end())
         return true;
----------------
bmahjour wrote:
> suggestion: use `llvm::is_contained()`
Thanks, updated accordingly.

================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:716
+    if (IsPathToIndVar(Op0) && IsPathToIndVar(Op1))
+      return true;
+
----------------
bmahjour wrote:
> What if we have a loop condition like this:
> ```
> for (...)
>   for (; i < j; i++, j--)
> ```
> 
> I'm not sure if this would be legal to interchange (need to think about it more). I would have expected we return false if both left and right are functions of inner-ivs, to be conservatively correct.
> 
> Should have a test for this case also.
Thank you, after consideration I would propose this loop can be interchanged. I added a corresponding test case (test3 in `/Transforms/LoopInterchange/interchangeable-innerloop-multiple-indvars.ll`) that shows what the loop looks like after transformation. 

Following the transformation steps in the pass, I would not think this loop is much of a difference -- the phi nodes of `i` and `j` will be split out from the inner header (and formed into the new outer header) and the indvar increments will be split out from the inner latch (and formed into the new outer latch).

================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:892

-  // TODO: Current limitation: Since we split the inner loop latch at the point
-  // were induction variable is incremented (induction.next); We cannot have
-  // more than 1 user of induction.next since it would result in broken code
-  // after split.
-  // e.g.
-  // for(i=0;i<N;i++) {
-  //    for(j = 0;j<M;j++) {
-  //      A[j+1][i+2] = A[j][i]+k;
-  //  }
-  // }
-  Instruction *InnerIndexVarInc = nullptr;
-  if (InnerInductionVar->getIncomingBlock(0) == InnerLoopPreHeader)
-    InnerIndexVarInc =
-        dyn_cast<Instruction>(InnerInductionVar->getIncomingValue(1));
-  else
-    InnerIndexVarInc =
-        dyn_cast<Instruction>(InnerInductionVar->getIncomingValue(0));
-
-  if (!InnerIndexVarInc) {
-    LLVM_DEBUG(
-        dbgs() << "Did not find an instruction to increment the induction "
-               << "variable.\n");
-    ORE->emit([&]() {
-      return OptimizationRemarkMissed(DEBUG_TYPE, "NoIncrementInInner",
-                                      InnerLoop->getStartLoc(),
-                                      InnerLoop->getHeader())
-             << "The inner loop does not increment the induction variable.";
-    });
-    return true;
+  InnerLoopInductions = Inductions;
+  SmallPtrSet<Instruction *, 8> InnerIndexVarIncs;
----------------
bmahjour wrote:
> why do we need this copy?
Thanks, `InnerLoopInductions` will be later used in the transformation phase, thus we keep a copy in class `LoopInterchangeLegality` and will use it later.

================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:1401
+              UserI->getParent() == NewLatch ||
+              std::find(InductionPHIs.begin(), InductionPHIs.end(), UserI) !=
+                   InductionPHIs.end())
----------------
bmahjour wrote:
> use `llvm::is_contained()`
Thanks, udpate accordingly.

================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:1411
               this->LI->getLoopFor(OpI->getParent()) != this->InnerLoop ||
-              OpI == InductionPHI)
+              std::find(InductionPHIs.begin(), InductionPHIs.end(), OpI) !=
+                  InductionPHIs.end())
----------------
bmahjour wrote:
> use `llvm::is_contained()`
Thanks, udpate accordingly.

================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:1729
+      continue;
+    InnerLoopPHIs.push_back(cast<PHINode>(&PHI));
+  }
----------------
bmahjour wrote:
> Why did we need to check `OuterInnerReductions` before and not anymore? 
Thanks, this is due to difference in downstream and upstream repos. I've updated as per upstream.

================
Comment at: llvm/test/Transforms/LoopInterchange/interchangeable-innerloop-multiple-indvars.ll:16
+;;     e = 5;
+;;     for (; d, e; d--, e--)
+;;       a |= b[d][c + 9];
----------------
bmahjour wrote:
> The loop latch in the IR only uses one IV anyway, so can we simplify this to only use one IV for the condition? 
> Please also add a test where more than one IV is used in the latch.
Thanks, added what you suggested as test2() where all inner loop indvars are used in the inner latch.

I kept test1() as is, although we could simplify it to use one IV it would be serve as an example of multiple IVs anymore if I simplified it.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114917/new/

https://reviews.llvm.org/D114917