[PATCH] D116123: [VPlan] Handle IV vector splat as VPInstruction.

Sun Dec 26 09:51:39 PST 2021

Ayal added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2729
   // Create the vector values from the scalar IV, in the absence of creating a
   // vector IV.
   auto CreateSplatIV = [&](Value *ScalarIV, Value *Step) {
----------------
CreateSplatIV() will now be called when unrolling only, i.e., when VF==1, so it creates and stores **scalar** values w/o actually **Splat**ting into vectors. Update comment and name of lambda?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2774
   // create the phi node, we will splat the scalar induction variable in each
   // loop iteration.
   if (!shouldScalarizeInstruction(EntryVal)) {
----------------
2nd "If we can't ..." sentence is obsolete?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2788
   // vectorised IV. Except when we tail-fold, then the splat IV feeds the
   // predicate used by the masked loads/stores.
   Value *ScalarIV = CreateScalarIV(Step);
----------------
2nd "Except ..." sentence is obsolete?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2790
   Value *ScalarIV = CreateScalarIV(Step);
-  if (!Cost->isScalarEpilogueAllowed())
-    CreateSplatIV(ScalarIV, Step);
   buildScalarSteps(ScalarIV, Step, EntryVal, ID, Def, State);
 }
----------------
This last case which only emits a scalar IV can now be folded with the end of the previous case which emits both?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8419
     VPValue *IV = nullptr;
-    if (Legal->getPrimaryInduction())
-      IV = Plan->getOrAddVPValue(Legal->getPrimaryInduction());
-    else {
+    if (!Legal->getPrimaryInduction()) {
       VPBasicBlock *HeaderVPBB = Plan->getEntry()->getEntryBasicBlock();
----------------
Could we simplify by always creating a VPWidenCanonicalIVRecipe, w/o trying to use Legal->getPrimaryInduction() even when it exists?

At first FoldTail relied solely on PrimaryInduction, until D77635 extended it to work w/o PrimaryInduction; but it should probably have removed the dependence on PrimaryInduction altogether. This may also simplify/revert D92017. A subsequent optimization could try to fold multiple IV recipes, in case both the PrimaryInduction was widened and WidenCanonicalIV was created (or leave it to LSR).

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8432
     VPValue *BTC = Plan->getOrCreateBackedgeTakenCount();
     bool TailFolded = !CM.isScalarEpilogueAllowed();

----------------
TailFolded should always be true? (While we're here, irrespective of this patch)

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8442
+        IV = Plan->getOrAddVPValue(Legal->getPrimaryInduction());
+      }
       BlockMask = Builder.createNaryOp(VPInstruction::ActiveLaneMask, {IV});
----------------
No need for {}

So a StepVector is needed only to feed ICmpULE, not ActiveLaneMask which depends on first lane only? Seems like an optimization best applied separately?

BTC is no longer used by ActiveLaneMask, so setting it should sink below to where ICmpULE needs it (While we're here, irrespective of this patch).

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8454
+          auto *R =
+              new VPInstruction(VPInstruction::StepVector, {CanIV, OneVPV});
+          Builder.getInsertBlock()->insert(R, NewInsertionPoint);
----------------
StepVector needs to hold One as its second operand, mainly to convey the desired type?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8849
+        return VF.isVector() && ShouldScalarizeInstruction(Instr, VF) &&
+               NeedsScalarInduction(VF);
+      },
----------------
NeedsScalarInduction(VF) will always early-return true given that ShouldScalarizeInstruction(Instr, VF) is true?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9175
+  // Replace StepVector recipes of the canonical induction by the widened vector
+  // induction, if it exists.
+  if (auto *CanonicalStepVector =
----------------
As a redundancy elimination optimization?
This buildVPlanWithVPRecipes() method deserves outlining anything from it.

================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:741
+    Value *Step = State.get(getOperand(1), VPIteration(0, 0));
+    Value *Val = State.get(getOperand(0), VPIteration(0, 0));
+    Value *Broadcasted =
----------------
get operands in order?
Store singleton scalar values per part 0, instead of per lane 0?

================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:746
+    for (unsigned Part = 0; Part < State.UF; ++Part) {
+      assert(!State.VF.isScalable() && "scalable vectors not yet supported.");
+      Value *StartIdx;
----------------
assert at the outset? (While we're here, irrespective of this patch)

================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:753
+        StartIdx =
+            getRuntimeVF(State.Builder, Step->getType(), State.VF * Part);
+
----------------
Wrap this inside a common interface that checks the type of step and returns the desired Value? (ditto)

================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:856
+  }
+
   unsigned getOpcode() const { return Opcode; }
----------------
Where are these new classof's needed?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116123/new/

https://reviews.llvm.org/D116123