[PATCH] D116123: [VPlan] Handle IV vector splat as VPInstruction.

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 27 08:01:01 PST 2021


fhahn added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2729
   // Create the vector values from the scalar IV, in the absence of creating a
   // vector IV.
   auto CreateSplatIV = [&](Value *ScalarIV, Value *Step) {
----------------
Ayal wrote:
> CreateSplatIV() will now be called when unrolling only, i.e., when VF==1, so it creates and stores **scalar** values w/o actually **Splat**ting into vectors. Update comment and name of lambda?
I inlined the lambda, as it is a single use now. I also inlined the scalar path from `getStepVector`, which seems like it should have never been there in the first place.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2774
   // create the phi node, we will splat the scalar induction variable in each
   // loop iteration.
   if (!shouldScalarizeInstruction(EntryVal)) {
----------------
Ayal wrote:
> 2nd "If we can't ..." sentence is obsolete?
removed


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2788
   // vectorised IV. Except when we tail-fold, then the splat IV feeds the
   // predicate used by the masked loads/stores.
   Value *ScalarIV = CreateScalarIV(Step);
----------------
Ayal wrote:
> 2nd "Except ..." sentence is obsolete?
removed


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2790
   Value *ScalarIV = CreateScalarIV(Step);
-  if (!Cost->isScalarEpilogueAllowed())
-    CreateSplatIV(ScalarIV, Step);
   buildScalarSteps(ScalarIV, Step, EntryVal, ID, Def, State);
 }
----------------
Ayal wrote:
> This last case which only emits a scalar IV can now be folded with the end of the previous case which emits both?
Done!


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8419
     VPValue *IV = nullptr;
-    if (Legal->getPrimaryInduction())
-      IV = Plan->getOrAddVPValue(Legal->getPrimaryInduction());
-    else {
+    if (!Legal->getPrimaryInduction()) {
       VPBasicBlock *HeaderVPBB = Plan->getEntry()->getEntryBasicBlock();
----------------
Ayal wrote:
> Could we simplify by always creating a VPWidenCanonicalIVRecipe, w/o trying to use Legal->getPrimaryInduction() even when it exists?
> 
> At first FoldTail relied solely on PrimaryInduction, until D77635 extended it to work w/o PrimaryInduction; but it should probably have removed the dependence on PrimaryInduction altogether. This may also simplify/revert D92017. A subsequent optimization could try to fold multiple IV recipes, in case both the PrimaryInduction was widened and WidenCanonicalIV was created (or leave it to LSR).
> Could we simplify by always creating a VPWidenCanonicalIVRecipe, w/o trying to use Legal->getPrimaryInduction() even when it exists?

We could do, but it would probably mean a bigger test diff and a bigger functional change. I'd prefer to do it separately, to avoid any unexpected knock-on effects.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8432
     VPValue *BTC = Plan->getOrCreateBackedgeTakenCount();
     bool TailFolded = !CM.isScalarEpilogueAllowed();
 
----------------
Ayal wrote:
> TailFolded should always be true? (While we're here, irrespective of this patch)
Yes, I noticed that as well. I'll push a commit, replacing it with an assert.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8442
+        IV = Plan->getOrAddVPValue(Legal->getPrimaryInduction());
+      }
       BlockMask = Builder.createNaryOp(VPInstruction::ActiveLaneMask, {IV});
----------------
Ayal wrote:
> No need for {}
> 
> So a StepVector is needed only to feed ICmpULE, not ActiveLaneMask which depends on first lane only? Seems like an optimization best applied separately?
> 
> BTC is no longer used by ActiveLaneMask, so setting it should sink below to where ICmpULE needs it (While we're here, irrespective of this patch).
> No need for {}

I think if the body spans multiple lines due to a comment, braces should not be omitted
https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements

> So a StepVector is needed only to feed ICmpULE, not ActiveLaneMask which depends on first lane only? Seems like an optimization best applied separately?

Yes, let me try to split that off. `ActiveLaneMask` only needs the first scalar step of each `Part`.

> BTC is no longer used by ActiveLaneMask, so setting it should sink below to where ICmpULE needs it (While we're here, irrespective of this patch).

done separately in 2e630eabd329


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8454
+          auto *R =
+              new VPInstruction(VPInstruction::StepVector, {CanIV, OneVPV});
+          Builder.getInsertBlock()->insert(R, NewInsertionPoint);
----------------
Ayal wrote:
> StepVector needs to hold One as its second operand, mainly to convey the desired type?
At the moment only a step of one is used, but it will be used with different steps in the future (build scalar steps for inductions with arbitrary steps.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8849
+        return VF.isVector() && ShouldScalarizeInstruction(Instr, VF) &&
+               NeedsScalarInduction(VF);
+      },
----------------
Ayal wrote:
> NeedsScalarInduction(VF) will always early-return true given that ShouldScalarizeInstruction(Instr, VF) is true?
I think it is required for the case where `Instr` is a truncate.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9175
+  // Replace StepVector recipes of the canonical induction by the widened vector
+  // induction, if it exists.
+  if (auto *CanonicalStepVector =
----------------
Ayal wrote:
> As a redundancy elimination optimization?
> This buildVPlanWithVPRecipes() method deserves outlining anything from it.
Sounds good, moved to ` VPlanTransforms::removeRedundantStepVector`


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:741
+    Value *Step = State.get(getOperand(1), VPIteration(0, 0));
+    Value *Val = State.get(getOperand(0), VPIteration(0, 0));
+    Value *Broadcasted =
----------------
Ayal wrote:
> get operands in order?
> Store singleton scalar values per part 0, instead of per lane 0?
Changed to get `VPIteration(Part, 0)`. I kept accessing `Val`  outside the loop, to keep a single broadcast. But there's now an assert.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:746
+    for (unsigned Part = 0; Part < State.UF; ++Part) {
+      assert(!State.VF.isScalable() && "scalable vectors not yet supported.");
+      Value *StartIdx;
----------------
Ayal wrote:
> assert at the outset? (While we're here, irrespective of this patch)
Move out of the loop and will push a fix separately.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:856
+  }
+
   unsigned getOpcode() const { return Opcode; }
----------------
Ayal wrote:
> Where are these new classof's needed?
Unfortunately this is needed to `dyn_cast` directly from VPUser -> VPInstruction


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116123/new/

https://reviews.llvm.org/D116123



More information about the llvm-commits mailing list