[PATCH] D116123: [VPlan] Handle IV vector splat as VPInstruction.

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 25 11:21:07 PST 2022


fhahn marked 4 inline comments as done.
fhahn added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:4794
     // If tail-folding is applied, the primary induction variable will be used
     // to feed a vector compare.
     if (Ind == Legal->getPrimaryInduction() && foldTailByMasking())
----------------
Ayal wrote:
> Wonder if this is still the case now that NewIV is always created to feed the vector compare, until replaced?
I think for compatibility with existing behavior we still need to retain it, as it impacts the decision whether the scalarize the induction or not. Could be investigated in a follow-up change, once all related patches are through.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:769
+      assert(Step->getType()->isIntegerTy() &&
+             "stepvector only supports integer steps for now");
+      Value *StartIdx =
----------------
Ayal wrote:
> Assert is redundant given than Step was created above as a ConstantInt - assert instead that Val is integer, before the loop?
This code is now gone in the latest version.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:1383
+    return all_of(users(), [this](VPUser *U) {
+      return cast<VPRecipeBase>(U)->onlyFirstLaneUsed(this);
+    });
----------------
Ayal wrote:
> Could a user be live-out / not a recipe?
at the moment there's not possible, all users must be a recipe (also the code is gone from this patch)


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:926
+           "Op must be an operand of the recipe");
+    return getOperand(0) == Op && getOpcode() == VPInstruction::ActiveLaneMask;
+  }
----------------
Ayal wrote:
> fhahn wrote:
> > Ayal wrote:
> > > The recipes along the canonical IV chain also use first lane only, namely: CanonicalIVIncrement, CanonicalIVIncrementNUW, BranchOnCount and VPCanonicalIVPHIRecipe? They are also "onlyFirstPartUsed()", but are unique in that.
> > Updated to include CanonicalIVIncrement, CanonicalIVIncrementNUW, BranchOnCount  opcodes.
> Also have VPCanonicalIVPHIRecipe indicate it uses only first lane?
Will update, but the code has moved to a separate patch again.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:354
     // directly.
-    if (WidenOriginalIV && WidenOriginalIV->isCanonical() &&
-        WidenOriginalIV->getScalarType() == WidenNewIV->getScalarType()) {
-      WidenNewIV->replaceAllUsesWith(WidenOriginalIV);
-      WidenNewIV->eraseFromParent();
-      return;
+    // If scalar values are demanded from WidenNewIV, create a stepvector.
+    // TODO: Remove checks once a vector phi will be generated for all
----------------
Ayal wrote:
> fhahn wrote:
> > Ayal wrote:
> > > The VPWidenCanonicalIVRecipe WidenNewIV is created only to feed the ICmpULE or ActiveLane, i.e., when vectorizing with fold tail, so can assert(!Range.Start.isScalar && FoldTail) if needed below?
> > > 
> > > Scalar values are demanded from WidenNewIV only if it feeds ActiveLane, which demands only the first lane, so broadcasting and adding building scalar steps for each lane seems redundant here? Suffice to feed ActiveLane with UF parts each holding first lane only, either constructed via a simplified StepVector from canonicalIV or perhaps have the canonicalIV supply them directly instead of broadcasting its Part 0 value across all parts (which now seems potentially misleading)?
> > > The VPWidenCanonicalIVRecipe WidenNewIV is created only to feed the ICmpULE or ActiveLane, i.e., when vectorizing with fold tail, so can assert(!Range.Start.isScalar && FoldTail) if needed below?
> > 
> > I might be missing something, but I think It is still possible to fold the tail with VF=1 and UF>1, so I left the check (e.g. in `llvm/test/Transforms/LoopVectorize/pr45679-fold-tail-by-masking.ll`)
> > 
> > 
> > > Scalar values are demanded from WidenNewIV only if it feeds ActiveLane, which demands only the first lane, so broadcasting and adding building scalar steps for each lane seems redundant here? Suffice to feed ActiveLane with UF parts each holding first lane only, either constructed via a simplified StepVector from canonicalIV or perhaps have the canonicalIV supply them directly instead of broadcasting its Part 0 value across all parts (which now seems potentially misleading)?
> > 
> > Yes, for ActiveLane we only need the first scalar part of each lane. I think it might be better to have a special version of build-scalar-steps that only generates the first lane as follow-up to the build-scalar-steps patches.
> >> The VPWidenCanonicalIVRecipe WidenNewIV is created only to feed the ICmpULE or ActiveLane, i.e., when vectorizing with fold tail, so can assert(!Range.Start.isScalar && FoldTail) if needed below?
> 
> > I might be missing something, but I think It is still possible to fold the tail with VF=1 and UF>1, so I left the check (e.g. in llvm/test/Transforms/LoopVectorize/pr45679-fold-tail-by-masking.ll)
> 
> Ah, you're right, I forgot foldTail may operate when unrolling only; it should probably stop doing that, but in a separate patch.
> 
> >> Scalar values are demanded from WidenNewIV only if it feeds ActiveLane, which demands only the first lane, so broadcasting and adding building scalar steps for each lane seems redundant here? Suffice to feed ActiveLane with UF parts each holding first lane only, either constructed via a simplified StepVector from canonicalIV or perhaps have the canonicalIV supply them directly instead of broadcasting its Part 0 value across all parts (which now seems potentially misleading)?
> 
> > Yes, for ActiveLane we only need the first scalar part of each lane. I think it might be better to have a special version of build-scalar-steps that only generates the first lane as follow-up to the build-scalar-steps patches.
> 
> Follow-up TODO would be fine.
> 
> Still would be good to try and simplify the logic here that checks if OriginalIV cannot replace NewIV (but instead a StepVector'd CanonicalIV replaces NewIV): FoldTail must be true given that NewIV exists, so can be asserted or not passed-in; which leaves
> 
> Range includes only positive VF's &&
> onlyFirstLaneUsed(WidenOriginalIV) &&
> onlyScalarsUsed(WidenOriginalIV) &&
> NewIV feeds some VPInstruction that isn't ActiveLaneMask
> ? 
The logic here has been simplified with help of D118167




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116123/new/

https://reviews.llvm.org/D116123



More information about the llvm-commits mailing list