[PATCH] D116123: [VPlan] Handle IV vector splat as VPInstruction.

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 13 01:44:00 PST 2022


fhahn added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2556
+      if (Step->getType()->isFloatingPointTy()) {
+        StartIdx = Builder.CreateUIToFP(StartIdx, Step->getType());
+        // Floating-point operations inherit FMF via the builder's flags.
----------------
Ayal wrote:
> Set StartIdx to int or fp from its beginning? Indep. of this patch.
Landed the change separately in 7ce48be0fd83 . StartIdx depends on `Part` though, so I left it in the loop for now.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2563
+        EntryPart = Builder.CreateAdd(
+            ScalarIV, Builder.CreateMul(StartIdx, Step), "induction");
+      }
----------------
Ayal wrote:
> Would be good to fold int and fp handlings, indep. of this patch:
> Set and use `MulOp` here as well?
> Use CreateBinOp here as well?
> Name the instruction "induction" above for fp as well?
> OTOH, hanlding Trunc should apply to int only?
Thanks, I applied the folding and added an assert in 7ce48be0fd83.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8442
+        IV = Plan->getOrAddVPValue(Legal->getPrimaryInduction());
+      }
       BlockMask = Builder.createNaryOp(VPInstruction::ActiveLaneMask, {IV});
----------------
Ayal wrote:
> fhahn wrote:
> > Ayal wrote:
> > > fhahn wrote:
> > > > Ayal wrote:
> > > > > fhahn wrote:
> > > > > > fhahn wrote:
> > > > > > > fhahn wrote:
> > > > > > > > Ayal wrote:
> > > > > > > > > No need for {}
> > > > > > > > > 
> > > > > > > > > So a StepVector is needed only to feed ICmpULE, not ActiveLaneMask which depends on first lane only? Seems like an optimization best applied separately?
> > > > > > > > > 
> > > > > > > > > BTC is no longer used by ActiveLaneMask, so setting it should sink below to where ICmpULE needs it (While we're here, irrespective of this patch).
> > > > > > > > > No need for {}
> > > > > > > > 
> > > > > > > > I think if the body spans multiple lines due to a comment, braces should not be omitted
> > > > > > > > https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements
> > > > > > > > 
> > > > > > > > > So a StepVector is needed only to feed ICmpULE, not ActiveLaneMask which depends on first lane only? Seems like an optimization best applied separately?
> > > > > > > > 
> > > > > > > > Yes, let me try to split that off. `ActiveLaneMask` only needs the first scalar step of each `Part`.
> > > > > > > > 
> > > > > > > > > BTC is no longer used by ActiveLaneMask, so setting it should sink below to where ICmpULE needs it (While we're here, irrespective of this patch).
> > > > > > > > 
> > > > > > > > done separately in 2e630eabd329
> > > > > > > >> So a StepVector is needed only to feed ICmpULE, not ActiveLaneMask which depends on first lane only? Seems like an optimization best applied separately?
> > > > > > > >Yes, let me try to split that off. ActiveLaneMask only needs the first scalar step of each Part.
> > > > > > > 
> > > > > > > I just double checked and we may still need to go the route with the widened recipe if there's no primary induction. But there's no test case without primary inductions that use active.lane.mask. I;ll push one and update the patch.
> > > > > > I missed something here and IIUC your comment was regarding creating a step vector in both cases at first, right?
> > > > > > 
> > > > > > The issue with that is that in some cases `ActiveLaneMask` will use the step-vector instead of the scalar steps, so I left it as is for now.
> > > > > > 
> > > > > > > I just double checked and we may still need to go the route with the widened recipe if there's no primary induction. 
> > > > > > 
> > > > > > The current patch still creates the vector phi when necessary and the existing tests cover that already.
> > > > > Yes, my comment refers to the fact that currently both ActiveLaneMask and ICmpULE are treated the same when Legal has a PrimaryInduction() - both use the same IV = Plan->getOrAddVPValue(Legal->getPrimaryInduction());
> > > > > This patch continues to feed ActiveLaneMask with this IV, but feeds ICmpULE with a StepVector.
> > > > > Yes, my comment refers to the fact that currently both ActiveLaneMask and ICmpULE are treated the same when Legal has a PrimaryInduction() - both use the same IV = Plan->getOrAddVPValue(Legal->getPrimaryInduction());
> > > > 
> > > > Ah I see. Yes, currently they both use the same VPValue, because the single recipe creates both the scalar steps and the step vector. But ActiveLaneMask only needs the steps and ICmpULE only needs the step vector. This is explicit now in the code as a result of breaking this part of the phi handling into distinct pieces.
> > > > Ah I see. Yes, currently they both use the same VPValue, because the single recipe creates both the scalar steps and the step vector. But ActiveLaneMask only needs the steps and ICmpULE only needs the step vector. This is explicit now in the code as a result of breaking this part of the phi handling into distinct pieces.
> > > 
> > > ok, that confirms 1st Q above, what about the 2nd: Seems like an optimization best applied separately?
> > I updated the patch to always generate widened induction here, and added a transform to replace it if possible later.
> Better have the transform in a separate patch, to retain exiting behavior here?
Good idea, split off to D117140


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:856
+  }
+
   unsigned getOpcode() const { return Opcode; }
----------------
Ayal wrote:
> fhahn wrote:
> > Ayal wrote:
> > > Where are these new classof's needed?
> > Unfortunately this is needed to `dyn_cast` directly from VPUser -> VPInstruction
> document with said need?
Added documentation and also added it for other recipes where needed.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:336
+  VPWidenIntOrFpInductionRecipe *WidenInduction = nullptr;
+  for (VPRecipeBase &Phi : *HeaderVPBB) {
+    if (auto *W = dyn_cast<VPWidenCanonicalIVRecipe>(&Phi)) {
----------------
Ayal wrote:
> `Phi`: VPWidenIntOrFpInductionRecipe represents a phi (assumed to exist, corresponding to OriginalCanonicalIV) but VPWidenCanonicalIVRecipe represents a non-phi and thus appears after the former. Ordering their handling accordingly is more logical? Break as soon as (if) the latter is found?
> 
> WidenCan >> WidenNewIV?
> WidenInduction >> WidenOriginalIV?
> (both are canonical...)
Updated in the split-off review.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:356
+  if (Range.Start.isScalar())
+    Range.End = Range.Start * 2;
+
----------------
Ayal wrote:
> Is this optimization expected to clamp Range.End for the current VPlan during VPlans-for-ranges construction?
> Is there a need to (only) know if VPlan's range is scalar? Specifically, if Start is Scalar then End can be asserted to be Start*2, if needed? Suffice the check if all recipes provide scalars only, although faster to check if range is scalar?
It's sufficient to check whether the start is scalar. There's no need to clamp the end.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:373
+
+  assert(WidenInduction && "expected recipe for original canonical IV");
+
----------------
Ayal wrote:
> Place next to above early-exit if !WidenCan?
Moved!


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:390
+        return !VPI || VPI->getOpcode() == VPInstruction::ActiveLaneMask;
+      })) {
+    WidenCan->replaceAllUsesWith(WidenInduction);
----------------
Ayal wrote:
> WidenCan is going to be replaced, question is by what; check instead if StepVector is needed (one recipe producing only scalars the other vectors (deserves more review thoughts...)) and if so create it, followed by a single replacing and erasing of WidenCan?
Updated to check if `StepVector` is needed and sink replace & erase code.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:396
+  // Retain WidenCan fora scalar plans.
+  if (Range.Start.isScalar())
+    return;
----------------
Ayal wrote:
> Scalar Start handled above (w/o retaining WidenCan)?
This is not needed in the latest version, removed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116123/new/

https://reviews.llvm.org/D116123



More information about the llvm-commits mailing list