[PATCH] D116123: [VPlan] Handle IV vector splat using VPWidenCanonicalIV.
Florian Hahn via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 27 01:54:55 PST 2022
fhahn added inline comments.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2566
+ // Create a new independent vector induction variable, if one is needed.
+ if (!Def->needsScalarIVOnly())
createVectorIntOrFpInductionPHI(ID, Step, Start, EntryVal, Def, State);
----------------
Ayal wrote:
> OK, this is synonymous with NeedsVectorIV(), capturing it more accurately.
>
> Follows the above comment:
> >Perhaps VPWidenIntOrFpInductionRecipe should know of NeedsScalarIV and NeedsVectorIV (names to be improved), pass them to ILV->widenIntOrFpInduction() and to others - see removeRedundantStepVector below?
Updated D118167 to use `needsVectorIV`
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:356
+ if (Range.Start.isScalar())
+ Range.End = Range.Start * 2;
+
----------------
Ayal wrote:
> fhahn wrote:
> > Ayal wrote:
> > > Is this optimization expected to clamp Range.End for the current VPlan during VPlans-for-ranges construction?
> > > Is there a need to (only) know if VPlan's range is scalar? Specifically, if Start is Scalar then End can be asserted to be Start*2, if needed? Suffice the check if all recipes provide scalars only, although faster to check if range is scalar?
> > It's sufficient to check whether the start is scalar. There's no need to clamp the end.
> Does the following summarize the logic here: Original and New IV's can each provide either scalar only, vector only, or both. Original can replace New iff it provides whatever New needs to provide.
>
> Have VPWidenCanonicalIVRecipe also support needsScalarIVOnly() - by checking if all it's users are ActiveLaneMask VPInstructions, and needsScalarIV() - by checking if any user is ActiveLaneMask?
>
> Then check
>
> ```
> if ((WidenNewIV->needsScalarIV() && WidenOriginalIV->needsScalarIV()) ||
> (!WidenNewIV->needsScalarIVOnly() && !WidenOriginalIV->needsScalarIVOnly())) {
> WidenNewIV->replaceAllUsesWith(WidenOriginalIV);
> WidenNewIV->eraseFromParent();
> }
> ```
>
> ?
> Does the following summarize the logic here: Original and New IV's can each provide either scalar only, vector only, or both. Original can replace New iff it provides whatever New needs to provide.
Yes, that's a better summary! I updated the comment.
I am not sure about adding `needsScalarIVOnly`/`needsScalarIV` to `VPWidenCanonicalIVRecipe`, as this distinction seems only relevant here and might muddy the waters a bit, as the recipe will always generate a vector value (and I don't think we should change that). I left the check here as is for now, but once D116554 lands this check can be replaced with `onlyFirstLaneUsed(WidenNewIV)`.
I think to wrap things up, it is worth to keep the current order of patches and have the temporary check of `WidenNewIV`'s users here, to simplify the patch ordering.
================
Comment at: llvm/test/Transforms/LoopVectorize/first-order-recurrence-sink-replicate-region.ll:175
; CHECK-NEXT: WIDEN-REDUCTION-PHI ir<%and.red> = phi ir<1234>, ir<%and.red.next>
-; CHECK-NEXT: EMIT vp<[[MASK:%.+]]> = icmp ule ir<%iv> vp<[[BTC]]>
+; CHECK-NEXT: EMIT vp<[[WIDEN_CAN:%.+]]> = WIDEN-CANONICAL-INDUCTION vp<%2>
+; CHECK-NEXT: EMIT vp<[[MASK:%.+]]> = icmp ule vp<[[WIDEN_CAN]]> vp<[[BTC]]>
----------------
Ayal wrote:
> Should vp<%2> be vp<[[CAN_IV]]> ?
Yep, updated, thanks!
================
Comment at: llvm/test/Transforms/LoopVectorize/pr45679-fold-tail-by-masking.ll:153
; VF1UF4: pred.store.if:
-; VF1UF4-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i32 [[INDUCTION]]
+; VF1UF4-NEXT: [[SUNK_IND0:%.*]] = add i32 [[INDEX]], 0
+; VF1UF4-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i32 [[SUNK_IND0]]
----------------
Ayal wrote:
> Hmm, this enables sink scalar operands to handle scalar steps.
I think this is caused by the drop of the check for the scalar VF; we add the VPWidenCanonicalInduction, but the cost model knows that the induction recipe will be scalar only, so we cannot replace the VPWidenCanonicalInduction with the other induction recipe and we have some extra steps. Given that the VF=1,UF>1 is somewhat of an edge case, it seems cleaner to not check for the scalar VF, as suggested earlier.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D116123/new/
https://reviews.llvm.org/D116123
More information about the llvm-commits
mailing list