[PATCH] D158779: [VPlan] Add active-lane-mask as VPlan-to-VPlan transformation.
Ayal Zaks via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 30 15:38:46 PDT 2023
Ayal added a comment.
Trying to simplify and clarify the logic as it gets refactored.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8684
Instruction *DLInst =
getDebugLocFromInstOrOperands(Legal->getPrimaryInduction());
addCanonicalIVRecipes(*Plan, Legal->getWidestInductionType(),
----------------
nit: may be clearer to set and pass `bool HasNUW = CM.getTailFoldingStyle() == TailFoldingStyle::None;`, and possibly `auto DL = DLInst ? DLInst->getDebugLoc() : DebugLoc());`
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8856
+ auto Style = CM.getTailFoldingStyle(IVUpdateMayOverflow);
+ if (useActiveLaneMask(Style))
----------------
nit: may be slightly clearer to set and pass
```
bool ForControlFlow = useActiveLaneMaskForControlFlow(Style);
bool WithoutRuntimeCheck = Style == TailFoldingStyle::DataAndControlFlowWithoutRuntimeCheck;
```
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:814
+
+// Add a VPActiveLaneMaskPHIRecipe to \p Plan.
+static VPActiveLaneMaskPHIRecipe *
----------------
This adds much more than a VPActiveLaneMaskPHIRecipe. One part adds recipes in preheader and this phi, returning it for further/any use. Another part replaces the loop branch with several recipes including an ActiveLaneMask. The two parts are apparently independent of each other.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:853
+ // induction variable increment by VF, we can increment the value before
+ // the get.active.lane mask and use the unmodified tripcount.
+ IncrementValue = CanonicalIVPHI->getOperand(1);
----------------
nit: this seems like the simpler case so should probably go first.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:854
+ // the get.active.lane mask and use the unmodified tripcount.
+ IncrementValue = CanonicalIVPHI->getOperand(1);
+ TripCount = TC;
----------------
nit: aka CanonicalIVIncrement?
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:858
+
+ EB->getTerminator()->eraseFromParent();
+ auto *EntryALM = new VPInstruction(VPInstruction::ActiveLaneMask,
----------------
Better to erase the (BranchOnCount?) terminator next to inserting the new one which replaces it below - a BranchOnCond which effectively turns the loop into an uncountable one.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:870
+ // Create the active lane mask for the next iteration of the loop.
+ CanonicalIVIncrementParts =
+ new VPInstruction(VPInstruction::CanonicalIVIncrementForPart,
----------------
Better to use separate variables than redefine same `CanonicalIVIncrementParts` twice. As in ALM and EntryALM. Especially for something canonical...
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:892-893
+
+// Helper to iterate over all users of \p Start recursively and call \p Fn the
+// users. Iteration stops once \p Fn returns true.
+static void vp_for_all_users(VPUser *Start, function_ref<bool(VPUser *U)> Fn) {
----------------
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:894
+// users. Iteration stops once \p Fn returns true.
+static void vp_for_all_users(VPUser *Start, function_ref<bool(VPUser *U)> Fn) {
+ SmallSetVector<VPUser *, 4> Worklist;
----------------
Would `vp_find_if_user()` be more accurate, as in std::find_if(), returning the found VPUser* or null?
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:903
+ }
+ if (Fn(U))
+ return;
----------------
ABataev wrote:
> Looks like boolean result of Fn is not used, you can make it simply return void.
nit: check if (Fn(U)) before inserting U's users to Worklist.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:903-904
+ }
+ if (Fn(U))
+ return;
+ }
----------------
Ayal wrote:
> ABataev wrote:
> > Looks like boolean result of Fn is not used, you can make it simply return void.
> nit: check if (Fn(U)) before inserting U's users to Worklist.
Agreed, Fn always returns false below, resulting in "for_all". Suggest to use "find_if" instead, in this case.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:913
+ UseActiveLaneMaskForControlFlow
+ ? addVPLaneMaskPhi(Plan, DataAndControlFlowWithoutRuntimeCheck)
+ : nullptr;
----------------
This is a major part of the transform. If replacing the loop branch can indeed be separated from generating LaneMaskPhi, perhaps it may be clearer to have here
```
if (UseActiveLaneMaskForControlFlow)
useActiveLaneMaskForLoopBranch(Plan, DataAndControlFlowWithoutRuntimeCheck);
```
and later have
```
if (UseActiveLaneMaskForControlFlow) {
auto *LaneMaskPhi = useActiveLaneMaskForPhi(Plan);
VPI->replaceAllUsesWith(LaneMaskPhi);
}
else { ...
```
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:917
+ // Walk users of the canonical induction and replace all compares of the form
+ // (ICMP_ULE, wide canonical IV, backedge-taken-count) with an
+ // active-lane-mask.
----------------
How many instances of the same "WidenIV <= BTC" are we expecting?
Why replace each with a separate copy of ActiveLaneMask(IV, TC)?
Suffice to find_if one exists and process it? Can then verify find_if again does not find another.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:928
+ if (LaneMaskPhi) {
+ VPI->replaceAllUsesWith(LaneMaskPhi);
+ } else {
----------------
nit: erase from parent here too, or leave both to DCE? Erasing a user during traversal?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158779/new/
https://reviews.llvm.org/D158779
More information about the llvm-commits
mailing list