[PATCH] D158779: [VPlan] Add active-lane-mask as VPlan-to-VPlan transformation.
Florian Hahn via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 8 10:00:01 PDT 2023
fhahn added inline comments.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8684
Instruction *DLInst =
getDebugLocFromInstOrOperands(Legal->getPrimaryInduction());
addCanonicalIVRecipes(*Plan, Legal->getWidestInductionType(),
----------------
Ayal wrote:
> nit: may be clearer to set and pass `bool HasNUW = CM.getTailFoldingStyle() == TailFoldingStyle::None;`, and possibly `auto DL = DLInst ? DLInst->getDebugLoc() : DebugLoc());`
Simplified, separately updated `getDebugLocFromInstOrOperands` to return the DebugLoc directly in 08de6508ab6a
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8856
+ auto Style = CM.getTailFoldingStyle(IVUpdateMayOverflow);
+ if (useActiveLaneMask(Style))
----------------
Ayal wrote:
> nit: may be slightly clearer to set and pass
>
> ```
> bool ForControlFlow = useActiveLaneMaskForControlFlow(Style);
> bool WithoutRuntimeCheck = Style == TailFoldingStyle::DataAndControlFlowWithoutRuntimeCheck;
> ```
Adjusted, thanks!
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:814
+
+// Add a VPActiveLaneMaskPHIRecipe to \p Plan.
+static VPActiveLaneMaskPHIRecipe *
----------------
Ayal wrote:
> This adds much more than a VPActiveLaneMaskPHIRecipe. One part adds recipes in preheader and this phi, returning it for further/any use. Another part replaces the loop branch with several recipes including an ActiveLaneMask. The two parts are apparently independent of each other.
Clarified the name & comment. The branch part only adds a Not and Branch recipe, so not sure if it is worth splitting this off?
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:823
+ // Create the active lane mask instruction in the vplan preheader.
+ VPBasicBlock *VecPreheader =
+ cast<VPBasicBlock>(Plan.getVectorLoopRegion()->getSinglePredecessor());
----------------
ABataev wrote:
> `auto *`
Adjusted, thanks!
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:853
+ // induction variable increment by VF, we can increment the value before
+ // the get.active.lane mask and use the unmodified tripcount.
+ IncrementValue = CanonicalIVPHI->getOperand(1);
----------------
Ayal wrote:
> nit: this seems like the simpler case so should probably go first.
Reordered the blocks, thanks!
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:854
+ // the get.active.lane mask and use the unmodified tripcount.
+ IncrementValue = CanonicalIVPHI->getOperand(1);
+ TripCount = TC;
----------------
Ayal wrote:
> nit: aka CanonicalIVIncrement?
Yep, adjusted, thanks!
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:858
+
+ EB->getTerminator()->eraseFromParent();
+ auto *EntryALM = new VPInstruction(VPInstruction::ActiveLaneMask,
----------------
Ayal wrote:
> Better to erase the (BranchOnCount?) terminator next to inserting the new one which replaces it below - a BranchOnCond which effectively turns the loop into an uncountable one.
Updated to use VPBuilder to construction (and insert) recipes and moved removal down.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:870
+ // Create the active lane mask for the next iteration of the loop.
+ CanonicalIVIncrementParts =
+ new VPInstruction(VPInstruction::CanonicalIVIncrementForPart,
----------------
Ayal wrote:
> Better to use separate variables than redefine same `CanonicalIVIncrementParts` twice. As in ALM and EntryALM. Especially for something canonical...
Split up, thanks!
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:886
+
+ VPInstruction *BranchBack =
+ new VPInstruction(VPInstruction::BranchOnCond, {NotMask}, DL);
----------------
ABataev wrote:
> auto *
Now that this is using VPBuilder, there's no. need to assign the result, thanks!
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:892-893
+
+// Helper to iterate over all users of \p Start recursively and call \p Fn the
+// users. Iteration stops once \p Fn returns true.
+static void vp_for_all_users(VPUser *Start, function_ref<bool(VPUser *U)> Fn) {
----------------
Ayal wrote:
>
Adjusted, thanks!
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:894
+// users. Iteration stops once \p Fn returns true.
+static void vp_for_all_users(VPUser *Start, function_ref<bool(VPUser *U)> Fn) {
+ SmallSetVector<VPUser *, 4> Worklist;
----------------
Ayal wrote:
> Would `vp_find_if_user()` be more accurate, as in std::find_if(), returning the found VPUser* or null?
Bool return value was intended to let the callback stop iteration, but it is not needed in the current version, removed the bool return value.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:903
+ }
+ if (Fn(U))
+ return;
----------------
Ayal wrote:
> Ayal wrote:
> > ABataev wrote:
> > > Looks like boolean result of Fn is not used, you can make it simply return void.
> > nit: check if (Fn(U)) before inserting U's users to Worklist.
> Agreed, Fn always returns false below, resulting in "for_all". Suggest to use "find_if" instead, in this case.
Boolean removed, thanks!
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:903-904
+ }
+ if (Fn(U))
+ return;
+ }
----------------
fhahn wrote:
> Ayal wrote:
> > Ayal wrote:
> > > ABataev wrote:
> > > > Looks like boolean result of Fn is not used, you can make it simply return void.
> > > nit: check if (Fn(U)) before inserting U's users to Worklist.
> > Agreed, Fn always returns false below, resulting in "for_all". Suggest to use "find_if" instead, in this case.
> Boolean removed, thanks!
Updated to use `void`, thanks!
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:913
+ UseActiveLaneMaskForControlFlow
+ ? addVPLaneMaskPhi(Plan, DataAndControlFlowWithoutRuntimeCheck)
+ : nullptr;
----------------
Ayal wrote:
> This is a major part of the transform. If replacing the loop branch can indeed be separated from generating LaneMaskPhi, perhaps it may be clearer to have here
>
> ```
> if (UseActiveLaneMaskForControlFlow)
> useActiveLaneMaskForLoopBranch(Plan, DataAndControlFlowWithoutRuntimeCheck);
>
> ```
> and later have
>
>
> ```
> if (UseActiveLaneMaskForControlFlow) {
> auto *LaneMaskPhi = useActiveLaneMaskForPhi(Plan);
> VPI->replaceAllUsesWith(LaneMaskPhi);
> }
> else { ...
> ```
I am not sure, creating the phi and adjusting the branch is tied together at the moment, as in whenever we introduce a active-lane-mask-phi we will also replace the branch.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:917
+ // Walk users of the canonical induction and replace all compares of the form
+ // (ICMP_ULE, wide canonical IV, backedge-taken-count) with an
+ // active-lane-mask.
----------------
Ayal wrote:
> How many instances of the same "WidenIV <= BTC" are we expecting?
> Why replace each with a separate copy of ActiveLaneMask(IV, TC)?
> Suffice to find_if one exists and process it? Can then verify find_if again does not find another.
Updated to re-use created lane mask. At the moment there should be only one, but not sure if it is worth relying on that here.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:928
+ if (LaneMaskPhi) {
+ VPI->replaceAllUsesWith(LaneMaskPhi);
+ } else {
----------------
Ayal wrote:
> nit: erase from parent here too, or leave both to DCE? Erasing a user during traversal?
Restructured the code to replace & erase once for both cases.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158779/new/
https://reviews.llvm.org/D158779
More information about the llvm-commits
mailing list