[PATCH] D158779: [VPlan] Add active-lane-mask as VPlan-to-VPlan transformation.

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 19 12:08:16 PDT 2023


fhahn added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8951
+  bool HasNUW = true;
+  addCanonicalIVRecipes(*Plan, Legal->getWidestInductionType(), true,
+                        DebugLoc());
----------------
Ayal wrote:
> 
Argh, yes, thanks!


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:869
+  auto *CanonicalIVIncrement =
+      cast<VPInstruction>(CanonicalIVPHI->getBackedgeValue());
+  CanonicalIVIncrement->dropPoisonGeneratingFlags();
----------------
Ayal wrote:
> nit: worth commenting why poison generating flags are dropped from CanonicalIVIncrement.
IIUC we should only need to drop them for `DataAndControlFlowWithoutRuntimeCheck`? Might be good to move the code as follow-up and document then.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:956
+                Plan, DataAndControlFlowWithoutRuntimeCheck)
+          : nullptr;
+
----------------
Ayal wrote:
> Instead of setting to nullptr, create an ALM instruction following canonical IV widened/incrementedForPart placing both before first non-phi of loop header, to make sure the former can serve all potential users of  ICMP_ULE's?
I adjusted the code to first look for the widened IV. If it exists and `!UseActiveLaneMaskForControlFlow`, create ALM up front.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:961
+  // active-lane-mask.
+  VPValue *BTC = Plan.getOrCreateBackedgeTakenCount();
+  vp_for_all_users(BTC, [&](VPUser *U) {
----------------
Ayal wrote:
> Only immediate users of BTC are sought, suffice to traverse them via 
> `  for (VPUser &U : BTC->users()) {`
> rather than
> `  vp_for_all_users() {` ?
> 
Adjusted the code to first look for the wide canonical IV and then iterate here over just its users. Overall that makes the structure of the code easier to follow IMO.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:968
+        !isa<VPWidenCanonicalIVRecipe>(CompareToReplace->getOperand(0)) ||
+        CompareToReplace->getOperand(1) != BTC) {
+      return;
----------------
Ayal wrote:
> nit: can assert operand 1 is BTC, given that operand 0 is not and the 2-operand compare uses BTC.
added assert, thanks!


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:976
+      LaneMask = new VPInstruction(VPInstruction::ActiveLaneMask,
+                                   {WideCanonicalIV, Plan.getTripCount()},
+                                   nullptr, "active.lane.mask");
----------------
Ayal wrote:
> The first operand of ALM when addVPLaneMaskPhiAndUpdateExitBranch() generates it in
>  - preheader is a `CanonicalIVIncrementForPart(StartV)`
>  - latch is `CanonicalIVIncrementForPart(CanonicalIVPHI/CanonicalIVIncrement)`
> When generating it here in header it is `VPWidenCanonicalIVRecipe(CanonicalIVPHI)`.
> 
> Can ALM use CanonicalIVIncrementForPart(CanonicalIVPHI) here too, for consistency?
> 
> Are VPWidenCanonicalIVRecipe and CanonicalIVIncrementForPart interchangeable, are both needed?
ALM needs scalar arguments, so it uses lane 0. This means here it would be better to use the canonical IV instead of the widened recipe. There are some small codegen changes, so I think that would be better as follow up.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158779/new/

https://reviews.llvm.org/D158779



More information about the llvm-commits mailing list