[PATCH] D100113: [LV] Move reduction PHI node fixup to VPlan::execute (NFC).
Ayal Zaks via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon May 3 00:47:51 PDT 2021
Ayal added inline comments.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9242
RecurKind Kind = RdxDesc->getRecurrenceKind();
- bool IsOrdered = useOrderedReductions(*RdxDesc);
+ bool IsOrdered = LoopVectorizationPlanner::useOrderedReductions(*RdxDesc);
Value *NewVecOp = State.get(getVecOp(), Part);
----------------
better turn on an IsOrdered bit inside VPReductionRecipe during its creation or planning, than during its execute? Admittedly goes back to D98435.
There are 3 types of reductions: (1) parallel lanes: each lane computes an independent partial-sum; (2) parallel parts: each vector computes an independent partial sum, in-loop within vectors but not across parts, and (3) serial: in-loop across vectors and parts, the header phi operates on a single scalar. Would be good to explicitly indicate which type of reduction is chosen, in both ReductionRecipe(s) and the header phi; it effects their code-gen.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:69
+bool LoopVectorizationPlanner::useOrderedReductions(
+ RecurrenceDescriptor &RdxDesc) {
----------------
Defining a method of LVP inside VPlan.cpp seems odd, at best.
The use of a gating EnableStrictReductions knob (in contrast to a CM cost-based decision), belongs to Legal rather than CM or Planner? I.e., Legal should bail out from vectorizing a loop having an ordered reduction, when !EnableStrictReductions? Admittedly goes back to D98435.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:765
+ for (unsigned Part = 0; Part < State->UF; ++Part) {
+ auto *VecRdxPhi = cast<PHINode>(State->get(PhiR->getVPValue(0), Part));
+ Value *Val = State->get(PhiR->getOperand(1), Part);
----------------
Moving this to VPlan::execute is great!
Can use getSingleVPValue(), getBackedgeValue().
May be worth explaining that Val is first set per-part supporting parallel partial sums, either vector/per-lane (1) or scalar/per-part (2), and then set to last-part supporting serial reduction (3).
Admittedly goes back to D98435.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D100113/new/
https://reviews.llvm.org/D100113
More information about the llvm-commits
mailing list