[PATCH] D155845: [VPlan] Fix in-loop reduction chains using VPlan def-use chains (NFCI)
Ayal Zaks via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Jul 22 15:46:20 PDT 2023
Ayal added a comment.
Thanks for following-up on this nice cleanup!
Looking over the code raises various initial thoughts, which could be pursued in several steps/patches.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1611
bool isInLoopReduction(PHINode *Phi) const {
- return InLoopReductionChains.count(Phi);
+ return InLoopReductions.count(Phi);
}
----------------
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1775-1777
/// PHINodes of the reductions that should be expanded in-loop along with
/// their associated chains of reduction operations, in program order from top
/// (PHI) to bottom
----------------
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1782-1783
/// FIXME: This can be removed once reductions can be costed correctly in
/// vplan. This was added to allow quick lookup to the inloop operations,
/// without having to loop through InLoopReductionChains.
DenseMap<Instruction *, Instruction *> InLoopReductionImmediateChains;
----------------
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9152
ElementCount MinVF) {
- for (const auto &Reduction : CM.getInLoopReductionChains()) {
- PHINode *Phi = Reduction.first;
- const RecurrenceDescriptor &RdxDesc =
- Legal->getReductionVars().find(Phi)->second;
- const SmallVector<Instruction *, 4> &ReductionOperations = Reduction.second;
+ SmallVector<VPRecipeBase *> Phis;
+ for (VPRecipeBase &R :
----------------
?
Can also filter out non-isOrdered phi's when VF=1, to then iterate only over phi's of interest.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9156
+ Phis.push_back(&R);
+
+ for (VPRecipeBase *R : Phis) {
----------------
Bypass this `for` loop if MinVF.isScalar()?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9166-9167
+ SmallVector<VPValue *, 8> Worklist;
+ SmallPtrSet<VPValue *, 8> Visited;
+ Worklist.push_back(PhiR);
----------------
Would one SetVector work?
This represents the `Chain` - should it hold Recipes as `Links`, each residing inside the loop and having a single user (except last with live-out), rather than VPValues?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9170
+ Visited.insert(PhiR);
+
+ for (unsigned I = 0; I != Worklist.size(); ++I) {
----------------
Worth a comment explaining what's going to happen next.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9183-9186
// ReductionOperations are orders top-down from the phi's use to the
// LoopExitValue. We keep a track of the previous item (the Chain) to tell
// which of the two operands will remain scalar and which will be reduced.
// For minmax the chain will be the select instructions.
----------------
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9187
// For minmax the chain will be the select instructions.
- Instruction *Chain = Phi;
- for (Instruction *R : ReductionOperations) {
- VPRecipeBase *WidenRecipe = RecipeBuilder.getRecipe(R);
+ VPRecipeBase *Chain = PhiR;
+ for (unsigned I = 1; I != Worklist.size(); ++I) {
----------------
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9188
+ VPRecipeBase *Chain = PhiR;
+ for (unsigned I = 1; I != Worklist.size(); ++I) {
+ VPValue *ChainOp = Chain->getVPSingleValue();
----------------
Could this `for` loop be replaced by a simpler `for (auto Link : Chain) {` loop?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9189
+ for (unsigned I = 1; I != Worklist.size(); ++I) {
+ VPValue *ChainOp = Chain->getVPSingleValue();
+ VPRecipeBase *WidenRecipe = Worklist[I]->getDefiningRecipe();
----------------
Above initialization of Worklist tolerates recipes with multiple defined values, but here only single valued recipes are expected?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9189
+ for (unsigned I = 1; I != Worklist.size(); ++I) {
+ VPValue *ChainOp = Chain->getVPSingleValue();
+ VPRecipeBase *WidenRecipe = Worklist[I]->getDefiningRecipe();
----------------
Ayal wrote:
> Above initialization of Worklist tolerates recipes with multiple defined values, but here only single valued recipes are expected?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9190
+ VPValue *ChainOp = Chain->getVPSingleValue();
+ VPRecipeBase *WidenRecipe = Worklist[I]->getDefiningRecipe();
+
----------------
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9192
+
RecurKind Kind = RdxDesc.getRecurrenceKind();
+ Instruction *R = WidenRecipe->getUnderlyingInstr();
----------------
Hoist `Kind` and its assert for being a select-compare kind next to setting `RdxDesc` above?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9193
RecurKind Kind = RdxDesc.getRecurrenceKind();
+ Instruction *R = WidenRecipe->getUnderlyingInstr();
----------------
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9195
- VPValue *ChainOp = Plan->getVPValue(Chain);
unsigned FirstOpId;
assert(!RecurrenceDescriptor::isSelectCmpRecurrenceKind(Kind) &&
----------------
`IndexOfFirstOperand`?
Initialize? Comment what this variable holds? (Admittedly missing from current code base.)
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9197
assert(!RecurrenceDescriptor::isSelectCmpRecurrenceKind(Kind) &&
"Only min/max recurrences allowed for inloop reductions");
// Recognize a call to the llvm.fmuladd intrinsic.
----------------
Comment deserves update - FMulAdd is also supported.
Potentially renamed isAnyOfRecurrenceKind() in D155786.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9205-9206
+ continue;
assert(isa<VPWidenSelectRecipe>(WidenRecipe) &&
"Expected to replace a VPWidenSelectSC");
FirstOpId = 1;
----------------
Dead assert.
Worth a comment why we `continue` here - to skip over the compare of the select?
What about Fminimum/Fmaximum reductions which are isMinMaxRecurrenceKind but have a call to an intrinsic as their WidenRecipe, rather than a compare-select?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9211
(IsFMulAdd && isa<VPWidenCallRecipe>(WidenRecipe))) &&
"Expected to replace a VPWidenSC");
FirstOpId = 0;
----------------
comment appears out of date.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9216
+ ? FirstOpId + 1
+ : FirstOpId;
+ VPValue *VecOp = WidenRecipe->getOperand(VecOpId);
----------------
It's somewhat alarming that we're indifferent to which operand is VecOp, in non commutative cases such as compare-select. The semantics of the compare/select are presumably captured by the min/max kind, rather than the condition itself (or its negation). Worth a comment, and ensuring test coverage of having VecOp in both positions.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9219
+ BasicBlock *BB = WidenRecipe->getUnderlyingInstr()->getParent();
VPValue *CondOp = nullptr;
----------------
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9219
+ BasicBlock *BB = WidenRecipe->getUnderlyingInstr()->getParent();
VPValue *CondOp = nullptr;
----------------
Ayal wrote:
>
Or rather `BasicBlock *CurrentLinkBB = CurrentLinkI->getParent();`
Can also define and reuse: `VPBasicBlock *CurrentLinkVPBB = CurrentLink->getParent();`
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9225
WidenRecipe->getIterator());
CondOp = RecipeBuilder.createBlockInMask(R->getParent(), *Plan);
}
----------------
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9228
+ Chain = WidenRecipe;
if (IsFMulAdd) {
----------------
Is this setting of Chain needed?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9232
// need to create an fmul recipe to use as the vector operand for the
// fadd reduction.
VPInstruction *FMulRecipe = new VPInstruction(
----------------
Should this breaking of fmuladd into an fadd recurrence fed by an fmul take place separately, as a preparatory step?
Should compare-select pairs be replaced by Min/Max recipes?
In order to convert an (initially widened) reduction recurrence into an in-loop one, it seems that isolating the reduction operation into a single, dedicated operation would be useful. It could also help canonicalize where to find the reduced value operands, aka vecOps.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9241
VPReductionRecipe *RedRecipe =
- new VPReductionRecipe(&RdxDesc, R, ChainOp, VecOp, CondOp, &TTI);
- WidenRecipe->getVPSingleValue()->replaceAllUsesWith(RedRecipe);
- Plan->removeVPValueFor(R);
- Plan->addVPValue(R, RedRecipe);
+ new VPReductionRecipe(&RdxDesc, WidenRecipe->getUnderlyingInstr(),
+ ChainOp, VecOp, CondOp, &TTI);
----------------
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9251
WidenRecipe->eraseFromParent();
+ Chain = RedRecipe;
----------------
Can this setting of Chain continue to be placed at the end of the loop below?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9262
+ if (Chain == PhiR)
+ break;
}
----------------
Why/Is this needed?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D155845/new/
https://reviews.llvm.org/D155845
More information about the llvm-commits
mailing list