[llvm] [LV] Vectorize FMax w/o fast-math flags. (PR #146711)
Florian Hahn via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 7 02:39:03 PDT 2025
================
@@ -589,3 +589,114 @@ void VPlanTransforms::createLoopRegions(VPlan &Plan) {
   TopRegion->setName("vector loop");
   TopRegion->getEntryBasicBlock()->setName("vector.body");
 }
+
+bool VPlanTransforms::handleFMaxReductionsWithoutFastMath(VPlan &Plan) {
+  VPRegionBlock *LoopRegion = Plan.getVectorLoopRegion();
+  VPReductionPHIRecipe *RedPhiR = nullptr;
+  VPRecipeWithIRFlags *MinMaxOp = nullptr;
+  VPWidenIntOrFpInductionRecipe *WideIV = nullptr;
+
+  // Check if there are any FMaxNoFMFs reductions using wide selects that we can
+  // fix up. To do so, we also need a wide canonical IV to keep track of the
+  // indices of the max values.
+  for (auto &R : LoopRegion->getEntryBasicBlock()->phis()) {
+    // We need a wide canonical IV
+    if (auto *CurIV = dyn_cast<VPWidenIntOrFpInductionRecipe>(&R)) {
+      if (!CurIV->isCanonical())
+        continue;
+      WideIV = CurIV;
+      continue;
+    }
+
+    // And a single FMaxNoFMFs reduction phi.
+    // TODO: Support FMin reductions as well.
+    auto *CurRedPhiR = dyn_cast<VPReductionPHIRecipe>(&R);
+    if (!CurRedPhiR)
+      continue;
+    if (RedPhiR)
+      return false;
+    if (CurRedPhiR->getRecurrenceKind() != RecurKind::FMaxNoFMFs ||
+        CurRedPhiR->isInLoop() || CurRedPhiR->isOrdered())
+      continue;
+    RedPhiR = CurRedPhiR;
+
+    // MaxOp feeding the reduction phi must be a select (either wide or a
+    // replicate recipe), where the phi is the last operand, and the compare
+    // predicate is strict. This ensures NaNs won't get propagated unless the
+    // initial value is NaN
+    VPRecipeBase *Inc = RedPhiR->getBackedgeValue()->getDefiningRecipe();
+    auto *RepR = dyn_cast<VPReplicateRecipe>(Inc);
+    if (!isa<VPWidenSelectRecipe>(Inc) &&
+        !(RepR && (isa<SelectInst>(RepR->getUnderlyingInstr()))))
+      return false;
+
+    MinMaxOp = cast<VPRecipeWithIRFlags>(Inc);
+    auto *Cmp = cast<VPRecipeWithIRFlags>(MinMaxOp->getOperand(0));
+    if (MinMaxOp->getOperand(1) == RedPhiR ||
+        !CmpInst::isStrictPredicate(Cmp->getPredicate()))
+      return false;
+  }
+
+  // Nothing to do.
+  if (!RedPhiR)
+    return true;
+
+  // A wide canonical IV is currently required.
+  // TODO: Create an induction if no suitable existing one is available.
+  if (!WideIV)
+    return false;
----------------
fhahn wrote:
We need to check whether there is a wide canonical induction that is completely unconnected to the recurrence, so we would need to check all phi nodes. There are also other restrictions on the reduction that must be checked to ensure the transform is legal. Even if we detected the full pattern early, we would still need to find the `MaxOp` and a suitable induction, so it would not save much work here.
Checking them directly when performing the transformation seems preferable and less fragile than checking in one place and applying the transform much later. It is also in line with the general VPlan direction and initial design goal: reducing tight coupling between components by combining the checks required for a transform directly in separate, decoupled VPlan transforms.
https://github.com/llvm/llvm-project/pull/146711