[llvm] [LV] Stop using the legacy cost model for udiv + friends (PR #152707)

David Sherwood via llvm-commits llvm-commits at lists.llvm.org
Fri Aug 8 05:52:42 PDT 2025


https://github.com/david-arm created https://github.com/llvm/llvm-project/pull/152707

In VPWidenRecipe::computeCost we unnecessarily fall back on the legacy cost model for the udiv, sdiv, urem and srem instructions. At this point we know that the VPlan must be functionally correct, i.e. if the divide/remainder is not safe to speculatively execute then we must have either:

1. Scalarised the operation, in which case we wouldn't be using a VPWidenRecipe, or
2. Inserted a select on the second operand to ensure we don't fault through divide-by-zero.
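As a rough illustration of case 2, here is a minimal scalar emulation of a widened udiv whose divisor is first passed through a select. The names `Vec`, `Mask`, `selectLanes` and `predicatedUDiv` are hypothetical helpers invented for this sketch and are not LLVM APIs; the fixed width of 4 lanes is also an assumption.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 4-lane "vector" types, used only for this illustration.
constexpr std::size_t kLanes = 4;
using Vec = std::array<std::uint32_t, kLanes>;
using Mask = std::array<bool, kLanes>;

// Lane-wise select: take OnTrue where the mask is set, OnFalse elsewhere.
Vec selectLanes(const Mask &M, const Vec &OnTrue, const Vec &OnFalse) {
  Vec R{};
  for (std::size_t I = 0; I < kLanes; ++I)
    R[I] = M[I] ? OnTrue[I] : OnFalse[I];
  return R;
}

// A widened udiv executes on every lane, including inactive ones, so the
// divisor of each inactive lane is replaced by 1 before dividing.
Vec predicatedUDiv(const Mask &Active, const Vec &Num, const Vec &Den) {
  const Vec Ones = {1, 1, 1, 1};
  const Vec SafeDen = selectLanes(Active, Den, Ones);
  Vec R{};
  for (std::size_t I = 0; I < kLanes; ++I)
    R[I] = Num[I] / SafeDen[I]; // Cannot divide by zero on inactive lanes.
  return R;
}
```

With `Den = {2, 0, 5, 0}` and only lanes 0 and 2 active, the zero divisors sit on inactive lanes and are replaced by 1, so the divide is safe on all four lanes. The results on inactive lanes are meaningless and are expected to be discarded by whatever consumes the mask.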

For 2) it's necessary to add the select operation to VPInstruction::computeCost so that we mirror the cost of the legacy cost model. The only problem with this is that we also generate selects in the VPlan for predicated loops with reductions, which *aren't* accounted for in the legacy cost model. To prevent the cost-mismatch asserts from firing, I've also added the selects to precomputeCosts so that the legacy costs match the VPlan costs for reductions.
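The bookkeeping problem can be sketched with a toy model. Everything here is hypothetical: the `Kind` enum, the `Recipe` struct, the two functions and the cost values are made up for illustration and do not correspond to LLVM's actual data structures or TTI costs.

```cpp
#include <vector>

// Hypothetical recipe kinds, for illustration only.
enum class Kind { WidenUDiv, DivisorSelect, ReductionSelect };

struct Recipe {
  Kind K;
  int Cost; // Made-up cost value.
};

// VPlan-style total: every recipe, including every select, is costed.
int vplanCost(const std::vector<Recipe> &Plan) {
  int Total = 0;
  for (const Recipe &R : Plan)
    Total += R.Cost;
  return Total;
}

// Legacy-style total: the select guarding the divisor is accounted for,
// but selects inserted for reductions are only counted when
// AddReductionSelects is set (the adjustment described above).
int legacyCost(const std::vector<Recipe> &Plan, bool AddReductionSelects) {
  int Total = 0;
  for (const Recipe &R : Plan) {
    switch (R.K) {
    case Kind::WidenUDiv:
    case Kind::DivisorSelect:
      Total += R.Cost;
      break;
    case Kind::ReductionSelect:
      if (AddReductionSelects)
        Total += R.Cost;
      break;
    }
  }
  return Total;
}
```

For a plan containing one divide, one divisor select and one reduction select, the two totals differ unless the reduction select is also added on the legacy side, which is the mismatch the adjustment is meant to avoid.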

From 21ead7a9daadac61c03f791949f9b1307bc8ab6d Mon Sep 17 00:00:00 2001
From: David Sherwood <david.sherwood at arm.com>
Date: Fri, 8 Aug 2025 12:50:28 +0000
Subject: [PATCH] [LV] Stop using the legacy cost model for udiv + friends

In VPWidenRecipe::computeCost we unnecessarily fall back on the
legacy cost model for the udiv, sdiv, urem and srem instructions.
At this point we know that the VPlan must be functionally correct,
i.e. if the divide/remainder is not safe to speculatively execute
then we must have either:

1. Scalarised the operation, in which case we wouldn't be using
a VPWidenRecipe, or
2. Inserted a select on the second operand to ensure we don't
fault through divide-by-zero.

For 2) it's necessary to add the select operation to
VPInstruction::computeCost so that we mirror the cost of the
legacy cost model. The only problem with this is that we also
generate selects in the VPlan for predicated loops with
reductions, which *aren't* accounted for in the legacy cost
model. To prevent the cost-mismatch asserts from firing, I've
also added the selects to precomputeCosts so that the legacy
costs match the VPlan costs for reductions.
---
 llvm/lib/Transforms/Vectorize/LoopVectorize.cpp |  3 +++
 llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp  | 15 +++++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index be00fd6a416e5..e79ae67b09a12 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -4337,6 +4337,9 @@ VectorizationFactor LoopVectorizationPlanner::selectVectorizationFactor() {
           if (!VPI)
             continue;
           switch (VPI->getOpcode()) {
+          // Selects are not modelled in the legacy cost model if they are
+          // inserted for reductions.
+          case Instruction::Select:
           case VPInstruction::ActiveLaneMask:
           case VPInstruction::ExplicitVectorLength:
             C += VPI->cost(VF, CostCtx);
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index e971ba1aac15c..d0406214a6364 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -952,6 +952,15 @@ InstructionCost VPInstruction::computeCost(ElementCount VF,
   }
 
   switch (getOpcode()) {
+  case Instruction::Select: {
+    // TODO: It may be possible to improve this by analyzing where the
+    // condition operand comes from.
+    CmpInst::Predicate Pred = CmpInst::BAD_ICMP_PREDICATE;
+    auto *CondTy = toVectorTy(Ctx.Types.inferScalarType(getOperand(0)), VF);
+    auto *VecTy = toVectorTy(Ctx.Types.inferScalarType(getOperand(1)), VF);
+    return Ctx.TTI.getCmpSelInstrCost(Instruction::Select, VecTy, CondTy, Pred,
+                                      Ctx.CostKind);
+  }
   case Instruction::ExtractElement:
   case VPInstruction::ExtractLane: {
     // Add on the cost of extracting the element.
@@ -2007,8 +2016,10 @@ InstructionCost VPWidenRecipe::computeCost(ElementCount VF,
   case Instruction::SDiv:
   case Instruction::SRem:
   case Instruction::URem:
-    // More complex computation, let the legacy cost-model handle this for now.
-    return Ctx.getLegacyCost(cast<Instruction>(getUnderlyingValue()), VF);
+    // If the div/rem operation isn't safe to speculate and requires
+    // predication, then the only way we can even create a vplan is to insert
+    // a select on the second input operand to ensure we use the value of 1
+    // for the inactive lanes. The select will be costed separately.
   case Instruction::Add:
   case Instruction::FAdd:
   case Instruction::Sub:


