[llvm] 26324bc - [VPlan] Move FOR splice cost into VPInstruction::FirstOrderRecurrenceSplice (#129645)

via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 14 00:33:35 PDT 2025


Author: Luke Lau
Date: 2025-03-14T15:33:32+08:00
New Revision: 26324bc1bf397453ce966f56a88245263f7beec5

URL: https://github.com/llvm/llvm-project/commit/26324bc1bf397453ce966f56a88245263f7beec5
DIFF: https://github.com/llvm/llvm-project/commit/26324bc1bf397453ce966f56a88245263f7beec5.diff

LOG: [VPlan] Move FOR splice cost into VPInstruction::FirstOrderRecurrenceSplice (#129645)

After #124093 we now support fixed-order recurrences with EVL tail
folding by replacing VPInstruction::FirstOrderRecurrenceSplice with a VP
splice intrinsic.

However the costing for the splice is currently done in
VPFirstOrderRecurrencePHIRecipe, so when we add the VP splice intrinsic
we end up costing it twice.

This fixes it by splitting out the cost for the splice into
FirstOrderRecurrenceSplice so that it's not duplicated when we replace
it.

We still have to keep the VF=1 checks in VPFirstOrderRecurrencePHIRecipe
since the splice might end up dead and discarded, e.g. in the test
@pr97452_scalable_vf1_for.

Added: 
    

Modified: 
    llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
    llvm/test/Transforms/LoopVectorize/RISCV/vplan-vp-intrinsics-fixed-order-recurrence.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 09cc31b80f139..6e396eda6aac6 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -772,6 +772,16 @@ InstructionCost VPInstruction::computeCost(ElementCount VF,
     return Cost + Ctx.TTI.getVectorInstrCost(Instruction::ExtractElement, VecTy,
                                              Ctx.CostKind);
   }
+  case VPInstruction::FirstOrderRecurrenceSplice: {
+    assert(VF.isVector() && "Scalar FirstOrderRecurrenceSplice?");
+    SmallVector<int> Mask(VF.getKnownMinValue());
+    std::iota(Mask.begin(), Mask.end(), VF.getKnownMinValue() - 1);
+    Type *VectorTy = toVectorTy(Ctx.Types.inferScalarType(this), VF);
+
+    return Ctx.TTI.getShuffleCost(TargetTransformInfo::SK_Splice,
+                                  cast<VectorType>(VectorTy), Mask,
+                                  Ctx.CostKind, VF.getKnownMinValue() - 1);
+  }
   default:
     // TODO: Compute cost other VPInstructions once the legacy cost model has
     // been retired.
@@ -3494,14 +3504,7 @@ VPFirstOrderRecurrencePHIRecipe::computeCost(ElementCount VF,
   if (VF.isScalable() && VF.getKnownMinValue() == 1)
     return InstructionCost::getInvalid();
 
-  SmallVector<int> Mask(VF.getKnownMinValue());
-  std::iota(Mask.begin(), Mask.end(), VF.getKnownMinValue() - 1);
-  Type *VectorTy =
-      toVectorTy(Ctx.Types.inferScalarType(this->getVPSingleValue()), VF);
-
-  return Ctx.TTI.getShuffleCost(TargetTransformInfo::SK_Splice,
-                                cast<VectorType>(VectorTy), Mask, Ctx.CostKind,
-                                VF.getKnownMinValue() - 1);
+  return 0;
 }
 
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)

diff  --git a/llvm/test/Transforms/LoopVectorize/RISCV/vplan-vp-intrinsics-fixed-order-recurrence.ll b/llvm/test/Transforms/LoopVectorize/RISCV/vplan-vp-intrinsics-fixed-order-recurrence.ll
index 3fed9e4956107..e4c8586bded73 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/vplan-vp-intrinsics-fixed-order-recurrence.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/vplan-vp-intrinsics-fixed-order-recurrence.ll
@@ -51,7 +51,8 @@ define void @first_order_recurrence(ptr noalias %A, ptr noalias %B, i64 %TC) {
 ; IF-EVL-NEXT:   EMIT vp<[[RESUME_EXTRACT:%.+]]> = extract-from-end ir<[[LD]]>, ir<1>
 ; IF-EVL-NEXT:   EMIT branch-on-cond ir<true>
 ; IF-EVL-NEXT: Successor(s): ir-bb<for.end>, scalar.ph
-
+; IF-EVL: Cost of 0 for VF vscale x 4: FIRST-ORDER-RECURRENCE-PHI ir<[[FOR_PHI]]> = phi ir<33>, ir<[[LD]]>
+; IF-EVL: Cost of 4 for VF vscale x 4: WIDEN-INTRINSIC vp<[[SPLICE]]> = call llvm.experimental.vp.splice(ir<[[FOR_PHI]]>, ir<[[LD]]>, ir<-1>, ir<true>, vp<[[PREV_EVL]]>, vp<[[EVL]]>)
 entry:
   br label %for.body
 


        


More information about the llvm-commits mailing list