[llvm] [LoopVectorize] Add cost of generating tail-folding mask to the loop (PR #130565)
Florian Hahn via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 19 05:09:39 PDT 2025
================
@@ -5610,6 +5610,31 @@ InstructionCost LoopVectorizationCostModel::expectedCost(ElementCount VF) {
     Cost += BlockCost;
   }
+#ifndef NDEBUG
+  // TODO: We're effectively having to duplicate the code from
+  // VPInstruction::computeCost, which is ugly. This isn't meant to be a fully
+  // accurate representation of the cost of tail-folding - it exists purely to
+  // stop asserts firing when the legacy cost doesn't match the VPlan cost.
+  if (!VF.isScalar() && foldTailByMasking()) {
+    TailFoldingStyle Style = getTailFoldingStyle();
+    LLVMContext &Context = TheLoop->getHeader()->getContext();
+    Type *I1Ty = IntegerType::getInt1Ty(Context);
+    Type *IndTy = Legal->getWidestInductionType();
+    TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput;
+    if (Style == TailFoldingStyle::DataWithEVL) {
+      Type *I32Ty = IntegerType::getInt32Ty(Context);
+      IntrinsicCostAttributes Attrs(Intrinsic::experimental_get_vector_length,
+                                    I32Ty, {IndTy, I32Ty, I1Ty});
+      Cost += TTI.getIntrinsicInstrCost(Attrs, CostKind);
+    } else if (useActiveLaneMask(Style)) {
+      VectorType *RetTy = VectorType::get(I1Ty, VF);
+      IntrinsicCostAttributes Attrs(Intrinsic::get_active_lane_mask, RetTy,
+                                    {IndTy, IndTy});
+      Cost += TTI.getIntrinsicInstrCost(Attrs, CostKind);
----------------
fhahn wrote:
I am wondering if it would be cleaner to handle this in the caller of `expectedCost`, where we have the VPlans available. There we could just iterate over all recipes in the loop region, compute the costs for ActiveLaneMask/EVL using the VPlan-based cost model, and add them to the cost returned by `expectedCost`.
This might be more scalable for future use-cases.
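
A rough, self-contained sketch of the shape this could take. It is illustrative only: `Recipe`, `RecipeKind`, `Block`, `LoopRegion`, `tailFoldingMaskCost` and `adjustedLegacyCost` are hypothetical stand-ins, not the actual VPlan classes or any existing LLVM API, and the per-recipe cost is assumed to have already been computed by the plan-based model for the chosen VF.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are real LLVM types.
enum class RecipeKind {
  WidenLoad,
  WidenAdd,
  ActiveLaneMask,
  ExplicitVectorLength
};

struct Recipe {
  RecipeKind Kind;
  std::int64_t Cost; // assumed: cost the plan-based model assigns at this VF
};

using Block = std::vector<Recipe>;
using LoopRegion = std::vector<Block>;

// Walk every recipe in the loop region and sum only the tail-folding
// recipes (lane mask / EVL), which the legacy cost model does not account for.
std::int64_t tailFoldingMaskCost(const LoopRegion &Region) {
  std::int64_t Total = 0;
  for (const Block &B : Region)
    for (const Recipe &R : B)
      if (R.Kind == RecipeKind::ActiveLaneMask ||
          R.Kind == RecipeKind::ExplicitVectorLength)
        Total += R.Cost;
  return Total;
}

// Caller side: add the extra cost to the legacy result before it is
// compared against the plan-based cost.
std::int64_t adjustedLegacyCost(std::int64_t LegacyCost,
                                const LoopRegion &Region) {
  return LegacyCost + tailFoldingMaskCost(Region);
}
```

The point of the split is that the legacy model stays untouched and the adjustment lives next to the comparison, so any further recipe kinds only need a new case in the filter.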
https://github.com/llvm/llvm-project/pull/130565