[llvm] [LV] Vectorize histogram operations (PR #99851)
Graham Hunter via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 20 06:39:31 PDT 2024
================
@@ -6519,8 +6520,33 @@ LoopVectorizationCostModel::getInstructionCost(Instruction *I,
       // We've proven all lanes safe to speculate, fall through.
       [[fallthrough]];
   case Instruction::Add:
+  case Instruction::Sub: {
+    auto Info = Legal->getHistogramInfo(I);
+    if (Info && VF.isVector()) {
+      const HistogramInfo *HGram = Info.value();
+      // Assume that a non-constant update value (or a constant != 1) requires
+      // a multiply, and add that into the cost.
+      InstructionCost MulCost = TTI::TCC_Free;
+      ConstantInt *RHS = dyn_cast<ConstantInt>(I->getOperand(1));
+      if (!RHS || RHS->getZExtValue() != 1)
+        MulCost = TTI.getArithmeticInstrCost(Instruction::Mul, VectorTy);
+
+      // Find the cost of the histogram operation itself.
+      Type *PtrTy = VectorType::get(HGram->Load->getPointerOperandType(), VF);
+      Type *ScalarTy = I->getType();
+      Type *MaskTy = VectorType::get(Type::getInt1Ty(I->getContext()), VF);
+      IntrinsicCostAttributes ICA(Intrinsic::experimental_vector_histogram_add,
+                                  Type::getVoidTy(I->getContext()),
+                                  {PtrTy, ScalarTy, MaskTy});
+
+      // Add the costs together with the add/sub operation.
+      return TTI.getIntrinsicInstrCost(
+                 ICA, TargetTransformInfo::TCK_RecipThroughput) +
+             MulCost + TTI.getArithmeticInstrCost(I->getOpcode(), VectorTy);
+    }
+    [[fallthrough]];
+  }
----------------
huntergr-arm wrote:
Yes, that's still needed for now -- the two cost models disagree and assert if I remove that code.
https://github.com/llvm/llvm-project/pull/99851
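
For readers without the full patch to hand, here is a rough sketch of the kind of scalar loop the hunk above is costing. It is an illustration only, not code from the PR: the function names `histogramUnit` and `histogramScaled` and the use of `std::vector` are assumptions made for the example. The first loop uses the constant update value 1, which the hunk treats as needing no multiply; the second uses a runtime update value, which the hunk's comment assumes needs one extra multiply.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical example: update value is the constant 1, so the cost
// code in the hunk would leave MulCost at TCC_Free.
void histogramUnit(std::vector<uint32_t> &Buckets,
                   const std::vector<uint32_t> &Indices) {
  for (std::size_t I = 0; I < Indices.size(); ++I)
    Buckets[Indices[I]] += 1; // load, add, store through an indexed address
}

// Hypothetical example: update value is not a constant, so the cost
// code in the hunk would add the cost of a vector multiply.
void histogramScaled(std::vector<uint32_t> &Buckets,
                     const std::vector<uint32_t> &Indices, uint32_t Inc) {
  for (std::size_t I = 0; I < Indices.size(); ++I)
    Buckets[Indices[I]] += Inc;
}
```

Both loops may write the same bucket more than once, which is why a plain gather/add/scatter is not safe and a dedicated histogram operation is costed instead.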