[llvm] [Analysis] Extend llvm.experimental.cttz.elts to type-based-cost (PR #184578)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 6 05:02:26 PST 2026
================
@@ -3413,6 +3384,53 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
InstructionCost getVectorSplitCost() const { return 1; }
+ // TODO: The costs below reflect the expansion code in
+ // SelectionDAGBuilder, but we may want to sacrifice some accuracy in
+ // favour of compile time.
+ // This path should only be taken if Targets cannot custom lower this
+ // intrinsic.
+ InstructionCost getCttzEltsCost(const IntrinsicCostAttributes &ICA,
+ bool ZeroIsPoison,
+ TTI::TargetCostKind CostKind) const {
+ const IntrinsicInst *I = ICA.getInst();
+ Type *ArgTy = ICA.getArgTypes()[0];
+ EVT ArgType = getTLI()->getValueType(DL, ArgTy, true);
+ Type *RetTy = ICA.getReturnType();
+ FastMathFlags FMF = ICA.getFlags();
+
+ // Find the smallest "sensible" element type to use for the expansion.
+ ConstantRange VScaleRange(APInt(64, 1), APInt::getZero(64));
+ if (isa<ScalableVectorType>(ArgTy) && I && I->getCaller())
+ VScaleRange = getVScaleRange(I->getCaller(), 64);
+
+ unsigned EltWidth = getTLI()->getBitWidthForCttzElements(
+ RetTy, ArgType.getVectorElementCount(), ZeroIsPoison, &VScaleRange);
+ Type *NewEltTy = IntegerType::getIntNTy(RetTy->getContext(), EltWidth);
+
+ // Create the new vector type & get the vector length
+ Type *NewVecTy =
+ VectorType::get(NewEltTy, cast<VectorType>(ArgTy)->getElementCount());
+
+ IntrinsicCostAttributes StepVecAttrs(Intrinsic::stepvector, NewVecTy, {},
+ FMF);
+ InstructionCost Cost =
+ thisT()->getIntrinsicInstrCost(StepVecAttrs, CostKind);
+
+ Cost +=
+ thisT()->getArithmeticInstrCost(Instruction::Sub, NewVecTy, CostKind);
+ Cost += thisT()->getCastInstrCost(Instruction::SExt, NewVecTy, ArgTy,
+ TTI::CastContextHint::None, CostKind);
----------------
lukel97 wrote:
The cost for this sext is super high, like 10 for ArgTy = <2 x i32>, because the dest type is smaller than the source type, i.e. NewVecTy is <2 x i8>.
Not a fault of this patch, but something we should definitely fix.
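
For readers following the cost sequence in the hunk above, here is a minimal scalar sketch of what the generic expansion computes. It assumes the expansion is stepvector, subtract from the vector length, sign-extend the mask, AND, unsigned-max reduction, and a final subtract. Only the first three steps are visible in the quoted hunk; the rest is an assumption. `cttzEltsModel` is a hypothetical name and is not LLVM code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical scalar model of a cttz.elts expansion; not taken from LLVM.
// Returns the number of trailing zero elements, i.e. the index of the first
// set lane, or the vector length if no lane is set.
std::uint64_t cttzEltsModel(const std::vector<bool> &Mask) {
  const std::uint64_t VL = Mask.size();
  std::uint64_t Max = 0;
  for (std::uint64_t I = 0; I < VL; ++I) {
    // sub: splat(VL) - stepvector, so lane I holds VL - I.
    const std::uint64_t StepVL = VL - I;
    // sext: a set lane becomes all-ones, a clear lane becomes zero.
    const std::uint64_t Ext = Mask[I] ? ~std::uint64_t{0} : std::uint64_t{0};
    // and + umax reduction (assumed steps): keep the largest surviving value,
    // which belongs to the lowest-indexed set lane.
    Max = std::max(Max, StepVL & Ext);
  }
  // Final sub (assumed step): VL - (VL - firstSetIndex) == firstSetIndex.
  return VL - Max;
}
```

In this model the sign-extend always widens a one-bit lane to the working element width, which is why a sext costed from a wider ArgTy down to a narrower NewVecTy looks out of place.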
https://github.com/llvm/llvm-project/pull/184578