[llvm] [AArch64][CostModel] Improve cost estimate of scalarizing a vector di… (PR #118055)
Alexey Bataev via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 17 09:36:40 PST 2024
================
@@ -3472,6 +3472,20 @@ InstructionCost AArch64TTIImpl::getArithmeticInstrCost(
Cost *= 4;
return Cost;
} else {
+ // If the information about individual scalars being vectorized is
+ // available, this yields better cost estimation.
+ if (auto *VTy = dyn_cast<FixedVectorType>(Ty); VTy && !Args.empty()) {
+ InstructionCost InsertExtractCost =
+ ST->getVectorInsertExtractBaseCost();
+ Cost = (3 * InsertExtractCost) * VTy->getNumElements();
+ for (int i = 0, Sz = Args.size(); i < Sz; i += 2) {
+ Cost += getArithmeticInstrCost(
+ Opcode, VTy->getScalarType(), CostKind,
+ TTI::getOperandInfo(Args[i]), TTI::getOperandInfo(Args[i + 1]));
+ }
+ return Cost;
+ }
----------------
alexey-bataev wrote:
Then you should not build the vector node at all. Instead, I would suggest checking how the `getScalarsVectorizationState` function in SLP works and adding a check there for NEON division (maybe add a new entry in TTI to check whether the vector operation is legal and won't be scalarized). For NEON, that check should make the function return `TreeEntry::NeedToGather`, so a gather node is built instead of a vector node.
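
To make the suggestion concrete, here is a minimal standalone sketch of the idea, not the actual SLP vectorizer code. It does not use LLVM headers: `TargetInfoModel`, `isVectorOpScalarized`, `EntryState`, and `classifyBundle` are hypothetical names invented for illustration. The real TTI hook, if one were added, could look quite different. The only target fact assumed is that NEON has no vector integer divide, so such operations end up scalarized.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins; these are NOT LLVM's real declarations.
enum class Opcode { Add, SDiv, UDiv };
enum class EntryState { Vectorize, NeedToGather };

// Hypothetical TTI-style query: would the target scalarize this vector op?
// Assumption: without a vector integer divide (as on NEON), SDiv/UDiv on
// vectors are scalarized.
struct TargetInfoModel {
  bool HasVectorIntDiv = false;

  bool isVectorOpScalarized(Opcode Op) const {
    bool IsIntDiv = (Op == Opcode::SDiv || Op == Opcode::UDiv);
    return IsIntDiv && !HasVectorIntDiv;
  }
};

// Simplified analogue of a vectorization-state check: decide up front
// whether a bundle of scalars should become a vector node or a gather.
EntryState classifyBundle(const TargetInfoModel &TTI, Opcode Op,
                          std::size_t NumScalars) {
  // A single scalar cannot form a vector node.
  if (NumScalars < 2)
    return EntryState::NeedToGather;
  // If the target would scalarize the vector op anyway, do not build the
  // vector node at all; gather the scalars instead.
  if (TTI.isVectorOpScalarized(Op))
    return EntryState::NeedToGather;
  return EntryState::Vectorize;
}
```

With this shape, the decision is made before any vector node is built, so the cost model never has to price a vector division that would only be scalarized again.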
https://github.com/llvm/llvm-project/pull/118055