[llvm] [AArch64][CostModel] Improve cost estimate of scalarizing a vector di… (PR #118055)
Sushant Gokhale via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 19 01:58:19 PST 2024
================
@@ -3472,6 +3472,20 @@ InstructionCost AArch64TTIImpl::getArithmeticInstrCost(
Cost *= 4;
return Cost;
} else {
+ // If the information about individual scalars being vectorized is
+ // available, this yields better cost estimation.
+ if (auto *VTy = dyn_cast<FixedVectorType>(Ty); VTy && !Args.empty()) {
+ InstructionCost InsertExtractCost =
+ ST->getVectorInsertExtractBaseCost();
+ Cost = (3 * InsertExtractCost) * VTy->getNumElements();
+ for (int i = 0, Sz = Args.size(); i < Sz; i += 2) {
+ Cost += getArithmeticInstrCost(
+ Opcode, VTy->getScalarType(), CostKind,
+ TTI::getOperandInfo(Args[i]), TTI::getOperandInfo(Args[i + 1]));
+ }
+ return Cost;
+ }
----------------
sushgokh wrote:
> 1. That's fine. If you think they should be vectorized, a new vectorization strategy should be implemented instead.
Don't you think lowering the vector division in some late IR pass is easier than stopping vectorization at the div? Another advantage of lowering later is that the lowering is transparent to the user and done in a single place rather than in multiple places.
This type of vector lowering is already being done in other places, e.g. masked_load.
> 2. No, it is just a simple legality check. If the instruction should be scalarized, it should be built into a build vector node; it should not be vectorized.
Just need a clarification: my understanding of `TreeEntry::NeedToGather` is that you build a vector out of scalars. Is that correct?
Now, coming to the legality checks: yes, in the end it is a legality check, but I would need to check all the constraints, which currently live in `getArithmeticInstrCost`, once again. I am trying to avoid that route for the reasons mentioned in my comment on (1).
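
For reference, the cost shape computed by the hunk above can be sketched in isolation as follows. This is a minimal standalone model, not the actual LLVM code: `OperandSketch`, `scalarizedDivCostSketch` and the `ScalarCost` callback are hypothetical names standing in for `TTI::getOperandInfo` and the recursive `getArithmeticInstrCost` call, and the guard against an odd number of operands is an addition of the sketch, not something the patch does.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for the operand information the real cost model
// obtains via TTI::getOperandInfo; it only records whether a value is constant.
struct OperandSketch {
  bool IsConstant;
};

// Mirrors the shape of the formula in the hunk:
//   3 * InsertExtractCost * NumElements + sum of per-lane scalar costs.
// Args holds operands as consecutive (LHS, RHS) pairs, one pair per lane.
// Returns -1 if Args does not consist of whole pairs (sketch-only guard).
int64_t scalarizedDivCostSketch(
    int64_t InsertExtractCost, std::size_t NumElements,
    const std::vector<OperandSketch> &Args,
    const std::function<int64_t(const OperandSketch &, const OperandSketch &)>
        &ScalarCost) {
  if (Args.size() % 2 != 0)
    return -1;
  // Fixed overhead for moving lanes between vector and scalar registers.
  int64_t Cost = 3 * InsertExtractCost * static_cast<int64_t>(NumElements);
  // One scalar operation per (LHS, RHS) pair, priced by the callback.
  for (std::size_t I = 0; I + 1 < Args.size(); I += 2)
    Cost += ScalarCost(Args[I], Args[I + 1]);
  return Cost;
}
```

The point of walking the operand pairs is that each lane can be priced on its own, so a lane whose divisor is a constant may come out cheaper than a lane with two unknown operands. The actual per-lane numbers depend on the target and are not modeled here.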
https://github.com/llvm/llvm-project/pull/118055