[llvm] [SLP]Better cost estimation for masked gather or "clustered" loads. (PR #105858)
Alexey Bataev via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 30 10:53:06 PDT 2024
================
@@ -4790,105 +4790,174 @@ BoUpSLP::LoadsState BoUpSLP::canVectorizeLoads(
}
}
}
- auto CheckForShuffledLoads = [&, &TTI = *TTI](Align CommonAlignment) {
+ // Compare the cost of loads + shuffles against the cost of
+ // strided/masked gather loads. Returns true if the vectorized loads +
+ // shuffles representation is better than just a gather.
+ auto CheckForShuffledLoads = [&, &TTI = *TTI](Align CommonAlignment,
+ bool ProfitableGatherPointers) {
+ // Compare masked gather cost and loads + insert subvector costs.
+ TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput;
+ auto [ScalarGEPCost, VectorGEPCost] =
+ getGEPCosts(TTI, PointerOps, PointerOps.front(),
+ Instruction::GetElementPtr, CostKind, ScalarTy, VecTy);
+ // Estimate the cost of masked gather GEP. If not a splat, roughly
+ // estimate as a buildvector, otherwise estimate as splat.
+ if (static_cast<unsigned>(count_if(
+ PointerOps, IsaPred<GetElementPtrInst>)) < PointerOps.size() - 1 ||
+ any_of(PointerOps, [&](Value *V) {
+ return getUnderlyingObject(V) !=
+ getUnderlyingObject(PointerOps.front());
+ }))
+ VectorGEPCost += TTI.getScalarizationOverhead(
+ VecTy, APInt::getAllOnes(VecTy->getElementCount().getKnownMinValue()),
----------------
alexey-bataev wrote:
Yep, will fix
https://github.com/llvm/llvm-project/pull/105858