[llvm] [AArch64] Add costs for ST3 and ST4 instructions, modelled as store(shuffle). (PR #87934)
David Green via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 8 05:50:43 PDT 2024
================
@@ -3815,18 +3815,30 @@ InstructionCost AArch64TTIImpl::getSpliceCost(VectorType *Tp, int Index) {
return LegalizationCost * LT.first;
}
-InstructionCost AArch64TTIImpl::getShuffleCost(TTI::ShuffleKind Kind,
- VectorType *Tp,
- ArrayRef<int> Mask,
- TTI::TargetCostKind CostKind,
- int Index, VectorType *SubTp,
- ArrayRef<const Value *> Args) {
+InstructionCost AArch64TTIImpl::getShuffleCost(
+ TTI::ShuffleKind Kind, VectorType *Tp, ArrayRef<int> Mask,
+ TTI::TargetCostKind CostKind, int Index, VectorType *SubTp,
+ ArrayRef<const Value *> Args, const Instruction *CxtI) {
std::pair<InstructionCost, MVT> LT = getTypeLegalizationCost(Tp);
+
// If we have a Mask, and the LT is being legalized somehow, split the Mask
// into smaller vectors and sum the cost of each shuffle.
if (!Mask.empty() && isa<FixedVectorType>(Tp) && LT.second.isVector() &&
Tp->getScalarSizeInBits() == LT.second.getScalarSizeInBits() &&
Mask.size() > LT.second.getVectorNumElements() && !Index && !SubTp) {
+
+ // Check for ST3/ST4 instructions, which are represented in llvm IR as
+ // store(interleaving-shuffle). The shuffle cost could potentially be free,
// but we model it with a cost of LT.first so that ST3/ST4 have a higher
+ // cost than just the store.
+ if ((ShuffleVectorInst::isInterleaveMask(
+ Mask, 4, Tp->getElementCount().getKnownMinValue() * 2) ||
+ ShuffleVectorInst::isInterleaveMask(
+ Mask, 3, Tp->getElementCount().getKnownMinValue() * 2)) &&
+ !ShuffleVectorInst::isZeroEltSplatMask(
+ Mask, Tp->getElementCount().getKnownMinValue()))
+ return LT.first;
----------------
davemgreen wrote:
Ah, yeah! I rewrote the code when moving it into this outer if statement, and must have missed that part of the original.
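
For readers following the hunk above, here is a minimal sketch of what an "interleaving shuffle" mask looks like for an ST3/ST4-style store. It is not LLVM's `ShuffleVectorInst::isInterleaveMask`, which also handles undefined lanes and takes the number of input elements. The helper names `isSimpleInterleaveMask` and `makeInterleaveMask` are hypothetical, and the sketch assumes the fields sit back to back at the start of the concatenated shuffle inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an LLVM API): build the mask that interleaves
// Factor fields of LaneLen elements each. Output position
// Lane * Factor + Field reads input element Field * LaneLen + Lane.
std::vector<int> makeInterleaveMask(unsigned Factor, unsigned LaneLen) {
  std::vector<int> Mask;
  Mask.reserve(static_cast<std::size_t>(Factor) * LaneLen);
  for (unsigned Lane = 0; Lane < LaneLen; ++Lane)
    for (unsigned Field = 0; Field < Factor; ++Field)
      Mask.push_back(static_cast<int>(Field * LaneLen + Lane));
  return Mask;
}

// Hypothetical, simplified check: true only if Mask is exactly the mask
// produced by makeInterleaveMask(Factor, Mask.size() / Factor).
// Unlike LLVM's real check, undefined (-1) lanes are rejected here.
bool isSimpleInterleaveMask(const std::vector<int> &Mask, unsigned Factor) {
  assert(Factor > 1 && "an interleave needs at least two fields");
  if (Mask.empty() || Mask.size() % Factor != 0)
    return false;
  const std::size_t LaneLen = Mask.size() / Factor;
  for (std::size_t Lane = 0; Lane < LaneLen; ++Lane)
    for (std::size_t Field = 0; Field < Factor; ++Field)
      if (Mask[Lane * Factor + Field] !=
          static_cast<int>(Field * LaneLen + Lane))
        return false;
  return true;
}
```

For three 4-element fields (the ST3 shape), `makeInterleaveMask(3, 4)` yields `0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11`, and the identity mask `0..11` is rejected by the check.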
https://github.com/llvm/llvm-project/pull/87934