[llvm] [AArch64][CostModel] Lower cost of dupq (SVE2.1) (PR #144918)
Sander de Smalen via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 20 06:59:12 PDT 2025
================
@@ -5583,6 +5583,26 @@ InstructionCost AArch64TTIImpl::getShuffleCost(
     Kind = TTI::SK_PermuteSingleSrc;
   }
 
+  // Segmented shuffle matching.
+  if (ST->hasSVE2p1() && CostKind == TTI::TCK_RecipThroughput &&
+      Kind == TTI::SK_PermuteSingleSrc && isa<FixedVectorType>(Tp) &&
+      Tp->getPrimitiveSizeInBits().isKnownMultipleOf(128)) {
+
+    FixedVectorType *VTy = cast<FixedVectorType>(Tp);
+    unsigned Segments = VTy->getPrimitiveSizeInBits() / 128;
+    unsigned SegmentElts = VTy->getNumElements() / Segments;
+
+    // dupq zd.t, zn.t[idx]
+    unsigned Lane = (unsigned)Mask[0];
+    if (SegmentElts * Segments == Mask.size() && Lane < SegmentElts) {
+      bool IsDupQ = true;
+      for (unsigned I = 1; I < Mask.size(); ++I)
+        IsDupQ &= (unsigned)Mask[I] == Lane + ((I / SegmentElts) * SegmentElts);
+      if (IsDupQ)
+        return LT.first;
+    }
----------------
sdesmalen-arm wrote:
nit: to benefit from an early exit, you could use something like this:
```suggestion
if (all_of(enumerate(Mask), [&](const auto &E) {
      return (unsigned)E.value() == Lane + ((E.index() / SegmentElts) * SegmentElts);
    }))
  return ..
```
?
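
For reference, the check being discussed can be tried outside of LLVM with a minimal standalone sketch. This is not the code in the patch: `isDupQMask` is a hypothetical helper name, `std::vector<int>` stands in for the `ArrayRef<int>` mask, and `SegmentElts` is passed in directly rather than derived from the vector type. It returns as soon as one mask element does not match, which is the early exit the nit is after.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only (not part of the patch).
// Returns true when every 128-bit segment of the mask repeats the same
// lane of that segment, i.e. Mask[I] == Lane + (I / SegmentElts) * SegmentElts.
bool isDupQMask(const std::vector<int> &Mask, unsigned SegmentElts) {
  // The mask must cover a whole number of segments.
  if (Mask.empty() || SegmentElts == 0 || Mask.size() % SegmentElts != 0)
    return false;

  // A negative first element (e.g. a poison lane) cannot name a valid lane.
  if (Mask[0] < 0)
    return false;
  unsigned Lane = static_cast<unsigned>(Mask[0]);
  if (Lane >= SegmentElts)
    return false;

  for (std::size_t I = 0; I < Mask.size(); ++I) {
    std::size_t Expected = Lane + (I / SegmentElts) * SegmentElts;
    // Early exit: stop at the first element that breaks the pattern.
    if (Mask[I] < 0 || static_cast<std::size_t>(Mask[I]) != Expected)
      return false;
  }
  return true;
}
```

With four elements per segment, `{1, 1, 1, 1, 5, 5, 5, 5}` matches (lane 1 of each segment), while `{1, 1, 1, 1, 1, 1, 1, 1}` does not, because the second segment would have to read lane 5.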
https://github.com/llvm/llvm-project/pull/144918