[llvm] [AArch64][CostModel] Lower cost of dupq (SVE2.1) (PR #144918)

Sander de Smalen via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 20 06:59:12 PDT 2025


================
@@ -5583,6 +5583,26 @@ InstructionCost AArch64TTIImpl::getShuffleCost(
     Kind = TTI::SK_PermuteSingleSrc;
   }
 
+  // Segmented shuffle matching.
+  if (ST->hasSVE2p1() && CostKind == TTI::TCK_RecipThroughput &&
+      Kind == TTI::SK_PermuteSingleSrc && isa<FixedVectorType>(Tp) &&
+      Tp->getPrimitiveSizeInBits().isKnownMultipleOf(128)) {
+
+    FixedVectorType *VTy = cast<FixedVectorType>(Tp);
+    unsigned Segments = VTy->getPrimitiveSizeInBits() / 128;
+    unsigned SegmentElts = VTy->getNumElements() / Segments;
+
+    // dupq zd.t, zn.t[idx]
+    unsigned Lane = (unsigned)Mask[0];
+    if (SegmentElts * Segments == Mask.size() && Lane < SegmentElts) {
+      bool IsDupQ = true;
+      for (unsigned I = 1; I < Mask.size(); ++I)
+        IsDupQ &= (unsigned)Mask[I] == Lane + ((I / SegmentElts) * SegmentElts);
+      if (IsDupQ)
+        return LT.first;
+    }
----------------
sdesmalen-arm wrote:

nit: to benefit from an early exit, you could use something like this:
```suggestion
if (all_of(enumerate(Mask), [&](const auto &E) {
      return (unsigned)E.value() == Lane + ((E.index() / SegmentElts) * SegmentElts);
    }))
  return ..
```
?
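
For reference, the early-exit check can be sketched outside of LLVM roughly as follows. This is only an illustration in plain standard C++: the helper name `isSegmentedLaneSplat` and the use of `std::vector<int>` for the mask are assumptions made for the example, not code from the patch. The size check is also simplified to "mask length is a multiple of the segment length".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical standalone helper (not part of LLVM): returns true if Mask
// selects the same lane within every segment of SegmentElts elements, i.e.
// Mask[I] == Lane + (I / SegmentElts) * SegmentElts for every index I.
bool isSegmentedLaneSplat(const std::vector<int> &Mask, unsigned SegmentElts) {
  assert(SegmentElts != 0 && "segment must contain at least one element");
  if (Mask.empty() || Mask.size() % SegmentElts != 0)
    return false;

  // A negative first index converts to a large unsigned value and is
  // rejected by the range check below.
  unsigned Lane = static_cast<unsigned>(Mask[0]);
  if (Lane >= SegmentElts)
    return false;

  for (std::size_t I = 1; I < Mask.size(); ++I) {
    unsigned Expected =
        Lane + static_cast<unsigned>(I / SegmentElts) * SegmentElts;
    if (static_cast<unsigned>(Mask[I]) != Expected)
      return false; // Early exit on the first mismatching element.
  }
  return true;
}
```

With four elements per segment, a mask such as `{1, 1, 1, 1, 5, 5, 5, 5}` matches (lane 1 in each segment), while `{1, 1, 1, 1, 5, 5, 5, 6}` is rejected as soon as the last element is inspected.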

https://github.com/llvm/llvm-project/pull/144918