[llvm] [AArch64] Update cost model for extracting halves from 128+ bit vectors (PR #155601)
Paul Walker via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 27 07:56:07 PDT 2025
Gaëtan Bossu <gaetan.bossu at arm.com>
Message-ID:
In-Reply-To: <llvm.org/llvm/llvm-project/pull/155601 at github.com>
================
@@ -5750,11 +5750,13 @@ AArch64TTIImpl::getShuffleCost(TTI::ShuffleKind Kind, VectorType *DstTy,
Kind = improveShuffleKindFromMask(Kind, Mask, SrcTy, Index, SubTp);
bool IsExtractSubvector = Kind == TTI::SK_ExtractSubvector;
- // A subvector extract can be implemented with an ext (or trivial extract, if
- // from lane 0). This currently only handles low or high extracts to prevent
- // SLP vectorizer regressions.
+ // A subvector extract can be implemented with a NEON/SVE ext (or trivial
+ // extract, if from lane 0). This currently only handles low or high extracts
+ // to prevent SLP vectorizer regressions.
+ // Note that SVE's ext instruction is destructive, but it can be fused with
+ // a movprfx to act like a constructive instruction.
if (IsExtractSubvector && LT.second.isFixedLengthVector()) {
- if (LT.second.is128BitVector() &&
+ if (LT.second.getFixedSizeInBits() >= AArch64::SVEBitsPerBlock &&
----------------
paulwalker-arm wrote:
Is `AArch64::SVEBitsPerBlock` the best option here?
For the NEON side of things to function correctly, LT must be a 128-bit vector, so the code is just assuming `SVEBitsPerBlock` is 128, which of course it is, but that's beside the point. Looking at the comment, is the intent to ignore cases where the result would be an illegal type? If so, perhaps `!LT.second.is64BitVector()` or just `LT.second.getFixedSizeInBits() >= 128`?
FYI: `AArch64::SVEBitsPerBlock` is a concept that has lost its value. It is pretty much ingrained in the implementation that SVE vectors are described in multiples of NEON vectors.
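To make the comparison concrete, here is a rough standalone sketch of the three candidate guards. It does not use LLVM headers: `FixedVecTy`, `guardInPatch`, `guardLiteral`, `guardNot64` and `guardsAgree` are hypothetical names for illustration, and the local `SVEBitsPerBlock` simply mirrors the value 128 mentioned above. It assumes the legalized fixed-length type is either 64 bits wide or a multiple of 128 bits; under that assumption all three guards give the same answer.

```cpp
#include <cassert>

// Hypothetical stand-in for the legalized type (LT.second); this is not
// LLVM's MVT, it only models the three queries discussed in the review.
struct FixedVecTy {
  unsigned Bits;
  constexpr unsigned getFixedSizeInBits() const { return Bits; }
  constexpr bool is64BitVector() const { return Bits == 64; }
  constexpr bool is128BitVector() const { return Bits == 128; }
};

// Local copy of the value the review says the constant has (assumption:
// it matches AArch64::SVEBitsPerBlock).
constexpr unsigned SVEBitsPerBlock = 128;

// Guard as written in the patch under review.
constexpr bool guardInPatch(FixedVecTy T) {
  return T.getFixedSizeInBits() >= SVEBitsPerBlock;
}

// First suggested alternative: spell out the NEON vector width directly.
constexpr bool guardLiteral(FixedVecTy T) {
  return T.getFixedSizeInBits() >= 128;
}

// Second suggested alternative: reject only the 64-bit case.
constexpr bool guardNot64(FixedVecTy T) { return !T.is64BitVector(); }

// True when all three guards give the same answer for a given width.
constexpr bool guardsAgree(unsigned Bits) {
  return guardInPatch(FixedVecTy{Bits}) == guardLiteral(FixedVecTy{Bits}) &&
         guardLiteral(FixedVecTy{Bits}) == guardNot64(FixedVecTy{Bits});
}

// Widths covered by the stated assumption: 64 bits or multiples of 128.
static_assert(guardsAgree(64) && guardsAgree(128) && guardsAgree(256) &&
                  guardsAgree(512),
              "guards agree on the assumed set of legal widths");
```

The guards only diverge for widths outside that assumption: for example, `guardsAgree(32)` is false in this model because `guardNot64` would accept a 32-bit type. Whether such a type can reach this code path is not something the sketch establishes.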
https://github.com/llvm/llvm-project/pull/155601