[llvm] [VectorCombine] foldShuffleOfBinops - fold shuffle(binop(shuffle(x),shuffle(z)),binop(shuffle(y),shuffle(w)) -> binop(shuffle(x,z),shuffle(y,w)) (PR #120984)
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Sat Dec 28 03:37:41 PST 2024
================
@@ -1723,6 +1730,36 @@ bool VectorCombine::foldShuffleOfBinops(Instruction &I) {
TTI.getShuffleCost(TargetTransformInfo::SK_PermuteTwoSrc, BinResTy,
OldMask, CostKind, 0, nullptr, {LHS, RHS}, &I);
+ // Handle shuffle(binop(shuffle(x),y),binop(z,shuffle(w))) style patterns
+ // where one use shuffles have gotten split across the binop/cmp. These
+ // often allow a major reduction in total cost that wouldn't happen as
+ // individual folds.
+ auto MergeInner = [&](Value *&Op, int Offset, MutableArrayRef<int> Mask,
+ TTI::TargetCostKind CostKind) -> bool {
+ Value *InnerOp;
+ ArrayRef<int> InnerMask;
+ if (match(Op, m_OneUse(m_Shuffle(m_Value(InnerOp), m_Undef(),
+ m_Mask(InnerMask)))) &&
+ all_of(InnerMask,
+ [NumSrcElts](int M) { return M < (int)NumSrcElts; }) &&
+ InnerOp->getType() == Op->getType()) {
+ for (int &M : Mask)
+ if (Offset <= M && M < (int)(Offset + NumSrcElts)) {
+ M = InnerMask[M - Offset];
+ M = 0 <= M ? M + Offset : M;
----------------
RKSimon wrote:
Yes, we've replaced the original M with the InnerMask value to fold the shuffles together, and InnerMask could contain PoisonMaskElem, so Offset is only added back when the new M is non-negative.
https://github.com/llvm/llvm-project/pull/120984
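
To illustrate the point in the comment above, here is a minimal standalone sketch of the mask merge, written without any LLVM headers. The helper name `mergeInnerMask` and the use of `std::vector<int>` are assumptions made for illustration and are not part of the patch; `PoisonMaskElem` is redefined locally as -1 to mirror the LLVM constant. The sketch also assumes `InnerMask` has exactly `NumSrcElts` entries, as implied by the type check in the hunk.

```cpp
#include <cassert>
#include <vector>

// Local stand-in for llvm::PoisonMaskElem (which is -1 in LLVM).
constexpr int PoisonMaskElem = -1;

// Hypothetical helper: fold a unary inner shuffle mask into the lanes of the
// outer two-source mask that select from the operand starting at Offset
// (0 for the first operand, NumSrcElts for the second).
void mergeInnerMask(std::vector<int> &Mask, const std::vector<int> &InnerMask,
                    int Offset, int NumSrcElts) {
  assert(static_cast<int>(InnerMask.size()) == NumSrcElts);
  for (int &M : Mask) {
    if (Offset <= M && M < Offset + NumSrcElts) {
      // Look through the inner shuffle: take the lane it would have read.
      M = InnerMask[M - Offset];
      // Re-apply the operand offset only to real lanes. A poison lane must
      // stay poison; adding Offset would turn -1 into a valid-looking index.
      M = 0 <= M ? M + Offset : M;
    }
  }
}

// Example: outer mask {4, 5, 0, 1} over two 4-element operands, where the
// second operand is itself a shuffle with inner mask {poison, 2, 0, 1}.
std::vector<int> exampleMergedMask() {
  std::vector<int> Mask = {4, 5, 0, 1};
  mergeInnerMask(Mask, {PoisonMaskElem, 2, 0, 1}, /*Offset=*/4,
                 /*NumSrcElts=*/4);
  return Mask; // {PoisonMaskElem, 6, 0, 1}
}
```

Without the `0 <= M` guard, the first lane in the example would become -1 + 4 = 3, which silently selects lane 3 of the first operand instead of remaining poison.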