[llvm] [VectorCombine] New folding pattern for extract/binop/shuffle chains (PR #145232)
Rajveer Singh Bharadwaj via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 21 01:45:27 PDT 2025
================
@@ -2988,6 +2989,305 @@ bool VectorCombine::foldShuffleFromReductions(Instruction &I) {
return foldSelectShuffle(*Shuffle, true);
}
+/// For a given chain of patterns of the following form:
+///
+/// ```
+/// %1 = shufflevector <n x ty1> %0, <n x ty1> poison <n x ty2> mask
+///
+/// %2 = tail call <n x ty1> llvm.<umin/umax/smin/smax>(<n x ty1> %0, <n x
+/// ty1> %1)
+/// OR
+/// %2 = add/mul/or/and/xor <n x ty1> %0, %1
+///
+/// %3 = shufflevector <n x ty1> %2, <n x ty1> poison <n x ty2> mask
+/// ...
+/// ...
+/// %(i - 1) = tail call <n x ty1> llvm.<umin/umax/smin/smax>(<n x ty1> %(i -
+/// 3), <n x ty1> %(i - 2)
+/// OR
+/// %(i - 1) = add/mul/or/and/xor <n x ty1> %(i - 3), %(i - 2)
+///
+/// %(i) = extractelement <n x ty1> %(i - 1), 0
+/// ```
+///
+/// Where:
+/// `mask` follows a partition pattern:
+///
+/// Ex:
+/// [n = 8, p = poison]
+///
+/// 4 5 6 7 | p p p p
+/// 2 3 | p p p p p p
+/// 1 | p p p p p p p
+///
+/// For powers of 2, there's a consistent pattern, but for other cases
+/// the parity of the current half value at each step decides the
+/// next partition half (see `ExpectedParityMask` for more logical details
+/// in generalising this).
+///
+/// Ex:
+/// [n = 6]
+///
+/// 3 4 5 | p p p
+/// 1 2 | p p p p
+/// 1 | p p p p p
+bool VectorCombine::foldShuffleChainsToReduce(Instruction &I) {
+ // Going bottom-up for the pattern.
+ auto *EEI = dyn_cast<ExtractElementInst>(&I);
+ if (!EEI)
+ return false;
+
+ std::queue<Value *> InstWorklist;
+ InstructionCost OrigCost = 0;
+
+ Value *InitEEV = nullptr;
+
+ // Common instruction operation after each shuffle op.
+ unsigned int CommonCallOp = 0;
+ Instruction::BinaryOps CommonBinOp = Instruction::BinaryOpsEnd;
+
+ bool IsFirstCallOrBinInst = true;
+ bool ShouldBeCallOrBinInst = true;
----------------
Rajveer100 wrote:
How would we iterate? We would still need extra variables to keep track of the order.
Ex:
```c++
InstWorklist.push(PrevVecV[1]);
InstWorklist.push(PrevVecV[0]);
```
And even if we use `isa<>`, the switch for intrinsic/binop would be different, so each would still need its own block.
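
To illustrate the ordering concern, here is a minimal standalone sketch. It is not the patch code: `Operands` and `visitOrder` are hypothetical names, and plain strings stand in for the `Value *` entries. It only shows that with a FIFO `std::queue`, the push order fixes the visit order, so whichever operand must be checked first has to be pushed first.

```c++
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in for the two operands of a binop/intrinsic;
// the real code holds llvm::Value pointers.
struct Operands {
  std::string Vec;     // plays the role of PrevVecV[0]
  std::string Shuffle; // plays the role of PrevVecV[1]
};

// Drains a FIFO worklist and records the order in which entries are visited.
std::vector<std::string> visitOrder(const Operands &Ops) {
  std::queue<std::string> Worklist;
  Worklist.push(Ops.Shuffle); // pushed first, so visited first
  Worklist.push(Ops.Vec);     // pushed second, so visited second

  std::vector<std::string> Order;
  while (!Worklist.empty()) {
    Order.push_back(Worklist.front());
    Worklist.pop();
  }
  return Order;
}
```

Swapping the two `push` calls would flip the visit order, which is why some extra state (or a fixed push convention) seems necessary either way.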
https://github.com/llvm/llvm-project/pull/145232