[llvm] [VectorCombine] New folding pattern for extract/binop/shuffle chains (PR #145232)

Luke Lau via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 21 00:07:57 PDT 2025


================
@@ -2988,6 +2989,305 @@ bool VectorCombine::foldShuffleFromReductions(Instruction &I) {
   return foldSelectShuffle(*Shuffle, true);
 }
 
+/// For a given chain of patterns of the following form:
+///
+/// ```
+///   %1 = shufflevector <n x ty1> %0, <n x ty1> poison <n x ty2> mask
+///
+///   %2 = tail call <n x ty1> llvm.<umin/umax/smin/smax>(<n x ty1> %0, <n x
+///   ty1> %1)
+///     OR
+///   %2 = add/mul/or/and/xor <n x ty1> %0, %1
+///
+///   %3 = shufflevector <n x ty1> %2, <n x ty1> poison <n x ty2> mask
+///   ...
+///   ...
+///   %(i - 1) = tail call <n x ty1> llvm.<umin/umax/smin/smax>(<n x ty1> %(i -
+///   3), <n x ty1> %(i - 2)
+///     OR
+///   %(i - 1) = add/mul/or/and/xor <n x ty1> %(i - 3), %(i - 2)
+///
+///   %(i) = extractelement <n x ty1> %(i - 1), 0
+/// ```
+///
+/// Where:
+///    `mask` follows a partition pattern:
+///
+/// Ex:
+///    [n = 8, p = poison]
+///
+///    4 5 6 7 | p p p p
+///    2 3 | p p p p p p
+///    1 | p p p p p p p
+///
+///    For powers of 2, there's a consistent pattern, but for other cases
+///    the parity of the current half value at each step decides the
+///    next partition half (see `ExpectedParityMask` for more logical details
+///    in generalising this).
+///
+/// Ex:
+///    [n = 6]
+///
+///    3 4 5 | p p p
+///    1 2 | p p p p
+///    1 | p p p p p
+bool VectorCombine::foldShuffleChainsToReduce(Instruction &I) {
+  // Going bottom-up for the pattern.
+  auto *EEI = dyn_cast<ExtractElementInst>(&I);
+  if (!EEI)
+    return false;
+
+  std::queue<Value *> InstWorklist;
+  InstructionCost OrigCost = 0;
+
+  Value *InitEEV = nullptr;
+
+  // Common instruction operation after each shuffle op.
+  unsigned int CommonCallOp = 0;
+  Instruction::BinaryOps CommonBinOp = Instruction::BinaryOpsEnd;
+
+  bool IsFirstCallOrBinInst = true;
+  bool ShouldBeCallOrBinInst = true;
----------------
lukel97 wrote:

If I'm understanding this right, a chain will always have exactly `ceil(log2(num elts))` shufflevectors and the same number of bin ops?
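As a quick sanity check of that count, here is a minimal standalone sketch (no LLVM dependencies). `chainLength` and `expectedMasks` are hypothetical helper names, not functions from the PR, and the halving rule for odd lane counts is inferred only from the `n = 8` and `n = 6` examples in the doc comment above, so it may not match the PR's `ExpectedParityMask` logic exactly.

```c++
#include <cassert>
#include <vector>

// Stand-in for a poison mask element in this sketch.
constexpr int PoisonLane = -1;

// Hypothetical helper: number of shuffle/bin-op levels needed to reduce
// NumElts lanes to one. Repeated ceil-halving gives ceil(log2(NumElts)).
unsigned chainLength(unsigned NumElts) {
  assert(NumElts > 0 && "need at least one lane");
  unsigned Levels = 0;
  for (unsigned Lanes = NumElts; Lanes > 1; Lanes = (Lanes + 1) / 2)
    ++Levels;
  return Levels;
}

// Hypothetical helper: masks in top-down order (first shuffle first).
// Assumption: at each level the upper lanes [Lanes / 2, Lanes) move to the
// front and every other lane is poison, as in the doc comment's examples.
std::vector<std::vector<int>> expectedMasks(unsigned NumElts) {
  std::vector<std::vector<int>> Masks;
  for (unsigned Lanes = NumElts; Lanes > 1; Lanes = (Lanes + 1) / 2) {
    std::vector<int> Mask(NumElts, PoisonLane);
    unsigned Start = Lanes / 2;
    for (unsigned I = 0; Start + I < Lanes; ++I)
      Mask[I] = static_cast<int>(Start + I);
    Masks.push_back(Mask);
  }
  return Masks;
}
```

Under these assumptions both `n = 8` and `n = 6` need three levels, and `expectedMasks(NumElts).size()` always equals `chainLength(NumElts)`.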

Instead of using a worklist and toggling between "expect a bin op" and "expect a shuffle", would it be easier to just use a for loop with a fixed count? Something like:

```c++
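// Extract is the extractelement at the bottom of the chain.
Value *BinOp = Extract->getOperand(0);
for (int i = 0; i < ceil(log2(num elts)); i++) {
  // check for binary op/binary intrinsic
  if (!isa<BinaryOperator>(BinOp) || ...)
    return false;
  Value *Shuffle = cast<Instruction>(BinOp)->getOperand(1);
  BinOp = cast<Instruction>(BinOp)->getOperand(0);
  // check for shuffle (it must read from the next value up the chain)
  if (!isa<ShuffleVectorInst>(Shuffle) ||
      cast<ShuffleVectorInst>(Shuffle)->getOperand(0) != BinOp || ...)
    return false;
}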
```
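To make the control flow concrete, below is a toy, self-contained model of the fixed-count walk. `Node`, `Kind`, `ceilLog2`, and `matchChain` are hypothetical stand-ins invented for this sketch, not LLVM classes or anything from the PR. It deliberately ignores mask contents, opcode consistency, and commuted operands, all of which a real matcher would need to check.

```c++
#include <cassert>

// Toy stand-ins for IR values (hypothetical, not LLVM types).
enum class Kind { Leaf, BinOp, Shuffle, Extract };

struct Node {
  Kind K;
  const Node *Op0;
  const Node *Op1;
};

// ceil(log2(N)) for N >= 1, computed by repeated ceil-halving.
unsigned ceilLog2(unsigned N) {
  assert(N > 0 && "need at least one element");
  unsigned Levels = 0;
  for (; N > 1; N = (N + 1) / 2)
    ++Levels;
  return Levels;
}

// Walks up from the extract for a fixed number of levels. Each level must be
// a bin op whose second operand is a shuffle of its first operand.
// Returns the value reached above the last level, or nullptr on a mismatch.
const Node *matchChain(const Node &Extract, unsigned NumElts) {
  if (Extract.K != Kind::Extract)
    return nullptr;
  const Node *Cur = Extract.Op0;
  for (unsigned I = 0, E = ceilLog2(NumElts); I != E; ++I) {
    if (!Cur || Cur->K != Kind::BinOp)
      return nullptr;
    const Node *Shuf = Cur->Op1;
    Cur = Cur->Op0;
    if (!Shuf || Shuf->K != Kind::Shuffle || Shuf->Op0 != Cur)
      return nullptr;
  }
  return Cur;
}
```

With a fixed trip count there is no need for a flag tracking whether a shuffle or a bin op is expected next: a chain that is too short fails the bin-op check at the level where it runs out.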

https://github.com/llvm/llvm-project/pull/145232

