[llvm] [VectorCombine] New folding pattern for extract/binop/shuffle chains (PR #145232)

Mon Jul 21 04:56:18 PDT 2025

================
@@ -2988,6 +2989,305 @@ bool VectorCombine::foldShuffleFromReductions(Instruction &I) {
   return foldSelectShuffle(*Shuffle, true);
 }
 
+/// For a given chain of patterns of the following form:
+///
+/// ```
+///   %1 = shufflevector <n x ty1> %0, <n x ty1> poison <n x ty2> mask
+///
+///   %2 = tail call <n x ty1> llvm.<umin/umax/smin/smax>(<n x ty1> %0, <n x
+///   ty1> %1)
+///     OR
+///   %2 = add/mul/or/and/xor <n x ty1> %0, %1
+///
+///   %3 = shufflevector <n x ty1> %2, <n x ty1> poison <n x ty2> mask
+///   ...
+///   ...
+///   %(i - 1) = tail call <n x ty1> llvm.<umin/umax/smin/smax>(<n x ty1> %(i -
+///   3), <n x ty1> %(i - 2)
+///     OR
+///   %(i - 1) = add/mul/or/and/xor <n x ty1> %(i - 3), %(i - 2)
+///
+///   %(i) = extractelement <n x ty1> %(i - 1), 0
+/// ```
+///
+/// Where:
+///    `mask` follows a partition pattern:
+///
+/// Ex:
+///    [n = 8, p = poison]
+///
+///    4 5 6 7 | p p p p
+///    2 3 | p p p p p p
+///    1 | p p p p p p p
+///
+///    For powers of 2, there's a consistent pattern, but for other cases
+///    the parity of the current half value at each step decides the
+///    next partition half (see `ExpectedParityMask` for more logical details
+///    in generalising this).
+///
+/// Ex:
+///    [n = 6]
+///
+///    3 4 5 | p p p
+///    1 2 | p p p p
+///    1 | p p p p p
+bool VectorCombine::foldShuffleChainsToReduce(Instruction &I) {
+  // Going bottom-up for the pattern.
+  auto *EEI = dyn_cast<ExtractElementInst>(&I);
+  if (!EEI)
+    return false;
+
+  std::queue<Value *> InstWorklist;
+  InstructionCost OrigCost = 0;
+
+  Value *InitEEV = nullptr;
+
+  // Common instruction operation after each shuffle op.
+  unsigned int CommonCallOp = 0;
+  Instruction::BinaryOps CommonBinOp = Instruction::BinaryOpsEnd;
+
+  bool IsFirstCallOrBinInst = true;
+  bool ShouldBeCallOrBinInst = true;
----------------
Rajveer100 wrote:

To clarify: the worklist currently lets us handle multiple pushes for the instructions above. Without it, we would need two separate variables to keep track of them, something like:

```c++
Value *CurInst = PrevVecV[2], *NextInst = nullptr;
// InstWorklist.push(PrevVecV[2]);
```

When we reach:

```c++
InstWorklist.push(PrevVecV[1]);
InstWorklist.push(PrevVecV[0]);
```

we would instead have to set `CurInst = PrevVecV[1]` and `NextInst = PrevVecV[0]`. Then, when we come back to the top of the loop, we would need to check whether `NextInst` is null and, if it is not, reassign it to `CurInst`.

Don't you think it's much simpler to handle this the way it's done currently? The alternative unnecessarily adds more variables, and we already have quite a few before the loop.
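
To make the comparison concrete, here is a minimal standalone sketch. It is not the PR's code: `Node`, `visitWithQueue`, and `visitWithTwoVars` are hypothetical names, and `Op0`/`Op1` merely stand in for `PrevVecV[0]`/`PrevVecV[1]`. It only illustrates the bookkeeping difference between the two approaches on a small extract/binop/shuffle-shaped chain.

```c++
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR value: a name plus up to two operands.
struct Node {
  std::string Name;
  const Node *Op0;
  const Node *Op1;
  Node(std::string N, const Node *A = nullptr, const Node *B = nullptr)
      : Name(std::move(N)), Op0(A), Op1(B) {}
};

// Worklist version: a visit may enqueue one or two operands, and the
// loop header needs no extra state beyond the queue itself.
std::vector<std::string> visitWithQueue(const Node *Root) {
  std::vector<std::string> Order;
  std::queue<const Node *> Worklist;
  Worklist.push(Root);
  while (!Worklist.empty()) {
    const Node *Cur = Worklist.front();
    Worklist.pop();
    Order.push_back(Cur->Name);
    // Same push order as in the discussion: operand 1 first, then operand 0.
    if (Cur->Op1)
      Worklist.push(Cur->Op1);
    if (Cur->Op0)
      Worklist.push(Cur->Op0);
  }
  return Order;
}

// Two-variable version: Cur is the node being visited and Next is the
// single pending node, which has to be rotated in by hand.
std::vector<std::string> visitWithTwoVars(const Node *Root) {
  std::vector<std::string> Order;
  const Node *Cur = Root, *Next = nullptr;
  while (Cur) {
    Order.push_back(Cur->Name);
    const Node *First = Cur->Op1 ? Cur->Op1 : Cur->Op0;
    const Node *Second = Cur->Op1 ? Cur->Op0 : nullptr;
    if (First) {
      // Assumption of this sketch: nothing else is pending at this point,
      // otherwise two variables are not enough.
      assert(!Next && "two variables cannot hold a third pending node");
      Cur = First;
      Next = Second;
    } else {
      // Nothing new to visit from here: fall back to the pending node.
      Cur = Next;
      Next = nullptr;
    }
  }
  return Order;
}
```

On a chain such as `extract -> bin2(bin1, shuf2)` with `bin1(v, shuf1)`, both functions visit the nodes in the same order, but the second one needs the extra null check and reassignment, and it only works while at most one node is pending.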

https://github.com/llvm/llvm-project/pull/145232