[llvm] [SLP][AMDGPU] Vectorize operands of non-trivially-vectorizable intrinsic calls (PR #189784)
Alexey Bataev via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 3 14:46:09 PDT 2026
================
@@ -29477,6 +29507,28 @@ bool SLPVectorizerPass::vectorizeChainsInBlock(BasicBlock *BB, BoUpSLP &R) {
PostProcessCmps.insert(cast<CmpInst>(&*It));
}
+ // Collect operands of non-trivially vectorizable intrinsic calls (e.g.,
+ // llvm.amdgcn.exp2) and group by intrinsic ID, so their operands can be
+ // vectorized independently.
+ // FIXME: Extend for all non-vectorized functions.
+ SmallMapVector<std::pair<Intrinsic::ID, unsigned>, SmallVector<Value *, 4>, 4>
+ OpcodeGroups;
+
+ for (Instruction &I : *BB) {
+ if (R.isDeleted(&I))
+ continue;
+ SmallVector<Value *, 4> Ops =
+ getNonTriviallyVectorizableIntrinsicCallOperand(&I);
+ if (!Ops.empty()) {
+ Intrinsic::ID ID = cast<CallInst>(&I)->getIntrinsicID();
+ for (Value *Op : Ops)
+ if (auto *OpI = dyn_cast<Instruction>(Op))
+ OpcodeGroups[{ID, OpI->getOpcode()}].push_back(Op);
----------------
alexey-bataev wrote:
Shall the groups be built by the argument number? I think the same operands more often build a homogeneous list of instructions
https://github.com/llvm/llvm-project/pull/189784
More information about the llvm-commits mailing list