[llvm] [SLP][AMDGPU] Vectorize operands of non-trivially-vectorizable intrinsic calls (PR #189784)

Sun Apr 5 18:59:48 PDT 2026

================
@@ -29477,6 +29507,28 @@ bool SLPVectorizerPass::vectorizeChainsInBlock(BasicBlock *BB, BoUpSLP &R) {
       PostProcessCmps.insert(cast<CmpInst>(&*It));
   }
 
+  // Collect operands of non-trivially vectorizable intrinsic calls (e.g.,
+  // llvm.amdgcn.exp2) and group by intrinsic ID, so their operands can be
+  // vectorized independently.
+  // FIXME: Extend for all non-vectorized functions.
+  SmallMapVector<std::pair<Intrinsic::ID, unsigned>, SmallVector<Value *, 4>, 4>
+      OpcodeGroups;
+
+  for (Instruction &I : *BB) {
----------------
mssefat wrote:

I separated seed collection from vectorization intentionally, so that we collect the full set of candidates and group them before vectorizing. tryToVectorizeList may delete IR, so calling it while still iterating over the block could miss some vectorization opportunities.
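
To illustrate the collect-then-vectorize split, here is a minimal standalone sketch. It is not the code from this patch: `Inst`, `collectSeeds`, and `processGroups` are hypothetical stand-ins, a `std::list` plays the role of the basic block, and the grouping key is simplified to a single integer ID rather than the `std::pair<Intrinsic::ID, unsigned>` used in the diff.

```cpp
#include <list>
#include <map>
#include <vector>

// Hypothetical stand-in for an instruction: an intrinsic ID plus one operand.
struct Inst {
  int IntrinsicID;
  int Operand;
};

using Block = std::list<Inst>;
using Groups = std::map<int, std::vector<int>>;

// Phase 1: walk the block once without mutating it and group the operands
// by intrinsic ID. Every candidate is seen before anything is rewritten.
Groups collectSeeds(const Block &BB) {
  Groups G;
  for (const Inst &I : BB)
    G[I.IntrinsicID].push_back(I.Operand);
  return G;
}

// Phase 2: process each group. Erasing from the block is safe here because
// no iteration over the block is in flight. Returns the number of groups
// that were "vectorized" (in this sketch: groups with at least two operands).
unsigned processGroups(Block &BB, const Groups &G) {
  unsigned NumVectorized = 0;
  for (const auto &Entry : G) {
    const int ID = Entry.first;
    if (Entry.second.size() < 2)
      continue; // A single operand cannot form a vector.
    // Stand-in for the IR deletion that real vectorization may perform.
    BB.remove_if([ID](const Inst &I) { return I.IntrinsicID == ID; });
    ++NumVectorized;
  }
  return NumVectorized;
}
```

Because the second phase only starts after the walk over the block has finished, deletions performed while processing one group cannot cause candidates for another group to be skipped.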

https://github.com/llvm/llvm-project/pull/189784