[Mlir-commits] [mlir] [mlir][scf] Fix `FoldTensorCastOfOutputIntoForallOp` for multi-result scf.forall (PR #173271)

Longsheng Mou llvmlistbot at llvm.org
Mon Dec 22 17:30:45 PST 2025


================
@@ -1986,6 +1986,15 @@ struct FoldTensorCastOfOutputIntoForallOp
     if (tensorCastProducers.empty())
       return failure();
 
+    llvm::SmallMapVector<Operation *, int64_t, 2> yieldOpToIterArgsIndex;
+    for (auto [index, iterArg] :
+         llvm::enumerate(forallOp.getRegionIterArgs())) {
+      for (Operation *user : iterArg.getUsers()) {
+        if (isa<ParallelCombiningOpInterface>(user))
----------------
CoTinker wrote:

Actually, multiple `tensor.parallel_insert_slice` ops updating the same init argument do not affect the replacement. If exactly one `tensor.parallel_insert_slice` per init argument is required, I think that should be enforced in the verifier.
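
For illustration, a minimal sketch of the case in question: a single `scf.forall` whose one shared output is written by two `tensor.parallel_insert_slice` ops in the same `scf.forall.in_parallel` region. The function name, shapes, and offsets are hypothetical and chosen only so that the two slices do not overlap; this is not taken from the PR's tests, and whether the verifier should accept it is the open question above.

```mlir
// Hypothetical example: two parallel_insert_slice ops target the same
// shared_outs block argument (%out). Offsets %i (0..3) and %j (4..7)
// are disjoint, so the two writes never touch the same element.
func.func @two_inserts_same_init(%init: tensor<8xf32>,
                                 %a: tensor<1xf32>,
                                 %b: tensor<1xf32>) -> tensor<8xf32> {
  %c4 = arith.constant 4 : index
  %res = scf.forall (%i) in (4) shared_outs(%out = %init) -> (tensor<8xf32>) {
    %j = arith.addi %i, %c4 : index
    scf.forall.in_parallel {
      tensor.parallel_insert_slice %a into %out[%i] [1] [1]
          : tensor<1xf32> into tensor<8xf32>
      tensor.parallel_insert_slice %b into %out[%j] [1] [1]
          : tensor<1xf32> into tensor<8xf32>
    }
  }
  return %res : tensor<8xf32>
}
```

Both insert ops are users of the same region iter arg, so a map keyed by the combining op would associate each of them with the same iter-arg index.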

https://github.com/llvm/llvm-project/pull/173271

