[Mlir-commits] [mlir] [mlir][scf] Fix `FoldTensorCastOfOutputIntoForallOp` for multi-result scf.forall (PR #173271)
Longsheng Mou
llvmlistbot at llvm.org
Mon Dec 22 17:30:45 PST 2025
================
@@ -1986,6 +1986,15 @@ struct FoldTensorCastOfOutputIntoForallOp
     if (tensorCastProducers.empty())
       return failure();
+    llvm::SmallMapVector<Operation *, int64_t, 2> yieldOpToIterArgsIndex;
+    for (auto [index, iterArg] :
+         llvm::enumerate(forallOp.getRegionIterArgs())) {
+      for (Operation *user : iterArg.getUsers()) {
+        if (isa<ParallelCombiningOpInterface>(user))
----------------
CoTinker wrote:
Actually, multiple `tensor.parallel_insert_slice` ops updating the same init argument do not affect the replacement. If a single `tensor.parallel_insert_slice` per init argument is required, I think this should be enforced in the verifier.
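
To illustrate the point, here is a minimal standalone sketch, not the code in the PR. It assumes the map is keyed by the combining op, as `yieldOpToIterArgsIndex` in the hunk above appears to be. `ToyCombiningOp` and `mapOpsToIterArgIndex` are hypothetical names that stand in for MLIR types and use only the standard library. Under that assumption, two ops writing into the same init argument simply produce two entries with the same index, so a per-op lookup stays unambiguous.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an op such as tensor.parallel_insert_slice.
// This is not an MLIR type; it only records which shared output it updates.
struct ToyCombiningOp {
  std::string name;
  int64_t destIterArg; // index of the init argument this op writes into
};

// Build an op -> iter-arg-index map. Because the key is the op itself,
// several ops may share the same index without colliding.
std::map<const ToyCombiningOp *, int64_t>
mapOpsToIterArgIndex(const std::vector<ToyCombiningOp> &ops) {
  std::map<const ToyCombiningOp *, int64_t> result;
  for (const ToyCombiningOp &op : ops)
    result[&op] = op.destIterArg;
  return result;
}
```

A map keyed the other way round (index to op) would keep only one op per init argument, which is the case where a single-op-per-argument restriction would matter.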
https://github.com/llvm/llvm-project/pull/173271