[Mlir-commits] [mlir] [mlir][PartialReductionTilingInterface] Generalize implementation of `tileUsingSCF` for `ReductionTilingStrategy::PartialOuterReduction`. (PR #143467)

Mon Sep 22 07:23:52 PDT 2025

PietroGhg wrote:

> > Hi @MaheshRavishankar, I have a question about a change you made in this PR: you added a check for parallel dimensions and `reductionStrategy != ReductionTilingStrategy::FullReduction` (`mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp` line 202 in this diff). I have a usage of `PartialTilingInterface` that has both parallel and reduction iteration types, which now fails due to this check, and I was wondering about the rationale behind it. Can you explain why it's needed? Thank you :)
> 
> Could you please give me more info about your use case? From my understanding, those are really two separate "transformation steps", so any case of doing both would do parallel tiling first and then do partial reduction tiling. When the parallel dimensions are tiled together with the reduction dimensions for partial tiling, some things get a bit tricky. For example, how do you control where to insert the new parallel dimension created by partial reduction tiling? We could probably combine the two, but I wanted to clean up what is in tree first before adding more features to it. I am happy to iterate on your use case if there is a gap because of this change.

Thanks @MaheshRavishankar for the quick reply. In my use case I have a "reduction" op that takes in a 3-D tensor where the outermost dim acts as a batching dim, and performs a column-wise reduction (adding up the values down each column), e.g. `myred(%arg) : tensor<10x128x512> -> tensor<10x512>`. Note that the op doesn't implement destination-passing style, so I couldn't figure out how to implement tiling for this op using the `FullReduction` tiling strategy. Instead, I implemented it using the `partialReductionTilingInterface`, so the batching and column dims are "parallel", while the row dim is a reduction dim. I also implemented tiling similarly for a `matmul` op (again, a custom op that doesn't implement destination-passing style), where the batching, M, and N dims are parallel, while the K dim is a reduction dim.
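
To make the shape of the computation concrete, here is a minimal pure-Python sketch of the general idea behind partial reduction tiling for an op like `myred`. It is not the MLIR implementation: the function names, the nested-list tensor layout, and the choice to keep an extra accumulator dimension of size `tile` are all assumptions made for illustration. Only the reduction (row) dim is tiled; the batching and column dims are left untiled.

```python
def myred_reference(x):
    """Untiled reference: sum over the middle (row) dim, [B][R][C] -> [B][C]."""
    B, R, C = len(x), len(x[0]), len(x[0][0])
    return [[sum(x[b][r][c] for r in range(R)) for c in range(C)]
            for b in range(B)]


def myred_partial_tiled(x, tile):
    """Hypothetical partial-reduction tiling of the row dim with tile size `tile`.

    Step 1: a sequential loop over row tiles accumulates elementwise into a
            partial result of shape [B][tile][C] (one extra dimension).
    Step 2: a final merge reduces that extra dimension to get [B][C].
    """
    B, R, C = len(x), len(x[0]), len(x[0][0])

    # Partial accumulator, initialised with the additive identity.
    partial = [[[0] * C for _ in range(tile)] for _ in range(B)]

    # Step 1: loop over tiles of the reduction dim.
    for r0 in range(0, R, tile):
        # The last tile may be smaller when `tile` does not divide R.
        for t in range(min(tile, R - r0)):
            for b in range(B):          # parallel dim (batch), untiled here
                for c in range(C):      # parallel dim (column), untiled here
                    partial[b][t][c] += x[b][r0 + t][c]

    # Step 2: merge the partial results along the extra dimension.
    return [[sum(partial[b][t][c] for t in range(tile)) for c in range(C)]
            for b in range(B)]


if __name__ == "__main__":
    # Small stand-in for tensor<10x128x512>: shape 2x5x3, integer data.
    x = [[[b * 100 + r * 10 + c for c in range(3)] for r in range(5)]
         for b in range(2)]
    assert myred_partial_tiled(x, tile=2) == myred_reference(x)
```

In this sketch the extra `tile`-sized dimension of the partial result sits right where the reduction dim was. Tiling the batching or column dims in the same step would add further loops around step 1, which is where the question of where to place that new dimension comes in.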

https://github.com/llvm/llvm-project/pull/143467