[Mlir-commits] [mlir] [mlir][scf] Extend consumer fuse to nested loop structure (PR #94190)
llvmlistbot at llvm.org
Thu Aug 29 01:39:52 PDT 2024
================
@@ -1464,12 +1464,36 @@ checkAssumptionForFusingConsumer(tensor::InsertSliceOp candidateSliceOp) {
/// failure otherwise.
static FailureOr<OpOperand *> getConsumerFromUses(Value val,
Block *containingOpBlock) {
- // Step 1. Check that the value has exactly one use.
- if (!llvm::hasSingleElement(val.getUses()))
- return failure();
+ // Step 1. Check that the value has exactly one use excluding `insertSliceOp`
+ // or `ParallelInsertSliceOp`.
+ OpOperand *operand = nullptr;
----------------
Yun-Fly wrote:
FYI: Interestingly, fusing a consumer into a nested loop structure will **certainly** lead to multiple uses. Consider the example below:
```
%1 = scf.forall() {
extract ...
%2 = scf.for() {
tiledProducer
insert_slice
yield
}
scf.forall.in_parallel {
tensor.parallel_insert_slice %2
}
}
%3 = consumer ins(%1)
```
After the first iteration of fusion, the resulting IR looks like:
```
%1:2 = scf.forall() {
extract ...
%2 = scf.for() {
tiledProducer
insert_slice
yield
}
// current upstream implementation will create a pair of `insert_slice` and `extract_slice`
%4 = insert_slice %2
%5 = extract_slice %4
%6 = tiledConsumer ins(%2)
scf.forall.in_parallel {
tensor.parallel_insert_slice %6
tensor.parallel_insert_slice %2
}
}
%3 = consumer ins(%1#2)
```
As you can see, if we want to fuse `tiledConsumer` further into the inner loop, we must deal with multiple uses, because `%2` actually has three users.
Although we have added support for multiple uses downstream, I would rather not upstream a formal solution in this patch. Instead, this workaround covers the simple use case for now.
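To make the "exactly one use, excluding `insert_slice` and `parallel_insert_slice`" rule concrete, here is a rough, self-contained sketch of the filtering logic. It does not use the MLIR API and is not the code in this patch: `Use`, `ownerOpName`, and `findSingleConsumerUse` are hypothetical names that model a use list as plain strings.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for an operand use: only the owning op's name.
struct Use {
  std::string ownerOpName;
};

// Returns the index of the only use whose owner is neither an insert_slice
// nor a parallel_insert_slice. Returns std::nullopt when there is no such
// use or when there is more than one.
std::optional<std::size_t>
findSingleConsumerUse(const std::vector<Use> &uses) {
  std::optional<std::size_t> found;
  for (std::size_t i = 0; i < uses.size(); ++i) {
    const std::string &name = uses[i].ownerOpName;
    // Skip the uses that consumer fusion into nested loops always creates.
    if (name == "tensor.insert_slice" ||
        name == "tensor.parallel_insert_slice")
      continue;
    // A second remaining use means the value is not a simple candidate.
    if (found)
      return std::nullopt;
    found = i;
  }
  return found;
}
```

Applied to the three users of `%2` in the IR above (`insert_slice`, `tiledConsumer`, `parallel_insert_slice`), this sketch would pick out `tiledConsumer` as the only remaining use.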
https://github.com/llvm/llvm-project/pull/94190