[Mlir-commits] [mlir] [MLIR][SCF] Add an API to fuse consumer to a producer within scf loop (PR #88712)

donald chen llvmlistbot at llvm.org
Tue May 14 01:08:36 PDT 2024


cxy-1993 wrote:

> Hi everyone.
> 
> This PR contains the following :-
> 
> 1. Base commit by @cxy-1993 .
> 2. Implementation of fusion of consumers - unifying + tests too.
> 3. Implementation of tiling for tensor.unpack as a consumer - tests too.
> 
> Hi @ftynse @nicolasvasilache - I've tried unifying the implementation of scf.for/scf.forall - could you please take a look and let me know your thoughts/any pointers?
> 
> > Thanks for this patch @Abhishek-Varma! As #89893 is closed, please cherry-pick the commit and re-submit the code.
> 
> Hi @cxy-1993 - Your commit has now been added to this PR branch. Please take a look.
> 
> > What if the insertSliceOp located in nested scf.for
> 
> Hi @Yun-Fly @MaheshRavishankar - turns out that this feature itself opens another can of worms. Traversing to find the consumer is just the easy part - the difficulty arises to adjust rest of the code to take care of op dominance issues and a lot more. I'd therefore be sticking with the current implementation and not make this PR more complex than what it already is. We should definitely revisit this if there are imminent use-cases for the same. As a pointer that the current implementation will indeed cleanly exit in the above case, please take a look at [this code block](https://github.com/llvm/llvm-project/blob/39381b4ff8caf62e129ca1d50eae74a84b962bc7/mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp#L1170-L1172). A couple of pointers to consider while implementing the above use-case :-
> 
> 1. While we're at it - the nested structure can just be any block of an `SCF` region. Even if we fixate on `scf.yield` - its parent can be anything `ForOp`, `IfOp`, etc.
> 2. Let's, for now, keep it for the following case :-
>    a. scf.for -> scf.for -> .... -> scf.for
>    a. scf.for -> scf.for -> .... -> scf.forall
> 3. We then need to clone the entire structure RIGHT before the consumer op. RATIONALE: The other operands of the consumer needs to dominate the tiled consumer, so the easiest way is to clone the loop structure right before the unfused consumer - in this case all of these loop structure would get cloned. In the current basic implementation this is exactly what we do.
> 4. Now, the old candidate slice should be replaced by the cloned loop's candidate slice - this again requires you to traverse through the loop structure to find out exactly which insert slice op is going to be the new "old" candidate slice op. For scf.for this is trivial. For scf.forall it creates yet another IR inference gymnastics.
> 
> I did try coding until this point but it needs further massaging/debugging - so have stashed it, will strongly suggest to keep this PR deal with the very basic producer-consumer use-case first and implement features as per usage arises.
> 
> Let me know your thoughts on this.

Thanks, this patch look good to me now.

https://github.com/llvm/llvm-project/pull/88712


More information about the Mlir-commits mailing list