[Mlir-commits] [mlir] [mlir][scf] Extend fuse producer to multi-level candidates case (PR #97803)

Wed Sep 18 23:09:25 PDT 2024

================
@@ -949,6 +949,145 @@ mlir::scf::tileAndFuseProducerOfSlice(
                                            tileAndFuseResult->tiledOps};
 }
 
+/// Get the real producer from candidate ExtractSliceOp
+///
+/// ```
+/// %0 = producer
+/// %1 = scf.for(%arg1 = %0)
+///   %2 = extract %arg1
+///   %3 = scf.for(%arg2 = %2)
+///      %4 = extract %args2
+///      ...
+/// ```
+///
+/// @param candidateSliceOp: %4 = extract %args2
+/// @param backwardSlice: in-out parameter populated by backward extractSliceOps
+/// @return OpResult Producer : %0 = producer
+static FailureOr<OpResult> getRealProducerFromExtractSliceOp(
----------------
MaheshRavishankar wrote:

Thanks for the explanation. I see the issues a bit more clearly now. I haven't reviewed the code yet, and I don't know if this has been rebased on top of what was submitted.
But here is the complexity that I am concerned about. In your example, you need to keep track somewhere of the sequence of extract slices that you walk to get to the producer, because the actual offset and size you need are obtained by "combining" the offsets and sizes of all the slices to get the "real" offset and size. Also, I am not convinced that it is infeasible to fuse the first extract slice with the producer and then fuse the second extract slice with the (now tiled) producer. That should always be possible.
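
To make the "combining" concrete, here is a minimal standalone C++ sketch of how two nested slices along one dimension collapse into a single slice of the original tensor. `Slice1D` and `composeSlices` are hypothetical names used only for illustration, not existing MLIR APIs, and the sketch assumes static, non-negative offsets and strides.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: one dimension of an extract_slice-like op.
struct Slice1D {
  int64_t offset; // first element taken from the source
  int64_t size;   // number of elements taken
  int64_t stride; // step between consecutive elements in the source
};

// Compose `inner` (a slice of the result of `outer`) with `outer`
// (a slice of the original source) into one slice of the original source.
// Element i of the inner slice is element (inner.offset + i * inner.stride)
// of the outer slice, which in turn is element
// outer.offset + (inner.offset + i * inner.stride) * outer.stride
// of the original source.
Slice1D composeSlices(const Slice1D &outer, const Slice1D &inner) {
  return Slice1D{outer.offset + inner.offset * outer.stride,
                 inner.size,
                 outer.stride * inner.stride};
}
```

For example, composing an outer slice `{8, 16, 1}` with an inner slice `{4, 4, 1}` gives `{12, 4, 1}`. With more levels of nesting, this composition has to be repeated for every slice in the chain and for every dimension, which is the bookkeeping a multi-level walk has to carry.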

So if you start with this

```
%0 = producerOp
scf.for(%1=%0)
   %2 = extract_slice %1
   scf.for(%3=%2)
      %4 = extract_slice %3
      ... 
      yield %x
```

you should always be able to do

```
scf.for(%1=%0)
   %2 = tiled producer 1
   scf.for(%3=%2)
      %4 = extract_slice %3
      ... 
      yield %x
```

and then you do 

```
scf.for(%1=%0)
   scf.for(%3=%2)
      %4 = tiled producer 2
      ... 
      yield %x
```

The amount of state you need to carry during the transformation to fuse with one extract slice is prohibitively high IMO. That will make it hard to change/fix the transformation. 
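
As a rough illustration of the step-by-step alternative, the sketch below models a producer as an elementwise function and fusion with a slice as re-indexing. The names `Tile`, `Producer`, `fuseWithSlice`, and `materialize` are hypothetical and do not correspond to MLIR APIs, and the model assumes a 1-D, unit-stride, elementwise producer. Under those assumptions, each fusion step only needs the offset of the slice being fused, and the final result matches what a single composed slice would compute.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical 1-D, unit-stride tile description.
struct Tile {
  int64_t offset;
  int64_t size;
};

// Hypothetical model of an elementwise producer: index -> value.
using Producer = std::function<int64_t(int64_t)>;

// "Fuse" a producer with one slice: the tiled producer is indexed
// relative to the start of the tile. No other slice is consulted.
Producer fuseWithSlice(Producer producer, Tile tile) {
  return [producer, tile](int64_t i) { return producer(tile.offset + i); };
}

// Evaluate a producer over a tile and return the resulting values.
std::vector<int64_t> materialize(const Producer &producer, Tile tile) {
  std::vector<int64_t> values;
  values.reserve(static_cast<size_t>(tile.size));
  for (int64_t i = 0; i < tile.size; ++i)
    values.push_back(producer(tile.offset + i));
  return values;
}
```

Fusing with the outer slice first and then with the inner slice, as in `fuseWithSlice(fuseWithSlice(p, outer), inner)`, yields the same values as evaluating `p` directly over the tile `{outer.offset + inner.offset, inner.size}`, without ever tracking the whole chain of slices at once.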

https://github.com/llvm/llvm-project/pull/97803