[Mlir-commits] [mlir] [mlir][scf] Extend consumer fuse to nested loop structure (PR #94190)

llvmlistbot at llvm.org llvmlistbot at llvm.org
Thu Jun 6 10:03:19 PDT 2024


MaheshRavishankar wrote:

Let's start with your example above. I think your input is

```
%0 = linalg.fill ... outs(%empty)
%1 = linalg.matmul ... outs(%0)
%2 = linalg.add (..., %1)
```

You can first tile the `linalg.matmul`


```
%0 = linalg.fill ... outs(%empty)
%1 = scf.forall ... shared_outs(%arg0 = %0) {
   %2 = tensor.extract_slice %arg0[...]
   %3 = linalg.matmul ... outs(%2)
   scf.forall.in_parallel {
       tensor.parallel_insert_slice %3 into %arg0
   }
}
%2 = linalg.add (..., %1)
```

You can then fuse the `fill` (as a producer) and the `add` (as a consumer) into the loop to get

```
%0 = scf.forall ... shared_outs(%arg0 = %empty) {
    %1 = tensor.extract_slice %arg0
    %2 = linalg.fill ... outs(%1)
    %3 = linalg.matmul ... outs(%2)
    %4 = linalg.add (..., %3)
    scf.forall.in_parallel {
        tensor.parallel_insert_slice %4 into %arg0
    }
}
```

Now you can apply the same two steps again for the second level of tiling, using `scf.for` instead of `scf.forall`. Doesn't that give you what you are looking for?
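
For reference, a rough sketch of what that second level might look like. This assumes the inner `scf.for` tiles a parallel dimension of the matmul, so that the `fill` and the `add` can still be fused into it; tiling the reduction dimension would not allow this form. The value names, the `iter_args` plumbing, and the elided operands are illustrative only, not output from an actual run:

```
%0 = scf.forall ... shared_outs(%arg0 = %empty) {
    %1 = tensor.extract_slice %arg0[...]
    // Second-level tile loop over the slice owned by this forall iteration.
    %2 = scf.for ... iter_args(%arg1 = %1) {
        %3 = tensor.extract_slice %arg1[...]
        %4 = linalg.fill ... outs(%3)
        %5 = linalg.matmul ... outs(%4)
        %6 = linalg.add (..., %5)
        // scf.for carries the result through iter_args, so the tile is
        // written back with a plain insert_slice and yielded.
        %7 = tensor.insert_slice %6 into %arg1[...]
        scf.yield %7
    }
    scf.forall.in_parallel {
        tensor.parallel_insert_slice %2 into %arg0
    }
}
```

The inner loop is produced the same way as the outer one: tile the `linalg.matmul` first, then fuse the `fill` producer and the `add` consumer into the resulting loop.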


https://github.com/llvm/llvm-project/pull/94190

