<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/61604>61604</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
MLIR Affine Dialect Loop Fusion pass appears to not work
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
rohany
</td>
</tr>
</table>
<pre>
I’m experimenting with the Affine dialect of MLIR, in particular the Loop Fusion pass. I haven’t been able to get it to fuse any loops when run through `mlir-opt`, even on simple examples such as the ones from the documentation page: https://mlir.llvm.org/docs/Passes/#-affine-loop-fusion-fuse-affine-loop-nests
At a high level, I’m doing the following:
* Copying some MLIR source with fusable loops into a file `testing.mlir`
* Running `bin/mlir-opt testing.mlir --affine-loop-fusion --dump-pass-pipeline`
Here are the concrete input and output:
```
➜ build git:(main) ✗ cat ../testing.mlir
func.func @producer_consumer_fusion(%arg0: memref<10xf32>, %arg1: memref<10xf32>) {
  %0 = memref.alloc() : memref<10xf32>
  %1 = memref.alloc() : memref<10xf32>
  %cst = arith.constant 0.000000e+00 : f32
  affine.for %arg2 = 0 to 10 {
    affine.store %cst, %0[%arg2] : memref<10xf32>
    affine.store %cst, %1[%arg2] : memref<10xf32>
  }
  affine.for %arg2 = 0 to 10 {
    %2 = affine.load %0[%arg2] : memref<10xf32>
    %3 = arith.addf %2, %2 : f32
    affine.store %3, %arg0[%arg2] : memref<10xf32>
  }
  affine.for %arg2 = 0 to 10 {
    %2 = affine.load %1[%arg2] : memref<10xf32>
    %3 = arith.mulf %2, %2 : f32
    affine.store %3, %arg1[%arg2] : memref<10xf32>
  }
  return
}
func.func @sibling_fusion(%arg0: memref<10x10xf32>, %arg1: memref<10x10xf32>,
                          %arg2: memref<10x10xf32>, %arg3: memref<10x10xf32>,
                          %arg4: memref<10x10xf32>) {
  affine.for %arg5 = 0 to 3 {
    affine.for %arg6 = 0 to 3 {
      %0 = affine.load %arg0[%arg5, %arg6] : memref<10x10xf32>
      %1 = affine.load %arg1[%arg5, %arg6] : memref<10x10xf32>
      %2 = arith.mulf %0, %1 : f32
      affine.store %2, %arg3[%arg5, %arg6] : memref<10x10xf32>
    }
  }
  affine.for %arg5 = 0 to 3 {
    affine.for %arg6 = 0 to 3 {
      %0 = affine.load %arg0[%arg5, %arg6] : memref<10x10xf32>
      %1 = affine.load %arg2[%arg5, %arg6] : memref<10x10xf32>
      %2 = arith.addf %0, %1 : f32
      affine.store %2, %arg4[%arg5, %arg6] : memref<10x10xf32>
    }
  }
  return
}
➜ build git:(main) ✗ bin/mlir-opt ../testing.mlir --affine-loop-fusion --dump-pass-pipeline
Pass Manager with 1 passes:
builtin.module(affine-loop-fusion{fusion-compute-tolerance=3.000000e-01 fusion-fast-mem-space=0 fusion-local-buf-threshold=0 fusion-maximal=false mode=producer})
module {
  func.func @producer_consumer_fusion(%arg0: memref<10xf32>, %arg1: memref<10xf32>) {
    %alloc = memref.alloc() : memref<10xf32>
    %alloc_0 = memref.alloc() : memref<10xf32>
    %cst = arith.constant 0.000000e+00 : f32
    affine.for %arg2 = 0 to 10 {
      affine.store %cst, %alloc[%arg2] : memref<10xf32>
      affine.store %cst, %alloc_0[%arg2] : memref<10xf32>
    }
    affine.for %arg2 = 0 to 10 {
      %0 = affine.load %alloc[%arg2] : memref<10xf32>
      %1 = arith.addf %0, %0 : f32
      affine.store %1, %arg0[%arg2] : memref<10xf32>
    }
    affine.for %arg2 = 0 to 10 {
      %0 = affine.load %alloc_0[%arg2] : memref<10xf32>
      %1 = arith.mulf %0, %0 : f32
      affine.store %1, %arg1[%arg2] : memref<10xf32>
    }
    return
  }
  func.func @sibling_fusion(%arg0: memref<10x10xf32>, %arg1: memref<10x10xf32>, %arg2: memref<10x10xf32>, %arg3: memref<10x10xf32>, %arg4: memref<10x10xf32>) {
    affine.for %arg5 = 0 to 3 {
      affine.for %arg6 = 0 to 3 {
        %0 = affine.load %arg0[%arg5, %arg6] : memref<10x10xf32>
        %1 = affine.load %arg1[%arg5, %arg6] : memref<10x10xf32>
        %2 = arith.mulf %0, %1 : f32
        affine.store %2, %arg3[%arg5, %arg6] : memref<10x10xf32>
      }
    }
    affine.for %arg5 = 0 to 3 {
      affine.for %arg6 = 0 to 3 {
        %0 = affine.load %arg0[%arg5, %arg6] : memref<10x10xf32>
        %1 = affine.load %arg2[%arg5, %arg6] : memref<10x10xf32>
        %2 = arith.addf %0, %1 : f32
        affine.store %2, %arg4[%arg5, %arg6] : memref<10x10xf32>
      }
    }
    return
  }
}
```
A related problem is that I seem to be unable to set the `mode` option of the loop fusion pass. Even if I pass `--affine-loop-fusion="mode=greedy"`, the pass pipeline dump always reports `mode=producer`.
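For reference, a minimal sketch of two invocations that may help narrow this down. The first spells the option with ASCII `--` and straight quotes. The second uses `--pass-pipeline` to anchor the pass on `func.func` instead of `builtin.module`; whether that changes the fusion result is an assumption and has not been verified here. The `MLIR_OPT`, `INPUT`, and `PIPELINE` variables are illustrative names, and the paths match the run above.

```shell
# Illustrative variables; the paths assume the same build directory as the run above.
MLIR_OPT="${MLIR_OPT:-bin/mlir-opt}"
INPUT="../testing.mlir"
# Assumption: nest the pass under func.func rather than running it on builtin.module.
PIPELINE='builtin.module(func.func(affine-loop-fusion{mode=greedy}))'

if [ -x "$MLIR_OPT" ]; then
  # Option passed with ASCII hyphens and straight quotes.
  "$MLIR_OPT" "$INPUT" --affine-loop-fusion="mode=greedy" --dump-pass-pipeline
  # Same pass, explicitly anchored on func.func via the textual pipeline.
  "$MLIR_OPT" "$INPUT" --pass-pipeline="$PIPELINE" --dump-pass-pipeline
else
  echo "mlir-opt not found at $MLIR_OPT"
fi
```

Comparing the `Pass Manager` lines printed by the two runs should show which operation the pass is anchored on and which `mode` value was actually parsed.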
</pre>