<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/61825">61825</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
mlir/Affine: sibling loop fusion pass missing opportunity when memrefs are allocated in function scope
</td>
</tr>
<tr>
<th>Labels</th>
<td>
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
rohany
</td>
</tr>
</table>
<pre>
The following MLIR file contains two functions with identical loop bodies, but only the first (`@f1`) is successfully fused:
```
func.func @f1(%input : memref<10xf32>, %output : memref<10xf32>, %reduc : memref<10xf32>) {
  %zero = arith.constant 0. : f32
  %one = arith.constant 1. : f32
  affine.for %i = 0 to 10 {
    %0 = affine.load %input[%i] : memref<10xf32>
    %2 = arith.addf %0, %one : f32
    affine.store %2, %output[%i] : memref<10xf32>
  }
  affine.for %i = 0 to 10 {
    %0 = affine.load %input[%i] : memref<10xf32>
    %1 = affine.load %reduc[0] : memref<10xf32>
    %2 = arith.addf %0, %1 : f32
    affine.store %2, %reduc[0] : memref<10xf32>
  }
  return
}
func.func @f2() {
  %input = memref.alloc() : memref<10xf32>
  %output = memref.alloc() : memref<10xf32>
  %reduc = memref.alloc() : memref<10xf32>
  %zero = arith.constant 0. : f32
  %one = arith.constant 1. : f32
  affine.for %i = 0 to 10 {
    %0 = affine.load %input[%i] : memref<10xf32>
    %2 = arith.addf %0, %one : f32
    affine.store %2, %output[%i] : memref<10xf32>
  }
  affine.for %i = 0 to 10 {
    %0 = affine.load %input[%i] : memref<10xf32>
    %1 = affine.load %reduc[0] : memref<10xf32>
    %2 = arith.addf %0, %1 : f32
    affine.store %2, %reduc[0] : memref<10xf32>
  }
  return
}
```
Running with `./bin/mlir-opt ../testing-2.mlir -pass-pipeline='builtin.module(func.func(affine-loop-fusion{mode=sibling}))'` yields:
```
module {
  func.func @f1(%arg0: memref<10xf32>, %arg1: memref<10xf32>, %arg2: memref<10xf32>) {
    %cst = arith.constant 0.000000e+00 : f32
    %cst_0 = arith.constant 1.000000e+00 : f32
    affine.for %arg3 = 0 to 10 {
      %0 = affine.load %arg0[%arg3] : memref<10xf32>
      %1 = arith.addf %0, %cst_0 : f32
      affine.store %1, %arg1[%arg3] : memref<10xf32>
      %2 = affine.load %arg0[%arg3] : memref<10xf32>
      %3 = affine.load %arg2[0] : memref<10xf32>
      %4 = arith.addf %2, %3 : f32
      affine.store %4, %arg2[0] : memref<10xf32>
    }
    return
  }
  func.func @f2() {
    %alloc = memref.alloc() : memref<10xf32>
    %alloc_0 = memref.alloc() : memref<10xf32>
    %alloc_1 = memref.alloc() : memref<10xf32>
    %cst = arith.constant 0.000000e+00 : f32
    %cst_2 = arith.constant 1.000000e+00 : f32
    affine.for %arg0 = 0 to 10 {
      %0 = affine.load %alloc[%arg0] : memref<10xf32>
      %1 = arith.addf %0, %cst_2 : f32
      affine.store %1, %alloc_0[%arg0] : memref<10xf32>
    }
    affine.for %arg0 = 0 to 10 {
      %0 = affine.load %alloc[%arg0] : memref<10xf32>
      %1 = affine.load %alloc_1[0] : memref<10xf32>
      %2 = arith.addf %0, %1 : f32
      affine.store %2, %alloc_1[0] : memref<10xf32>
    }
    return
  }
}
```
When the memrefs are allocated inside the function rather than passed in as arguments, the two sibling loops are not fused. Adding `-debug-only=affine-loop-fusion,loop-fusion-utils` produces no output that I can see, so I could not determine why sibling fusion is not applied in this case.
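
For reference, here is a hand-written sketch of the result I would expect for `@f2`, by analogy with the fused `@f1` above. This is not actual `mlir-opt` output; the SSA value names are assumptions modeled on the printed IR, and it assumes the pass would merge the loop bodies in the same order as it does for `@f1`:
```
func.func @f2() {
  %alloc = memref.alloc() : memref<10xf32>
  %alloc_0 = memref.alloc() : memref<10xf32>
  %alloc_1 = memref.alloc() : memref<10xf32>
  %cst = arith.constant 0.000000e+00 : f32
  %cst_2 = arith.constant 1.000000e+00 : f32
  // Single loop containing the bodies of both sibling loops.
  affine.for %arg0 = 0 to 10 {
    %0 = affine.load %alloc[%arg0] : memref<10xf32>
    %1 = arith.addf %0, %cst_2 : f32
    affine.store %1, %alloc_0[%arg0] : memref<10xf32>
    %2 = affine.load %alloc[%arg0] : memref<10xf32>
    %3 = affine.load %alloc_1[0] : memref<10xf32>
    %4 = arith.addf %2, %3 : f32
    affine.store %4, %alloc_1[0] : memref<10xf32>
  }
  return
}
```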
</pre>