<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/130002>130002</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[mlir] Inconsistent output when executing MLIR program with `linalg-specialize-generic-ops`
</td>
</tr>
<tr>
<th>Labels</th>
<td>
mlir
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Emilyaxe
</td>
</tr>
</table>
<pre>
git version: 953838d
system: `Ubuntu 18.04.6 LTS`
## Description:
I am experiencing an inconsistent result when executing the same MLIR program with and without `linalg-specialize-generic-ops`.
## Steps to Reproduce:
### 1. **MLIR Program (a.mlir):**
```
module {
func.func private @printMemrefI32(tensor<*xi32>)
func.func private @printMemrefF32(tensor<*xf32>)
func.func @main() -> () {
%arg0 = index.constant 0
%6 = "tosa.const"() <{value = dense<-132> : tensor<1x2x1xi32>}> : () -> tensor<1x2x1xi32>
%11 = tosa.cast %6 : (tensor<1x2x1xi32>) -> tensor<1x2x1xf32>
%15 = "tosa.const"() <{value = dense<0> : tensor<1x2x1xi32>}> : () -> tensor<1x2x1xi32>
%16 = tosa.while_loop (%arg1 = %15) : (tensor<1x2x1xi32>) -> tensor<1x2x1xi32> {
%40 = "tosa.const"() <{value = dense<3> : tensor<1x2x1xi32>}> : () -> tensor<1x2x1xi32>
%41 = tosa.greater %40, %arg1 : (tensor<1x2x1xi32>, tensor<1x2x1xi32>) -> tensor<1x2x1xi1>
%extracted = tensor.extract %41[%arg0, %arg0, %arg0] : tensor<1x2x1xi1>
%from_elements = tensor.from_elements %extracted : tensor<i1>
tosa.yield %from_elements : tensor<i1>
} do {
^bb0(%arg1: tensor<1x2x1xi32>):
%40 = tosa.sin %11 : (tensor<1x2x1xf32>) -> tensor<1x2x1xf32>
%41 = tosa.slice %11 {size = array<i64: 1, 5, 6>, start = array<i64: 0, 9, 25>} : (tensor<1x2x1xf32>) -> tensor<1x5x6xf32>
%42 = tosa.erf %11 : (tensor<1x2x1xf32>) -> tensor<1x2x1xf32>
%43 = tosa.reverse %11 {axis = 0 : i32} : (tensor<1x2x1xf32>) -> tensor<1x2x1xf32>
%44 = tosa.greater_equal %41, %41 : (tensor<1x5x6xf32>, tensor<1x5x6xf32>) -> tensor<1x5x6xi1>
%45 = "tosa.const"() <{value = dense<1> : tensor<1x2x1xi32>}> : () -> tensor<1x2x1xi32>
%46 = tosa.add %arg1, %45 : (tensor<1x2x1xi32>, tensor<1x2x1xi32>) -> tensor<1x2x1xi32>
tosa.yield %46 : tensor<1x2x1xi32>
}
%17 = tosa.clamp %16 {max_fp = 1.07374182E+9 : f32, max_int = 1073741823 : i64, min_fp = -1.07374182E+9 : f32, min_int = -1073741824 : i64} : (tensor<1x2x1xi32>) -> tensor<1x2x1xi32>
%18 = tosa.clamp %16 {max_fp = 1.07374182E+9 : f32, max_int = 1073741823 : i64, min_fp = -1.07374182E+9 : f32, min_int = -1073741824 : i64} : (tensor<1x2x1xi32>) -> tensor<1x2x1xi32>
%19 = tosa.sub %17, %18 : (tensor<1x2x1xi32>, tensor<1x2x1xi32>) -> tensor<1x2x1xi32>
%cast19 = tensor.cast %19 : tensor<1x2x1xi32> to tensor<*xi32>
call @printMemrefI32(%cast19) : (tensor<*xi32>) -> ()
return
}
}
```
### 2. **Command to Run without `linalg-specialize-generic-ops` :**
```
/data/szy/MLIR/llvm-release/llvm-project/install/mlir-opt /data/szy/workspace/mlir-inconsistent/a.mlir -pass-pipeline="builtin.module(func.func(tosa-to-linalg-named))" \
| /data/szy/MLIR/llvm-release/llvm-project/install/mlir-opt -tosa-to-scf \
| /data/szy/MLIR/llvm-release/llvm-project/install/mlir-opt -pass-pipeline="builtin.module(func.func(tosa-to-linalg))" \
| /data/szy/MLIR/llvm-release/llvm-project/install/mlir-opt -tosa-to-tensor -tosa-to-arith -convert-scf-to-cf -convert-math-to-llvm \
--linalg-fuse-elementwise-ops --cse --linalg-generalize-named-ops -convert-arith-to-llvm \
-one-shot-bufferize="bufferize-function-boundaries" --linalg-fold-unit-extent-dims -finalize-memref-to-llvm --expand-strided-metadata \
-convert-linalg-to-affine-loops -convert-cf-to-llvm -convert-index-to-llvm -finalize-memref-to-llvm -lower-affine -convert-scf-to-cf \
-convert-arith-to-llvm -finalize-memref-to-llvm -convert-func-to-llvm -reconcile-unrealized-casts \
| timeout 10 /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-cpu-runner -e main -entry-point-result=void \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_c_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_async_runtime.so
```
### 3. **Output without `linalg-specialize-generic-ops` :**
```
[[[0],
[0]]]
```
### 4. **Command to Run with `linalg-specialize-generic-ops` :**
```
/data/szy/MLIR/llvm-release/llvm-project/install/mlir-opt /data/szy/workspace/mlir-inconsistent/a.mlir -pass-pipeline="builtin.module(func.func(tosa-to-linalg-named))" \
| /data/szy/MLIR/llvm-release/llvm-project/install/mlir-opt -tosa-to-scf \
| /data/szy/MLIR/llvm-release/llvm-project/install/mlir-opt -pass-pipeline="builtin.module(func.func(tosa-to-linalg))" \
| /data/szy/MLIR/llvm-release/llvm-project/install/mlir-opt -tosa-to-tensor -tosa-to-arith -convert-scf-to-cf -convert-math-to-llvm \
--linalg-fuse-elementwise-ops --cse --linalg-generalize-named-ops -convert-arith-to-llvm \
-one-shot-bufferize="bufferize-function-boundaries" --linalg-fold-unit-extent-dims -finalize-memref-to-llvm --expand-strided-metadata \
--linalg-specialize-generic-ops -convert-linalg-to-affine-loops -convert-cf-to-llvm -convert-index-to-llvm -finalize-memref-to-llvm -lower-affine -convert-scf-to-cf \
-convert-arith-to-llvm -finalize-memref-to-llvm -convert-func-to-llvm -reconcile-unrealized-casts \
| timeout 10 /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-cpu-runner -e main -entry-point-result=void \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_c_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_async_runtime.so
```
### 5. **Output with `linalg-specialize-generic-ops` :**
```
[[[3],
[3]]]
```
### 6. **Analysis for this case :**
This MLIR program is expected to output `[0, 0]` for `%19 = tosa.sub %17, %18`, because `%17` and `%18` are both produced from `%16` by identical `tosa.clamp` operations and are therefore equal. Instead, with the pass enabled, it outputs `[3, 3]`, which is the value of `%16`.
To debug this issue, I printed the IR after each pass and found that the IR is still correct immediately before the `--linalg-specialize-generic-ops` pass runs ([input.txt](https://github.com/user-attachments/files/19103618/input.txt)). As shown in the first image, `%reinterpret_cast_24` (the final result) is assigned the value of `%9`, which is a constant with the value 0. After `--linalg-specialize-generic-ops` runs ([output.txt](https://github.com/user-attachments/files/19103628/output.txt)), the second image shows that the `linalg.generic` operation has been mistakenly rewritten into `linalg.copy`, which propagates the value 3 from `%reinterpret_cast_19` to `%reinterpret_cast_24` and leads to the incorrect final result.
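
For reference, the expected value can be checked with a small plain-Python model of the data flow in `a.mlir`. This is only a sketch, not MLIR: `run_model` and its variable names are made up for illustration. It assumes that the ops in the loop body whose results are unused (`tosa.sin`, `tosa.slice`, `tosa.erf`, `tosa.reverse`, `tosa.greater_equal`) do not affect the result, and it models `tosa.clamp` on i32 using only `min_int` and `max_int`.

```python
# Plain-Python model of a.mlir (hypothetical helper, no MLIR APIs involved).
# The tensor<1x2x1xi32> values are modelled as flat lists of two ints.

MIN_INT = -1073741824  # min_int attribute of tosa.clamp
MAX_INT = 1073741823   # max_int attribute of tosa.clamp


def clamp(values):
    """Model of tosa.clamp on i32 elements (integer bounds only)."""
    return [min(max(v, MIN_INT), MAX_INT) for v in values]


def run_model():
    acc = [0, 0]  # %15 = dense<0>
    # Condition region: tosa.greater %40, %arg1 with %40 = dense<3>,
    # then only element [0, 0, 0] is extracted and yielded.
    while 3 > acc[0]:
        # Body region: only tosa.add %arg1, dense<1> feeds tosa.yield.
        acc = [v + 1 for v in acc]
    v16 = acc                                 # %16, loop result
    v17 = clamp(v16)                          # %17
    v18 = clamp(v16)                          # %18
    v19 = [a - b for a, b in zip(v17, v18)]   # %19 = tosa.sub %17, %18
    return v16, v19


if __name__ == "__main__":
    v16, v19 = run_model()
    print("%16 =", v16)
    print("%19 =", v19)
```

Under these assumptions the model gives `%16 = [3, 3]` and `%19 = [0, 0]`, which matches the output without `linalg-specialize-generic-ops`. The `[3, 3]` printed with the pass enabled corresponds to `%16` rather than `%19`.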


</pre>