<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/139221>139221</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[MLIR] Inconsistent output when executing MLIR program with and without `-convert-affine-for-to-gpu`
</td>
</tr>
<tr>
<th>Labels</th>
<td>
mlir
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Lambor24
</td>
</tr>
</table>
<pre>
My git version is [145aa66](https://github.com/llvm/llvm-project/commit/145aa66f689c24c0cf2fffd995ba83678cfaa310).
## Description:
I get inconsistent results when executing the same MLIR program with and without the `-convert-affine-for-to-gpu` pass (the exact pass options are in the command in step 4).
## Steps to Reproduce:
### 1. **MLIR Program (test.mlir)**:
```
module {
  memref.global "private" constant @__constant_4xi16 : memref<4xi16> = dense<-1> {alignment = 64 : i64}
  func.func private @printMemrefI16(memref<*xi16>) attributes {llvm.emit_c_interface}
  func.func @main() {
    %c-1_i16 = arith.constant -1 : i16
    %c0_i16 = arith.constant 0 : i16
    %alloc = memref.alloc() {alignment = 64 : i64} : memref<i16>
    affine.store %c0_i16, %alloc[] : memref<i16>
    affine.for %arg0 = 0 to 4 {
      %0 = affine.load %alloc[] : memref<i16>
      %1 = arith.addi %0, %c-1_i16 : i16
      affine.store %1, %alloc[] : memref<i16>
    }
    %expand_shape = memref.expand_shape %alloc [] output_shape [1] : memref<i16> into memref<1xi16>
    %cast = memref.cast %expand_shape : memref<1xi16> to memref<*xi16>
    call @printMemrefI16(%cast) : (memref<*xi16>) -> ()
    return
  }
}
```
### 2. **Command to Run Without `-convert-affine-for-to-gpu`:**
```
/path/llvm-project/build/bin/mlir-opt test.mlir -lower-affine -gpu-lower-to-nvvm-pipeline | \
/path/llvm-project/build/bin/mlir-runner -e main -entry-point-result=void \
-shared-libs=/path/llvm-project/build/lib/libmlir_runner_utils.so \
-shared-libs=/path/llvm-project/build/lib/libmlir_c_runner_utils.so \
-shared-libs=/path/llvm-project/build/lib/libmlir_cuda_runtime.so
```
### 3. **Output Without `-convert-affine-for-to-gpu`:**
```
[-4]
```
### 4. **Command to Run With `-convert-affine-for-to-gpu`:**
```
/path/llvm-project/build/bin/mlir-opt test.mlir -pass-pipeline="builtin.module(func.func(convert-affine-for-to-gpu{gpu-block-dims=1 gpu-thread-dims=0}))" | \
/path/llvm-project/build/bin/mlir-opt -lower-affine -gpu-lower-to-nvvm-pipeline | \
/path/llvm-project/build/bin/mlir-runner -e main -entry-point-result=void \
-shared-libs=/path/llvm-project/build/lib/libmlir_runner_utils.so \
-shared-libs=/path/llvm-project/build/lib/libmlir_c_runner_utils.so \
-shared-libs=/path/llvm-project/build/lib/libmlir_cuda_runtime.so
```
### 5. **Output With `-convert-affine-for-to-gpu`:**
```
[-1]
```
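For reference, here is a small Python model of the loop in `test.mlir`. It is not part of the reproducer, and the function names (`wrap_i16`, `run_sequential`, `run_all_read_initial`) are made up for illustration. The first function follows the sequential semantics of `affine.for`, where each iteration sees the value stored by the previous one. The second models one possible explanation that has not been confirmed against the generated IR: if the four iterations were executed independently and each one loaded the initial value `0`, every iteration would store `-1`.

```python
def wrap_i16(x):
    """Wrap a Python int to signed 16-bit, mirroring arith.addi on i16."""
    return ((x + 0x8000) % 0x10000) - 0x8000


def run_sequential(trip_count=4, step=-1):
    """Model of the affine.for loop: load, add, store, one iteration at a time."""
    acc = 0  # affine.store %c0_i16, %alloc[]
    for _ in range(trip_count):
        acc = wrap_i16(acc + step)  # load, arith.addi, store
    return acc


def run_all_read_initial(trip_count=4, step=-1):
    """Hypothetical model: every iteration loads the initial value before any store lands."""
    initial = 0
    stored = [wrap_i16(initial + step) for _ in range(trip_count)]
    return stored[-1]  # whichever store lands last, the value is the same


print([run_sequential()])        # [-4], matches the output in step 3
print([run_all_read_initial()])  # [-1], matches the output in step 5
```

The sequential model reproduces the `[-4]` result, which is what the program's loop-carried dependence on `%alloc` requires. The second model only shows that `[-1]` is consistent with the iterations not observing each other's stores; it does not establish that this is what the pass does.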
I'm not sure whether this result is caused by a bug in my program or by incorrect usage of the passes above.
</pre>