<table border="1" cellspacing="0" cellpadding="8">

    <tr>

        <th>Issue</th>

        <td>

            <a href="https://github.com/llvm/llvm-project/issues/139221">139221</a>

        </td>

    </tr>

    <tr>

        <th>Summary</th>

        <td>

            [MLIR] Inconsistent output when executing MLIR program with and without `-convert-affine-for-to-gpu`

        </td>

    </tr>

    <tr>

      <th>Labels</th>

      <td>

            mlir

      </td>

    </tr>

    <tr>

      <th>Assignees</th>

      <td>

      </td>

    </tr>

    <tr>

      <th>Reporter</th>

      <td>

          Lambor24

      </td>

    </tr>

</table>

<pre>

    I am using LLVM at commit [145aa66](https://github.com/llvm/llvm-project/commit/145aa66f689c24c0cf2fffd995ba83678cfaa310).

## Description:

I get inconsistent results when executing the same MLIR program with and without the `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}` pass.

## Steps to Reproduce:

### 1. **MLIR Program (test.mlir)**:

test.mlir:

```

module {

  memref.global "private" constant @__constant_4xi16 : memref<4xi16> = dense<-1> {alignment = 64 : i64}

  func.func private @printMemrefI16(memref<*xi16>) attributes {llvm.emit_c_interface}

  func.func @main() {

    %c-1_i16 = arith.constant -1 : i16

    %c0_i16 = arith.constant 0 : i16

    %alloc = memref.alloc() {alignment = 64 : i64} : memref<i16>

    affine.store %c0_i16, %alloc[] : memref<i16>

    affine.for %arg0 = 0 to 4 {

      %0 = affine.load %alloc[] : memref<i16>

      %1 = arith.addi %0, %c-1_i16 : i16

      affine.store %1, %alloc[] : memref<i16>

    }

    %expand_shape = memref.expand_shape %alloc [] output_shape [1] : memref<i16> into memref<1xi16>

    %cast = memref.cast %expand_shape : memref<1xi16> to memref<*xi16>

    call @printMemrefI16(%cast) : (memref<*xi16>) -> ()

    return

  }

}

```

### 2. **Command to Run Without `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}`:**

```

/path/llvm-project/build/bin/mlir-opt test.mlir -lower-affine -gpu-lower-to-nvvm-pipeline | \

/path/llvm-project/build/bin/mlir-runner -e main -entry-point-result=void \

-shared-libs=/path/llvm-project/build/lib/libmlir_runner_utils.so \

-shared-libs=/path/llvm-project/build/lib/libmlir_c_runner_utils.so \

-shared-libs=/path/llvm-project/build/lib/libmlir_cuda_runtime.so

```

### 3. **Output Without `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}`:**

```

[-4]

```

### 4. **Command to Run With `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}`:**

```

/path/llvm-project/build/bin/mlir-opt test.mlir -pass-pipeline="builtin.module(func.func(convert-affine-for-to-gpu{gpu-block-dims=1 gpu-thread-dims=0}))" | \

/path/llvm-project/build/bin/mlir-opt -lower-affine -gpu-lower-to-nvvm-pipeline | \

/path/llvm-project/build/bin/mlir-runner -e main -entry-point-result=void \

-shared-libs=/path/llvm-project/build/lib/libmlir_runner_utils.so \

-shared-libs=/path/llvm-project/build/lib/libmlir_c_runner_utils.so \

-shared-libs=/path/llvm-project/build/lib/libmlir_cuda_runtime.so

```

### 5. **Output With `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}`:**

```

[-1]

```

I'm not sure whether there is a bug in my program or whether incorrect usage of the above passes caused this result.
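
To make the discrepancy concrete, here is a small Python model of the two results. It is only a sketch, not a confirmed diagnosis: `run_sequential` mirrors the loop in `test.mlir` as written, while `run_independent` rests on the assumption that, after the conversion, each loop iteration runs independently and loads the initial value of `%alloc` before any other iteration's store is visible. Both function names are hypothetical and exist only for this illustration.

```python
def run_sequential(trip_count=4, step=-1):
    """Model of the loop as written: each iteration sees the previous store."""
    acc = 0                      # affine.store %c0_i16, %alloc[]
    for _ in range(trip_count):  # affine.for %arg0 = 0 to 4
        acc = acc + step         # affine.load, arith.addi, affine.store
    return acc


def run_independent(trip_count=4, step=-1):
    """Model under the ASSUMPTION that every iteration loads the initial
    value before any store from another iteration becomes visible."""
    initial = 0
    stores = [initial + step for _ in range(trip_count)]
    # All iterations store the same value, so the order of stores is irrelevant.
    return stores[-1]


print([run_sequential()])   # matches the output without the pass
print([run_independent()])  # matches the output with the pass
```

If this model matches what actually happens, the loop-carried dependency through `%alloc` would be the reason the two pipelines disagree, but I have not verified this against the generated IR.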

</pre>
