<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/118633>118633</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [mlir] Inconsistent output when executing MLIR program with and without --linalg-fuse-elementwise-ops optimizations
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            mlir
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Emilyaxe
      </td>
    </tr>
</table>

<pre>
    git version: 0a44b24

system: `Ubuntu 18.04.6 LTS`

## Description:
The same MLIR program prints different results depending on whether `--linalg-fuse-elementwise-ops` is part of the pipeline: `[[[0]]]` without the pass and `[[[1]]]` with it.

## Steps to Reproduce:


### 1. **MLIR Program (a.mlir)**:
``` 
module {
  func.func private @printMemrefI32(tensor<*xi32>)
  func.func private @printMemrefF32(tensor<*xf32>)
  func.func @main() {
    %0 = "tosa.const"() <{value = dense<18> : tensor<1x1x1xi32>}> : () -> tensor<1x1x1xi32>
    %5 = "tosa.const"() <{value = dense<40> : tensor<1x1x1xi32>}> : () -> tensor<1x1x1xi32>
    %6 = "tosa.const"() <{value = dense<19> : tensor<1x1x1xi32>}> : () -> tensor<1x1x1xi32>
    %8 = "tosa.const"() <{value = dense<16> : tensor<1x1x1xi32>}> : () -> tensor<1x1x1xi32>
    %9 = "tosa.const"() <{value = dense<63> : tensor<1x1x1xi32>}> : () -> tensor<1x1x1xi32>
    %20 = tosa.arithmetic_right_shift %5, %5 {round = true} : (tensor<1x1x1xi32>, tensor<1x1x1xi32>) -> tensor<1x1x1xi32>
    %22 = tosa.bitwise_or %20, %0 : (tensor<1x1x1xi32>, tensor<1x1x1xi32>) -> tensor<1x1x1xi32>
    %28 = tosa.minimum %6, %22 : (tensor<1x1x1xi32>, tensor<1x1x1xi32>) -> tensor<1x1x1xi32>
    %37 = tosa.cast %20 : (tensor<1x1x1xi32>) -> tensor<1x1x1xf32>
    %46 = tosa.bitwise_xor %9, %28 : (tensor<1x1x1xi32>, tensor<1x1x1xi32>) -> tensor<1x1x1xi32>
    %74 = tosa.clamp %46 {max_fp = 6.553600e+04 : f32, max_int = 65536 : i64, min_fp = -6.553600e+04 : f32, min_int = -65536 : i64} : (tensor<1x1x1xi32>) -> tensor<1x1x1xi32>
    %75 = tosa.clamp %8 {max_fp = 6.553600e+04 : f32, max_int = 65536 : i64, min_fp = -6.553600e+04 : f32, min_int = -65536 : i64} : (tensor<1x1x1xi32>) -> tensor<1x1x1xi32>
    %76 = tosa.mul %74, %75 {shift = 0 : i8} : (tensor<1x1x1xi32>, tensor<1x1x1xi32>) -> tensor<1x1x1xi32>
    %89 = tosa.cast %76 : (tensor<1x1x1xi32>) -> tensor<1x1x1xf32>
    %90 = tosa.clamp %89 {max_fp = 6.553600e+04 : f32, max_int = 65536 : i64, min_fp = -6.553600e+04 : f32, min_int = -65536 : i64} : (tensor<1x1x1xf32>) -> tensor<1x1x1xf32>
    %91 = tosa.clamp %37 {max_fp = 6.553600e+04 : f32, max_int = 65536 : i64, min_fp = -6.553600e+04 : f32, min_int = -65536 : i64} : (tensor<1x1x1xf32>) -> tensor<1x1x1xf32>
    %92 = tosa.mul %90, %91 {shift = 0 : i8} : (tensor<1x1x1xf32>, tensor<1x1x1xf32>) -> tensor<1x1x1xf32>
    %96 = tosa.tanh %92 : (tensor<1x1x1xf32>) -> tensor<1x1x1xf32>
    %cast = tensor.cast %96 : tensor<1x1x1xf32> to tensor<*xf32>
    call @printMemrefF32(%cast) : (tensor<*xf32>) -> ()
    return
  }
}
``` 

### 2. **Command to Run Without Optimizations**:

``` 
/data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt a.mlir \
  -pass-pipeline="builtin.module(func.func(tosa-to-linalg))" | \
/data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt \
  -tosa-to-arith -convert-arith-to-llvm -test-math-polynomial-approximation \
  -one-shot-bufferize="bufferize-function-boundaries" \
  -convert-math-to-llvm -finalize-memref-to-llvm -convert-arith-to-llvm \
  -convert-linalg-to-loops -convert-scf-to-cf -convert-arith-to-llvm \
  -convert-func-to-llvm -finalize-memref-to-llvm -convert-func-to-llvm \
  -reconcile-unrealized-casts | \
/data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-cpu-runner -e main -entry-point-result=void \
  --shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_c_runner_utils.so \
  --shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_runner_utils.so

``` 

### 3. **Output without optimization**:

``` 
Unranked Memref base@ = 0x563181939d80 rank = 3 offset = 0 sizes = [1, 1, 1] strides = [1, 1, 1] data = 
[[[0]]]

``` 

### 4. **Command to Run With the `--linalg-fuse-elementwise-ops` Optimization**:


``` 
/data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt a.mlir \
  -pass-pipeline="builtin.module(func.func(tosa-to-linalg))" | \
/data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt \
  --linalg-fuse-elementwise-ops \
  -tosa-to-arith -convert-arith-to-llvm -test-math-polynomial-approximation \
  -one-shot-bufferize="bufferize-function-boundaries" \
  -convert-math-to-llvm -finalize-memref-to-llvm -convert-arith-to-llvm \
  -convert-linalg-to-loops -convert-scf-to-cf -convert-arith-to-llvm \
  -convert-func-to-llvm -finalize-memref-to-llvm -convert-func-to-llvm \
  -reconcile-unrealized-casts | \
/data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-cpu-runner -e main -entry-point-result=void \
  --shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_c_runner_utils.so \
  --shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_runner_utils.so
``` 
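
The two invocations are intended to differ only in the extra `--linalg-fuse-elementwise-ops` pass at the start of the second `mlir-opt` stage. A minimal Python sketch of that comparison, assuming the pass lists are copied verbatim from the commands in this report (repeated passes kept as reported); `BASE_PASSES`, `FUSE_PASS`, and `pipeline` are hypothetical names used only for illustration:

```python
# Flags of the second mlir-opt stage shared by both runs (copied from the report).
BASE_PASSES = [
    "-tosa-to-arith",
    "-convert-arith-to-llvm",
    "-test-math-polynomial-approximation",
    '-one-shot-bufferize="bufferize-function-boundaries"',
    "-convert-math-to-llvm",
    "-finalize-memref-to-llvm",
    "-convert-arith-to-llvm",
    "-convert-linalg-to-loops",
    "-convert-scf-to-cf",
    "-convert-arith-to-llvm",
    "-convert-func-to-llvm",
    "-finalize-memref-to-llvm",
    "-convert-func-to-llvm",
    "-reconcile-unrealized-casts",
]

FUSE_PASS = "--linalg-fuse-elementwise-ops"


def pipeline(fuse):
    """Return the flag list, with the fusion pass first when `fuse` is true."""
    return ([FUSE_PASS] if fuse else []) + BASE_PASSES


without_fusion = pipeline(fuse=False)
with_fusion = pipeline(fuse=True)

# Flags present only in the fused run.
extra = [p for p in with_fusion if p not in without_fusion]
print(extra)
```

Keeping the shared flags in one list makes it easy to confirm that any difference in output comes from the fusion pass alone.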

### 5. **Output with optimization**:

``` 
Unranked Memref base@ = 0x55e9d4859200 rank = 3 offset = 0 sizes = [1, 1, 1] strides = [1, 1, 1] data = 
[[[1]]]

``` 
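
To help judge which output is plausible, the arithmetic in `a.mlir` can be traced by hand. The sketch below is a plain-Python model, not the actual TOSA lowering. It assumes that the rounded right shift of 40 by 40 evaluates to 0, as it does with unbounded integers. That assumption may not hold in the compiled program: the shift amount is not smaller than the 32-bit width, and if the lowering goes through `arith.shrsi`, that op is documented to return poison for such shifts, which could be relevant to the discrepancy. `rounded_shift` and `clamp` are hypothetical helper names.

```python
import math


def rounded_shift(value, shift):
    """Arithmetic right shift with rounding, modelled on unbounded Python ints."""
    result = value >> shift
    # round = true: add 1 when the last bit shifted out is set.
    if shift > 0 and (value >> (shift - 1)) & 1:
        result += 1
    return result


def clamp(x, lo, hi):
    """Limit x to the closed interval [lo, hi]."""
    return max(lo, min(x, hi))


v20 = rounded_shift(40, 40)                  # %20 (ASSUMPTION: evaluates to 0)
v22 = v20 | 18                               # %22 = bitwise_or %20, %0
v28 = min(19, v22)                           # %28 = minimum %6, %22
v46 = 63 ^ v28                               # %46 = bitwise_xor %9, %28
v74 = clamp(v46, -65536, 65536)              # %74
v75 = clamp(16, -65536, 65536)               # %75
v76 = v74 * v75                              # %76 = mul with shift = 0
v90 = clamp(float(v76), -65536.0, 65536.0)   # %89 cast, then %90 clamp
v91 = clamp(float(v20), -65536.0, 65536.0)   # %37 cast, then %91 clamp
v92 = v90 * v91                              # %92
v96 = math.tanh(v92)                         # %96
print(v20, v76, v96)
```

Under this assumption the model gives `%76 = 720` and a final value of 0, which agrees with the run without fusion. Because `%92 = 720 * %20`, any positive value of `%20` would drive `tanh` to 1, so the two outputs may differ only in what the out-of-range shift produces.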

</pre>