<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/94520>94520</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[mlir] Incorrect region simplification in `applyPatternsAndFoldGreedily`
</td>
</tr>
<tr>
<th>Labels</th>
<td>
mlir
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
bmhowe23
</td>
</tr>
</table>
<pre>
## Summary
Region simplification appears to be performing incorrect simplification under some nested loop cases, causing erroneous code generation. I have seen this incorrect behavior in 16.0.6 and 19.0.0 (on Ubuntu 22.04, x86_64). I have not checked any other versions.
## Details
The following MLIR causes incorrect region simplification in any pass that calls `applyPatternsAndFoldGreedily()` with `config.enableRegionSimplification` set to true (which is the default parameter value).
**nested_loop_bug.mlir**:
```mlir
module attributes {llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", llvm.triple = "x86_64-unknown-linux-gnu"} {
func.func @nested_loop() -> i32 attributes {no_this} {
%c3_i64 = arith.constant 3 : i64
%c2_i64 = arith.constant 2 : i64
%c0_i64 = arith.constant 0 : i64
%c1_i64 = arith.constant 1 : i64
%c1_i32 = arith.constant 1 : i32
%c0_i32 = arith.constant 0 : i32
cf.br ^bb1(%c0_i32, %c0_i64 : i32, i64)
^bb1(%0: i32, %1: i64): // 2 preds: ^bb0, ^bb8
%2 = arith.cmpi ult, %1, %c2_i64 : i64
cf.cond_br %2, ^bb2(%0, %1 : i32, i64), ^bb9(%0, %1 : i32, i64)
^bb2(%3: i32, %4: i64): // pred: ^bb1
%5 = arith.addi %4, %c1_i64 : i64
cf.br ^bb3(%3, %5 : i32, i64)
^bb3(%6: i32, %7: i64): // 2 preds: ^bb2, ^bb5
%8 = arith.cmpi ult, %7, %c3_i64 : i64
cf.cond_br %8, ^bb4(%6, %7 : i32, i64), ^bb6(%6, %7 : i32, i64)
^bb4(%9: i32, %10: i64): // pred: ^bb3
%11 = arith.addi %9, %c1_i32 : i32
cf.br ^bb5(%11, %10 : i32, i64)
^bb5(%12: i32, %13: i64): // pred: ^bb4
%14 = arith.addi %13, %c1_i64 : i64
cf.br ^bb3(%12, %14 : i32, i64)
^bb6(%15: i32, %16: i64): // pred: ^bb3
cf.br ^bb7
^bb7: // pred: ^bb6
cf.br ^bb8(%15, %4 : i32, i64)
^bb8(%17: i32, %18: i64): // pred: ^bb7
%19 = arith.addi %18, %c1_i64 : i64
cf.br ^bb1(%17, %19 : i32, i64)
^bb9(%20: i32, %21: i64): // pred: ^bb1
cf.br ^bb10
^bb10: // pred: ^bb9
return %20 : i32
}
}
```
The above MLIR roughly corresponds to this C++ code (after running through some of our custom passes which are not relevant for the purposes of this issue:
```cpp
int nested_loop() {
int nEle = 0;
for (std::size_t r = 0; r < 2; r++) // r = 0,1
for (std::size_t s = r + 1; s < 3; s++) // s = 1,2 then 2
nEle += 1; // should be incremented 3 times
return nEle;
}
```
The MLIR code can be executed without any passes using small test driver program, and it produces the correct result (3).
**nested_loop_driver.c**:
```cpp
#include <stdio.h>
int nested_loop();
int main(int argc, char *argv[]) {
printf("nested_loop: %d\n", nested_loop());
return 0;
}
```
### Results when using no additional passes
```bash
$ cat nested_loop_bug.mlir | mlir-opt-19 --convert-func-to-llvm --convert-cf-to-llvm | mlir-translate-19 --mlir-to-llvmir | llc-19 --relocation-model=pic --filetype=obj -o nested_loop_good.o
$ clang-19 nested_loop_driver.c nested_loop_good.o -o nested_loop_good.exe
$ ./nested_loop_good.exe
nested_loop returns: 3
```
In order to run `applyPatternsAndFoldGreedily` from the command line, we can simply use `--test-math-algebraic-simplification`. The problem is not unique to that pass, nor even really related to that pass because this MLIR will not match any patterns used by that pass. I'm merely using it to show how `applyPatternsAndFoldGreedily` is behaving.
### Results when using `applyPatternsAndFoldGreedily`
```bash
$ cat nested_loop_bug.mlir | mlir-opt-19 --test-math-algebraic-simplification --convert-func-to-llvm --convert-cf-to-llvm | mlir-translate-19 --mlir-to-llvmir | llc-19 --relocation-model=pic --filetype=obj -o nested_loop_bad.o
$ clang-19 nested_loop_driver.c nested_loop_bad.o -o nested_loop_bad.exe
$ ./nested_loop_bad.exe
nested_loop returns: 2
```
**The fact that is now returning 2 is incorrect operation.**
The following MLIR snippet shows the output of applyPatternsAndFoldGreedily (via --test-math-algebraic-simplification).
```mlir
$ cat nested_loop_bug.mlir | mlir-opt-19 --test-math-algebraic-simplification
module attributes {llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", llvm.triple = "x86_64-unknown-linux-gnu"} {
func.func @nested_loop() -> i32 attributes {no_this} {
%c3_i64 = arith.constant 3 : i64
%c2_i64 = arith.constant 2 : i64
%c0_i64 = arith.constant 0 : i64
%c1_i64 = arith.constant 1 : i64
%c1_i32 = arith.constant 1 : i32
%c0_i32 = arith.constant 0 : i32
cf.br ^bb1(%c0_i32, %c0_i64 : i32, i64)
^bb1(%0: i32, %1: i64): // 2 preds: ^bb0, ^bb7
%2 = arith.cmpi ult, %1, %c2_i64 : i64
cf.cond_br %2, ^bb2(%0, %1 : i32, i64), ^bb8(%0 : i32)
^bb2(%3: i32, %4: i64): // 2 preds: ^bb1, ^bb4
%5 = arith.addi %4, %c1_i64 : i64
cf.br ^bb3(%3, %5 : i32, i64)
^bb3(%6: i32, %7: i64): // pred: ^bb2
%8 = arith.cmpi ult, %7, %c3_i64 : i64
cf.cond_br %8, ^bb4(%6, %7 : i32, i64), ^bb5(%6 : i32)
^bb4(%9: i32, %10: i64): // pred: ^bb3
%11 = arith.addi %9, %c1_i32 : i32
cf.br ^bb2(%11, %10 : i32, i64)
^bb5(%12: i32): // pred: ^bb3
cf.br ^bb6
^bb6: // pred: ^bb5
cf.br ^bb7(%12, %4 : i32, i64)
^bb7(%13: i32, %14: i64): // pred: ^bb6
%15 = arith.addi %14, %c1_i64 : i64
cf.br ^bb1(%13, %15 : i32, i64)
^bb8(%16: i32): // pred: ^bb1
cf.br ^bb9
^bb9: // pred: ^bb8
return %16 : i32
}
}
```
**Diff** (original on the left, applyPatternsAndFoldGreedily on the right)
![image](https://github.com/llvm/llvm-project/assets/141149032/70cbec3b-4c65-4d86-b0a1-062ffe0f8940)
It appears that `applyPatternsAndFoldGreedily` incorrectly merged `^bb5` and `^bb6` and correspondingly eliminated the `%14 = arith.addi %13, %c1_i64 : i64`. As demonstrated with the test program, this results in incorrect behavior.
</pre>
<img width="1px" height="1px" alt="" src="http://email.email.llvm.org/o/eJzsGl1v47jx1zAvhAyJsmT5IQ_52BQLtEBx2z4HlDSyeEeRKkkl6_76YkjJlrx27PTuYVvcweelqZnhfHNmFG6t2CmAe5I9kuz5jg-u1ea-7Fr9Diy9K3W9vycsJSyl34au42ZP4mcSP4TvX2AntKJWdL0Ujai4w5-874EbS52mJdAeTKNNJ9SOClVpY6BypxiDqsFQqzugCqyDmkqte1pxC5awJ1rxwSIBMEYr0IOlla6B7kCB8RRW9Ctt-RtQC6Coa4WdHVZCy9-ENlQomuSreJVTrmqabFfxKqaEFVrRf5aDcgNlbBWv8cTvRf6arwnbHigr7WjVQvUb1JSrPdWuBUPfwFihlV3N1TIq7BkcF9LOn_yjBdpoKfU7ivO3v379xcsGc3bNWaUK5U_tubXUtdzRiktpKclj3vdy_3fuHBhlH1T9omX9FwNQC7knrCBsS_KYvgvXInSlVSN2K1C8lBDs921xEgJbcGg9ZwZA_by3omqpwIOB1tDwQTrac8M7cKgDLgdAVS11gJ9gzVe05ms57FadFCY8IekEmMfh45_5rU7XgwTKnTOiHBxYSjaPUr51q5o7_ir5Xg-OkvSZEsYg6kj6AFHPNjFJH1Lmv_BnsvyJq3ztvyIxLZoCkRJWRKrART7h5OvoW4LaY-gO_nBnRC9hOjd4SDSo35R-V5EUavge7dSAGJtn5DgIQ2kzqGqFX5Ss45lGgnFoRNIvVKTsRF6lX9GPT2hRSlhWpa8iX3tGuBGuXVVaWceVoykl6QNF6QI8ArNLwOwEeCQeX4KPzxBPLgEn54knryjpZfiU_cDMefj4R_iqWZWGkuxLWSZetyM6GnAuWEBjT547tp0IzBHjGRRhWTLJwra4ooS9EPZCGe0N1Ba3PHLswXFVHFW04L7rBR2kO5AdOWNHzuYKqxqUt35FqVjGDtTZxORI5oxME-j2Ouhc_JFyuhR_fV58FP4ge7KwWzYTmte1CFRGaZOL0k7mSyc2AkZ2hesRPF9yvbnNaEe1ZgsRist220ySpLfYrTgcsJ74HMl8YLf8OuhcAyPl7YnbxjcYLl1InSRnLLedWS5lHwZeFhhJkgMLVzifENgJ6-kNrC9zS7I-w3qSftrrkgMPl5PF3EZJdsJ6_jmtzzjYLFSzuYidn8UuDvyMYXtF9RP85oT_4gb-N0vVb8-pvrhd9cmBlZGH7RXex7zGThI1u5Cpz6aq-fnx8haILyJvj_wbcINR_tjT24hsnsfq5rCYypzTcpCX-g1CKWj0sGvlnvpK0PZa1b6I9uXsE2GPhD2GwpewgjdYeplBKawkXetxQwmtG6oHQ6vBOt35ihEsDTUcN6GQNSDhDa_RRhtf1fWD6TXC6WYsn60d4Mcirer7sCOUoz_WMrNKxQN8GeulmKTHckhjWiysQ6WS9MGKf8Oro-YA6JdPlPllEBtpH_4bzTIhsKfZ3XOBuPWw-OiRJkjX-iNSv1wcMRIPCJjFGOpH0Vm6o6Nc7DHApI8HrFYPssauR6jKQAcKe5mUOtGBnQiMboM0Dlq57iahV0DjV1zhCfAdqgHJY2WP9fDUHoCloVuyHZeSOrCO1ka8gaG90TvDOwwW7ICEw516qCCU9scOxGKFT1iRXi_rA-lVdamsP3gMYalQlRzQf9Mn62qhVy1Jv3zgTwf94NOOC0VYgUtudpXvC1uONn3gZvcWOtilE_ZGKNd4WmxOPPWVWU2yJzUW-D8ePTv9YLP4RoOFBhB7wF-8KjH8QI1mUZpiisRei8vRYie0Sm7bidKaVnyhmkMjRcnmieIi0r2Lki2NokqrNzAuwm4jcjrCvmW2XTWHzQOuM1xZyR0ECmEvAI1HSFmFZwakDk1i1OkaJEmfe1HRKGqEBLfvgaTPuvyVRnrB707reqVn8kiudkjxnBudwTxLD77DkeKKsJfLELMnoyF99Zd-YMCvimpTg8H0awZ1tc3OY9oY3Y1R1HUYXVIoQNd6DyHr-_k9HSwgtSjCuIw67tqIyx2Uhosqsqed-Ipi7PdGlxI67MAxeQ9K_GuAcDNw5z3Iu7A2FN5AUQNcyj3meI75YQ5HS_DzhpDjfU55F1J6sh13eEP4LBKkRGZrWu6P-Cv6lbBNRzsw4IXxYx0_LLCtfqf4_w26EnYcyqjdmcHJhbi5TvePiqLrpvlZQ63k_2WkecRz1D6MswXAhTBjH-ZJ_PihGK9ccDPv5O8jBbQ7o4tpnu6nmd9443w8XrNK9D04757hltOD6weHhc5H7oS335vgN3mDHxNenGb94c7355DszyHZ_8OQbPOzDMnGRng2qvgdU7FTeZPZ-Od_ZES2aHjZTzQVG4dF-SVD_TRjMPb7x2CfnR3lC3r5RezswuRpOfu6lgIm-JOYSG4ZFedLVZ8LhOQTkZAcWBl5uBwLi7FXfoOqk7Mnno6iLmEX52ZFSX7qOTcOi0K58yyaJqz8m0sjdgLbSK18bSOh8eH4YWUzwhqxa91BN4QlJHsUHd-Bb6OL1rkes1gQbCdcO5SrSneEveBFP_4T9Ub_CpUj7AUbWWcJe0nWSbLexqjZl01clVClZbSu8ixa10UelTFPojhnTQNxU2zX8ZGH0Hu54xtkLAhv6Sem6lDusTXZQY1Yo7vnsR92TBv5tHGcsgm1k3sKUnRChZap9U3aZ4fK2K89WFpDhxew4dN0xtPzc5jZAMZ3YGZsc4Q687560Rrd1fdpvU23_A7uk02yyZO02CR37f0mqZOSbZuqTqo0Ljebqo6TnNdpXGbrOmN34p7FbB3ncZYU2SZLVtDAOi-yutwU1bpscrKOoeNCrnwBp83uzg__7rfrjMV3kpcgrf8DAcbC-1tGsuc7c-_tXw47S9axFNbZIwEnnPR_VOARsmf69frr7et2vhuMvP-0X3ph0C-DPG_37D8BAAD__89iN6w">