<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/115493>115493</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[InstCombine] folding add and compare can negatively affect codegen for loops.
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
SpencerAbson
</td>
</tr>
</table>
<pre>
The following transform,
`icmp sgt/slt (add nsw X, C2), C --> icmp sgt/slt X, (C - C2)`
which was introduced in https://github.com/llvm/llvm-project/commit/45b7e69fefae98ad3c747fd090e6aafcf2cba94f, can have a negative effect on codegen where the icmp serves as the exiting condition of a loop.
Please see https://godbolt.org/z/9hMWo6b58
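As a rough illustration of why the fold depends on `nsw`, the following Python sketch brute-forces the `sgt` case at a small bit width. It is only a model, not InstCombine's code: `wrap` and `fold_mismatches` are hypothetical helper names, and it assumes the fold is skipped whenever `C - C2` itself overflows.

```python
def wrap(v, bits):
    """Wrap v to a signed two's-complement integer of the given width."""
    v &= (1 << bits) - 1
    return v - (1 << bits) if v >> (bits - 1) else v


def fold_mismatches(bits, require_nsw):
    """Count inputs where icmp sgt (add X, C2), C differs from icmp sgt X, (C - C2)."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    count = 0
    for c2 in range(lo, hi + 1):
        for c in range(lo, hi + 1):
            if not (lo <= c - c2 <= hi):
                continue  # assumption: no fold when C - C2 overflows
            for x in range(lo, hi + 1):
                overflow = not (lo <= x + c2 <= hi)
                if require_nsw and overflow:
                    continue  # an overflowing nsw add is poison, so any result is allowed
                before = wrap(x + c2, bits) > c  # compare on the wrapped sum
                after = x > c - c2               # compare after the fold
                count += before != after
    return count
```

With `require_nsw=True` the two forms should agree on every remaining input; with `require_nsw=False` inputs such as `X = 7, C2 = 1, C = 0` at 4 bits disagree, because the add wraps to a negative value.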
For an AArch64 target, we would expect:
```
.test
cmp w2, #1
b.lt .LBB0_2
.LBB0_1:
ldrb w8, [x0], #1
subs w2, w2, #1
strb w8, [x1], #1
b.hi .LBB0_1
.LBB0_2:
ret
```
But instead, we have:
```
.test
cmp w2, #1
b.lt .LBB0_3
add w8, w2, #1
.LBB0_2:
ldrb w9, [x0], #1
sub w8, w8, #1
cmp w8, #1
strb w9, [x1], #1
b.hi .LBB0_2
.LBB0_3:
ret
```
In this case, two things have adversely affected codegen:
- The possibility for a target to combine the add and the compare with zero into a single CC-writing operation is sacrificed in favor of removing a use of the add, even if this add has other users.
- The post-inc exiting condition is transformed into a pre-inc exiting condition. LSR has the ability to rectify this, but its use of SCEVExpander can only do so by inserting an add into the pre-header.
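To make the difference between the two loop shapes concrete, here is a small Python model of the two assembly listings above. It is a sketch for illustration only: the function names are hypothetical, and the comments map each step to the instruction it is meant to stand for.

```python
def copy_expected(src, n):
    """Model of the first listing: the decrement itself feeds the branch."""
    dst, i = [], 0
    if n < 1:                   # cmp w2, #1 ; b.lt
        return dst
    while True:
        dst.append(src[i])      # ldrb / strb with post-increment
        i += 1
        n -= 1                  # subs w2, w2, #1 (sets the flags)
        if not (n > 0):         # branch tests the decremented value against zero
            break
    return dst


def copy_actual(src, n):
    """Model of the second listing: offset counter plus a separate compare."""
    dst, i = [], 0
    if n < 1:                   # cmp w2, #1 ; b.lt
        return dst
    k = n + 1                   # add w8, w2, #1 in the pre-header
    while True:
        dst.append(src[i])
        i += 1
        k -= 1                  # sub w8, w8, #1 (does not set flags)
        if not (k > 1):         # cmp w8, #1 ; b.hi
            break
    return dst
```

Both functions run the body `n` times, so the transform preserves behavior; the cost is the extra pre-header add and the separate compare inside the loop.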
A similar issue (https://github.com/llvm/llvm-project/issues/54558) was fixed by https://github.com/llvm/llvm-project/commit/5f8c2b884d4288617138114ebd2fdb235452c8ce.
I'm considering a few options to fix this, and I'd really appreciate some feedback on what would be preferred.
- Implementing a restriction similar to the one in https://github.com/llvm/llvm-project/commit/5f8c2b884d4288617138114ebd2fdb235452c8ce.
- Allowing IndVarSimplify to perform LFTR on loops with a stride of -1, which would convert the terminating condition into an equality and allow LSR to treat this as an `ICmpZero` (see line 3554 of LoopStrengthReduce.cpp).
Thanks
</pre>