<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/119606>119606</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[AArch64] Eliminate `cmp` by reassociating `add` and `sub`
</td>
</tr>
<tr>
<th>Labels</th>
<td>
backend:AArch64,
missed-optimization
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Kmeakin
</td>
</tr>
</table>
<pre>
This Rust function:
https://godbolt.org/z/5EKvWeEYb
```rust
#[no_mangle]
pub fn last(xs: &[u8]) -> Option<&u8> {
    xs.last()
}
```
produces this assembly:
```asm
last:
add x8, x1, x0
cmp x1, #0
sub x8, x8, #1
csel x0, xzr, x8, eq
ret
```
Equivalent C produces the same assembly:
https://godbolt.org/z/c5dMbv63K
```c
#include <stddef.h>
#include <stdint.h>
uint8_t* last(uint8_t* x0, size_t x1) {
    if (x1 == 0) {
        return NULL;
    } else {
        return x0 + x1 - 1;
    }
}
```
By reassociating `sub` and `add`, we can save an instruction by reusing the carry flag set by `subs` (the `lo` condition holds exactly when `x1 - 1` borrows, i.e. when `x1 == 0`):
```asm
tgt: // @tgt
subs x8, x1, #1
add x8, x0, x8
csel x0, xzr, x8, lo
ret
```
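As a rough sanity check of the reassociation, the two forms can be modeled on plain integers. This is only a sketch: the function names `last_addr_cmp` and `last_addr_subs` are hypothetical, addresses are modeled as `u64` (as in the hand-written IR) rather than real pointers, and wrapping arithmetic is assumed so that both forms are defined for every input.

```rust
/// Original form: compare the length against zero, then compute base + len - 1.
/// (Hypothetical helper; models the `add` / `cmp` / `sub` / `csel eq` sequence.)
pub fn last_addr_cmp(base: u64, len: u64) -> u64 {
    if len == 0 {
        0
    } else {
        base.wrapping_add(len).wrapping_sub(1)
    }
}

/// Reassociated form: compute len - 1 first and reuse its borrow as the condition.
/// (Hypothetical helper; models the `subs` / `add` / `csel lo` sequence.)
pub fn last_addr_subs(base: u64, len: u64) -> u64 {
    // `overflowing_sub` returns the wrapped difference and a flag that is
    // true exactly when the unsigned subtraction borrows, i.e. when len == 0.
    let (diff, borrow) = len.overflowing_sub(1);
    if borrow {
        0
    } else {
        base.wrapping_add(diff)
    }
}
```

Both functions return `0` for `len == 0` and `base + (len - 1)` modulo 2^64 otherwise, so the borrow from `len - 1` can stand in for the `len == 0` comparison.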
I could not convince clang to produce this assembly, even using `__builtin_sub_overflow`:
https://godbolt.org/z/6vPWjx1zv
```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
uint8_t* tgt(uint8_t* x0, size_t x1) {
    size_t sum;
    bool overflow = __builtin_sub_overflow(x1, 1, &sum);
    if (overflow) {
        return NULL;
    } else {
        return x0 + sum;
    }
}
```
I could only produce this assembly by writing the LLVM IR manually:
https://godbolt.org/z/hvjEEsG9r
```LLVM
define noundef i64 @tgt(i64 noundef %0, i64 noundef %1) {
%4 = tail call { i64, i1 } @llvm.usub.with.overflow.i64(i64 %1, i64 1)
%overflow = extractvalue { i64, i1 } %4, 1
%diff = extractvalue { i64, i1 } %4, 0
%sum = add i64 %0, %diff
%ret = select i1 %overflow, i64 0, i64 %sum
ret i64 %ret
}
```
</pre>