[llvm] [AArch64] Improve lowering of scalar abs(sub(a, b)). (PR #151180)
Ricardo Jesus via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 30 04:00:18 PDT 2025
rj-jesus wrote:
I did some digging and, with this patch, we should generate the combined extended forms when compiling for code size. For example, [given](https://godbolt.org/z/Mfa3986W9):
```c
#include <stdint.h>
typedef uint8_t u8;
typedef int8_t i8;
u8 src_u8(u8 a, u8 b) {
  return (a > b) ? a - b : b - a;
}
i8 src_i8(i8 a, i8 b) {
  return (a > b) ? a - b : b - a;
}
```
With `-O2`:
```
src_u8:
        and     w8, w1, #0xff
        and     w9, w0, #0xff
        subs    w8, w9, w8
        cneg    w0, w8, mi
        ret
src_i8:
        sxtb    w8, w1
        sxtb    w9, w0
        subs    w8, w9, w8
        cneg    w0, w8, mi
        ret
```
With `-Os`:
```
src_u8:
        and     w8, w0, #0xff
        subs    w8, w8, w1, uxtb
        cneg    w0, w8, mi
        ret
src_i8:
        sxtb    w8, w0
        subs    w8, w8, w1, sxtb
        cneg    w0, w8, mi
        ret
```
Perhaps the restrictions in `isWorthFoldingALU` could be loosened, possibly allowing multiple SUB/SUBS users if those could be CSE'd? As mentioned previously, despite the code size reduction, I'm not sure this would be a win for performance currently. Either way, this patch still seems applicable to allow the SUBS extended patterns above to be matched.
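For reference, here is a rough C model of what the `subs` + `cneg ..., mi` sequences compute, checked exhaustively against the source functions. This is only a sketch: `model_u8`, `model_i8` and `count_mismatches` are hypothetical helper names, not anything in the patch, and the model assumes that both operands are extended to 32 bits before the subtraction, so the difference cannot overflow and the `mi` condition is the same as "difference is negative".

```c
#include <stdint.h>

typedef uint8_t u8;
typedef int8_t i8;

/* Source forms from the example above. */
static u8 src_u8(u8 a, u8 b) { return (a > b) ? a - b : b - a; }
static i8 src_i8(i8 a, i8 b) { return (a > b) ? a - b : b - a; }

/* Hypothetical model of the unsigned sequence:
 * zero-extend both operands, subtract, negate if negative. */
static u8 model_u8(u8 a, u8 b) {
  int32_t d = (int32_t)a - (int32_t)b; /* subs w8, w8, w1, uxtb */
  if (d < 0)                           /* cneg w0, w8, mi */
    d = -d;
  return (u8)d;
}

/* Hypothetical model of the signed sequence:
 * sign-extend both operands, subtract, negate if negative. */
static i8 model_i8(i8 a, i8 b) {
  int32_t d = (int32_t)a - (int32_t)b; /* subs w8, w8, w1, sxtb */
  if (d < 0)                           /* cneg w0, w8, mi */
    d = -d;
  return (i8)d; /* same int-to-i8 conversion as the return in src_i8 */
}

/* Compare source and model over every pair of 8-bit inputs. */
static int count_mismatches(void) {
  int mismatches = 0;
  for (int a = 0; a <= 255; ++a)
    for (int b = 0; b <= 255; ++b)
      if (src_u8((u8)a, (u8)b) != model_u8((u8)a, (u8)b))
        ++mismatches;
  for (int a = -128; a <= 127; ++a)
    for (int b = -128; b <= 127; ++b)
      if (src_i8((i8)a, (i8)b) != model_i8((i8)a, (i8)b))
        ++mismatches;
  return mismatches;
}
```

Calling `count_mismatches()` from a small `main` should report no differences. This only models the arithmetic; it says nothing about which instruction selection is faster.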
https://github.com/llvm/llvm-project/pull/151180