[llvm] [AArch64] Improve lowering of scalar abs(sub(a, b)). (PR #151180)

Ricardo Jesus via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 30 04:00:18 PDT 2025


rj-jesus wrote:

I did some digging and, with this patch, we should generate the combined extended-register forms (`subs` with a `uxtb`/`sxtb` operand) when compiling for code size. For example, [given](https://godbolt.org/z/Mfa3986W9):
```c
#include <stdint.h>

typedef uint8_t u8;
typedef int8_t i8;

u8 src_u8(u8 a, u8 b) {
  return (a > b) ? a - b : b - a;
}

i8 src_i8(i8 a, i8 b) {
  return (a > b) ? a - b : b - a;
}
```
With `-O2`:
```
src_u8:
	and	w8, w1, #0xff
	and	w9, w0, #0xff
	subs	w8, w9, w8
	cneg	w0, w8, mi
	ret

src_i8:
	sxtb	w8, w1
	sxtb	w9, w0
	subs	w8, w9, w8
	cneg	w0, w8, mi
	ret
```
With `-Os`:
```
src_u8:
	and	w8, w0, #0xff
	subs	w8, w8, w1, uxtb
	cneg	w0, w8, mi
	ret

src_i8:
	sxtb	w8, w0
	subs	w8, w8, w1, sxtb
	cneg	w0, w8, mi
	ret
```
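As a sanity check on the listings above, here is a minimal sketch in C of what the `subs` + `cneg ..., mi` pair computes on already-extended operands, compared exhaustively against the two source functions. The helper names `subs_cneg_mi`, `count_mismatches` and `verify_model` are made up for illustration and are not part of the patch. The signed comparison looks only at the low 8 bits and assumes the usual wrapping conversion to `int8_t` that Clang and GCC implement.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of SUBS + CNEG MI on 32-bit registers:
 * subtract, then negate the result if the N (sign) flag is set. */
static uint32_t subs_cneg_mi(uint32_t x, uint32_t y) {
  uint32_t d = x - y;
  return (d & 0x80000000u) ? (0u - d) : d;
}

/* The two source functions from the example above. */
static uint8_t src_u8(uint8_t a, uint8_t b) {
  return (a > b) ? a - b : b - a;
}

static int8_t src_i8(int8_t a, int8_t b) {
  return (a > b) ? a - b : b - a;
}

/* Compare the model against the source for all 8-bit inputs. */
static int count_mismatches(void) {
  int bad = 0;
  for (int a = 0; a < 256; ++a) {
    for (int b = 0; b < 256; ++b) {
      /* Unsigned case: operands are zero-extended (and / uxtb). */
      uint8_t ua = (uint8_t)a, ub = (uint8_t)b;
      if (src_u8(ua, ub) != (uint8_t)subs_cneg_mi(ua, ub))
        ++bad;

      /* Signed case: operands are sign-extended (sxtb). */
      int8_t sa = (int8_t)(a - 128), sb = (int8_t)(b - 128);
      uint32_t xa = (uint32_t)(int32_t)sa, xb = (uint32_t)(int32_t)sb;
      if ((uint8_t)src_i8(sa, sb) != (uint8_t)subs_cneg_mi(xa, xb))
        ++bad;
    }
  }
  return bad;
}

/* Aborts via assert if the model and the source ever disagree. */
static void verify_model(void) {
  assert(count_mismatches() == 0);
}
```

Calling `verify_model()` from a test harness exercises all 65536 input pairs for each signedness. The model only covers the arithmetic; it says nothing about whether folding the extend into `subs` is profitable.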

Perhaps the restrictions in `isWorthFoldingALU` could be loosened, possibly allowing multiple SUB/SUBS users if those could be CSE'd? As mentioned previously, despite the code size reduction, I'm not sure this would currently be a win for performance. Either way, this patch still seems worthwhile, since it allows the extended SUBS patterns above to be matched.

https://github.com/llvm/llvm-project/pull/151180

