[all-commits] [llvm/llvm-project] 34f9dd: [AArch64][SVE] Fold ADD+CNTB to INCB/DECB (#118280)
Ricardo Jesus via All-commits
all-commits at lists.llvm.org
Thu Apr 17 00:41:40 PDT 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 34f9ddf1ce2d775d8df1ca9c9806710f3ba8361d
https://github.com/llvm/llvm-project/commit/34f9ddf1ce2d775d8df1ca9c9806710f3ba8361d
Author: Ricardo Jesus <rjj at nvidia.com>
Date: 2025-04-17 (Thu, 17 Apr 2025)
Changed paths:
M llvm/lib/Target/AArch64/AArch64Features.td
M llvm/lib/Target/AArch64/AArch64InstrInfo.td
M llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
M llvm/test/CodeGen/AArch64/sme-framelower-use-bp.ll
M llvm/test/CodeGen/AArch64/sve-lsrchain.ll
M llvm/test/CodeGen/AArch64/sve-vl-arith.ll
M llvm/test/CodeGen/AArch64/sve2p1-intrinsics-ld1-single.ll
M llvm/test/CodeGen/AArch64/sve2p1-intrinsics-st1-single.ll
M llvm/test/Transforms/LoopStrengthReduce/AArch64/vscale-fixups.ll
Log Message:
-----------
[AArch64][SVE] Fold ADD+CNTB to INCB/DECB (#118280)
Currently, given:
```cpp
uint64_t incb(uint64_t x) {
return x+svcntb();
}
```
LLVM generates:
```gas
incb:
addvl x0, x0, #1
ret
```
Which is equivalent to:
```gas
incb:
incb x0
ret
```
However, on microarchitectures like the Neoverse V2 and Neoverse V3,
the second form (with INCB) can have significantly better latency and
throughput (according to their SWOG). On the Neoverse V2, for example,
ADDVL has a latency and throughput of 2, whereas some forms of INCB
have a latency of 1 and a throughput of 4. The same applies to DECB.
This patch adds patterns to prefer the cheaper INCB/DECB forms over
ADDVL where applicable.
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list