[all-commits] [llvm/llvm-project] 34f9dd: [AArch64][SVE] Fold ADD+CNTB to INCB/DECB (#118280)

Ricardo Jesus via All-commits all-commits at lists.llvm.org
Thu Apr 17 00:41:40 PDT 2025


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 34f9ddf1ce2d775d8df1ca9c9806710f3ba8361d
      https://github.com/llvm/llvm-project/commit/34f9ddf1ce2d775d8df1ca9c9806710f3ba8361d
  Author: Ricardo Jesus <rjj at nvidia.com>
  Date:   2025-04-17 (Thu, 17 Apr 2025)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64Features.td
    M llvm/lib/Target/AArch64/AArch64InstrInfo.td
    M llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
    M llvm/test/CodeGen/AArch64/sme-framelower-use-bp.ll
    M llvm/test/CodeGen/AArch64/sve-lsrchain.ll
    M llvm/test/CodeGen/AArch64/sve-vl-arith.ll
    M llvm/test/CodeGen/AArch64/sve2p1-intrinsics-ld1-single.ll
    M llvm/test/CodeGen/AArch64/sve2p1-intrinsics-st1-single.ll
    M llvm/test/Transforms/LoopStrengthReduce/AArch64/vscale-fixups.ll

  Log Message:
  -----------
  [AArch64][SVE] Fold ADD+CNTB to INCB/DECB (#118280)

Currently, given:
```cpp
uint64_t incb(uint64_t x) {
  return x+svcntb();
}
```
LLVM generates:
```gas
incb:
        addvl   x0, x0, #1
        ret
```
Which is equivalent to:
```gas
incb:
        incb    x0
        ret
```

However, on microarchitectures like the Neoverse V2 and Neoverse V3,
the second form (with INCB) can have significantly better latency and
throughput (according to their SWOG). On the Neoverse V2, for example,
ADDVL has a latency and throughput of 2, whereas some forms of INCB
have a latency of 1 and a throughput of 4. The same applies to DECB.
This patch adds patterns to prefer the cheaper INCB/DECB forms over
ADDVL where applicable.



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list