[llvm] [AArch64][SVE] Fold ADD+CNTB to INCB and DECB (PR #118280)
Sjoerd Meijer via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 9 01:53:03 PST 2024
sjoerdmeijer wrote:
If we always emit incb, then it should give the same or better performance. As long as that's the case, then we're fine.
As you mentioned, one of the test cases appears to show a couple of regressions. Instead of:
addvl x9, x8, #1
we now get this, which at first sight doesn't look like an improvement:
mov x9, x8
incb x9
However, doesn't this sequence have a latency of 1, because the MOV is a zero-latency move (on the V2)? If so, I think this is actually an improvement too, isn't that right?
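To make the comparison concrete, here is a rough Python model of what the two sequences compute. It only checks that they produce the same value in x9; it says nothing about latency or throughput on any core. The helper names (`addvl`, `incb`, `old_sequence`, `new_sequence`) and the `vl_bytes` parameter are my own illustration, not LLVM code, and the model assumes the default `incb` pattern (all elements) with multiplier 1.

```python
MASK64 = (1 << 64) - 1  # X registers are 64 bits wide


def addvl(xn, imm, vl_bytes):
    """Model of `addvl xd, xn, #imm`: xn plus imm times the vector length in bytes."""
    return (xn + imm * vl_bytes) & MASK64


def incb(xdn, vl_bytes, mul=1):
    """Model of `incb xdn` (pattern ALL): add mul times the number of byte elements."""
    return (xdn + mul * vl_bytes) & MASK64


def old_sequence(x8, vl_bytes):
    # addvl x9, x8, #1
    return addvl(x8, 1, vl_bytes)


def new_sequence(x8, vl_bytes):
    x9 = x8                      # mov  x9, x8
    return incb(x9, vl_bytes)    # incb x9


# vl_bytes is a hypothetical runtime vector length (SVE allows 16..256 bytes).
for vl_bytes in (16, 32, 64, 128, 256):
    for x8 in (0, 0x1000, MASK64):
        assert old_sequence(x8, vl_bytes) == new_sequence(x8, vl_bytes)
```

So both forms leave x8 untouched and put x8 plus one vector length in x9; the only open question is the cost of the extra MOV.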
https://github.com/llvm/llvm-project/pull/118280