[llvm] [AArch64][SVE] Fold ADD+CNTB to INCB and DECB (PR #118280)

David Sherwood via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 9 02:00:50 PST 2024


david-arm wrote:

> If we always emit incb, then it should give the same or better performance. As long as that's the case, then we're fine.

Hi @sjoerdmeijer, but it's really not obvious to me that incb gives better performance for immediate values other than 1, 2 or 4. In fact, I wouldn't be surprised if it gave worse performance in some circumstances, which is why I wonder whether we should be more cautious here. incb requires the source and destination registers to be the same, which could constrain the register allocator and the scheduler. I don't think we can always consider instructions in isolation by looking only at their latency and throughput.
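
To illustrate the register constraint, here is a rough sketch for a multiplier of 3. The register choices (x0, x1, x8) are hypothetical, and it assumes the original base value in x1 is still needed afterwards. This is not output from the patch, only a hand-written comparison of the two forms:

```asm
// Non-destructive form: the result can go in any free register.
cntb    x8, all, mul #3     // x8 = 3 * (vector length in bytes)
add     x0, x1, x8          // x0 = x1 + x8; x1 is left untouched

// Destructive form: incb reads and writes the same register.
mov     x0, x1              // copy needed only because x1 is still live (assumption)
incb    x0, all, mul #3     // x0 += 3 * (vector length in bytes)
```

If x1 is dead after the addition, the mov disappears and incb is a single instruction. If it is still live, the tied operand forces a copy, and whether that is a win depends on the surrounding code rather than on the instruction alone.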

https://github.com/llvm/llvm-project/pull/118280

