[all-commits] [llvm/llvm-project] 8f0ba6: [X86] Add X64 test coverage to smul-with-overflow.ll
Simon Pilgrim via All-commits
all-commits at lists.llvm.org
Fri Jul 22 09:36:14 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 8f0ba6c40527f705adebf43d1e1eb14713b912dd
https://github.com/llvm/llvm-project/commit/8f0ba6c40527f705adebf43d1e1eb14713b912dd
Author: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: 2022-07-22 (Fri, 22 Jul 2022)
Changed paths:
M llvm/test/CodeGen/X86/smul-with-overflow.ll
Log Message:
-----------
[X86] Add X64 test coverage to smul-with-overflow.ll
Commit: 939cf9b1bea4b5daee1d1b63860b0e958703656f
https://github.com/llvm/llvm-project/commit/939cf9b1bea4b5daee1d1b63860b0e958703656f
Author: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: 2022-07-22 (Fri, 22 Jul 2022)
Changed paths:
M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
M llvm/lib/Target/AArch64/AArch64ISelLowering.h
M llvm/test/CodeGen/AArch64/parity.ll
Log Message:
-----------
[AArch64] Use neon instructions for i64/i128 ISD::PARITY calculation
As noticed on D129765 and reported in Issue #56531, AArch64 targets can use the NEON ctpop + add-reduce instructions to speed up scalar ctpop, but we fail to do this for parity calculations.
I'm not sure where the cutoff should be for specific CPUs, but i64 (plus the i128 special case) shows a definite reduction in instruction count. i32 is about the same (though scalar <-> NEON transfers are probably more costly?), and sub-i32 promotion looks to be a definite regression compared to the parity expansion optimized for those widths.
Differential Revision: https://reviews.llvm.org/D130246
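
For readers unfamiliar with the relationship the log message relies on: parity is the low bit of the population count, so a fast ctpop gives a fast parity. The following is a minimal, portable sketch of that identity next to a generic XOR-folding expansion. It is not the AArch64 lowering from this commit; the function names are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: parity computed as (popcount & 1).
// On a target with a fast ctpop (e.g. a vector popcount plus add-reduce),
// a compiler may lower the counting part to a few instructions; here a
// portable loop stands in for it.
inline unsigned parity_via_popcount(std::uint64_t x) {
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;  // clear the lowest set bit
        ++count;
    }
    return count & 1u;
}

// Hypothetical helper: parity by XOR-folding the value down to one bit.
// This is the generic style of expansion that needs no popcount at all.
inline unsigned parity_via_xor_fold(std::uint64_t x) {
    x ^= x >> 32;
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return static_cast<unsigned>(x & 1u);
}
```

Both functions return 1 when an odd number of bits are set and 0 otherwise; which form is cheaper depends on the target and the bit width, which is the trade-off the log message discusses.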
Compare: https://github.com/llvm/llvm-project/compare/7b81a81d5f9c...939cf9b1bea4