[llvm] [AArch64][GlobalISel] Improve non-SVE popcount for 32bit and 64 bit using udot (PR #96409)
Tim Gymnich via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 25 05:46:42 PDT 2024
================
@@ -1904,6 +1905,31 @@ bool AArch64LegalizerInfo::legalizeCTPOP(MachineInstr &MI,
   auto CTPOP = MIRBuilder.buildCTPOP(VTy, Val);
   // Sum across lanes.
+
+  if (ST->hasDotProd() && Ty.isVector() && Ty.getNumElements() >= 2 &&
+      Ty.getScalarSizeInBits() != 16) {
+    LLT Dt = Ty == LLT::fixed_vector(2, 64) ? LLT::fixed_vector(4, 32) : Ty;
+    auto Zeros = MIRBuilder.buildConstant(Dt, 0);
+    auto Ones = MIRBuilder.buildConstant(VTy, 1);
+    MachineInstrBuilder SUM;
----------------
tgymnich wrote:
fixed
https://github.com/llvm/llvm-project/pull/96409
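
For readers without the full patch at hand, here is a rough scalar model of the idea the hunk sets up: take a per-byte popcount, then use a dot product against a vector of ones to sum each group of four bytes into a 32-bit lane. The quoted hunk stops at the declaration of SUM, so everything after that point is an assumption here. In particular, the helper names (popcount8, bytePopcount, udotModel, popcount32x4, popcount64x2) are hypothetical, and widening to 64-bit lanes by adding adjacent 32-bit lanes is only one plausible way to finish the v2i64 case. This is a sketch in plain C++, not the GlobalISel code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<uint8_t, 16>;
using Lanes32 = std::array<uint32_t, 4>;
using Lanes64 = std::array<uint64_t, 2>;

// Count the set bits in one byte (portable, no builtins).
inline uint8_t popcount8(uint8_t v) {
  uint8_t n = 0;
  for (; v != 0; v >>= 1)
    n = static_cast<uint8_t>(n + (v & 1u));
  return n;
}

// Models a CTPOP on a 16 x i8 vector: one bit count per byte.
inline Bytes16 bytePopcount(const Bytes16 &in) {
  Bytes16 out{};
  for (std::size_t i = 0; i < in.size(); ++i)
    out[i] = popcount8(in[i]);
  return out;
}

// Scalar model of an unsigned dot product with accumulate: each 32-bit
// lane adds the four byte products from its own group of four bytes.
inline Lanes32 udotModel(Lanes32 acc, const Bytes16 &a, const Bytes16 &b) {
  for (std::size_t lane = 0; lane < acc.size(); ++lane)
    for (std::size_t j = 0; j < 4; ++j)
      acc[lane] += static_cast<uint32_t>(a[4 * lane + j]) * b[4 * lane + j];
  return acc;
}

// Popcount of four 32-bit lanes: zero accumulator, ones as one operand,
// per-byte counts as the other (mirrors Zeros / Ones / CTPOP in the hunk).
inline Lanes32 popcount32x4(const Bytes16 &in) {
  Bytes16 ones;
  ones.fill(1);
  return udotModel(Lanes32{0, 0, 0, 0}, ones, bytePopcount(in));
}

// Assumed completion for two 64-bit lanes: add adjacent 32-bit partial sums.
inline Lanes64 popcount64x2(const Bytes16 &in) {
  const Lanes32 s = popcount32x4(in);
  return Lanes64{static_cast<uint64_t>(s[0]) + s[1],
                 static_cast<uint64_t>(s[2]) + s[3]};
}
```

Multiplying by ones turns the dot product into a plain sum, so a single accumulate step replaces a chain of pairwise widening adds from 8-bit to 32-bit lanes.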