[llvm] [AArch64][GlobalISel] Improve non-SVE popcount for 32bit and 64 bit using udot (PR #96409)

Tim Gymnich via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 25 05:46:42 PDT 2024


================
@@ -1904,6 +1905,31 @@ bool AArch64LegalizerInfo::legalizeCTPOP(MachineInstr &MI,
   auto CTPOP = MIRBuilder.buildCTPOP(VTy, Val);
 
   // Sum across lanes.
+
+  if (ST->hasDotProd() && Ty.isVector() && Ty.getNumElements() >= 2 &&
+      Ty.getScalarSizeInBits() != 16) {
+    LLT Dt = Ty == LLT::fixed_vector(2, 64) ? LLT::fixed_vector(4, 32) : Ty;
+    auto Zeros = MIRBuilder.buildConstant(Dt, 0);
+    auto Ones = MIRBuilder.buildConstant(VTy, 1);
+    MachineInstrBuilder SUM;
----------------
tgymnich wrote:

fixed

https://github.com/llvm/llvm-project/pull/96409


More information about the llvm-commits mailing list