[clang] [llvm] Match bitsin(typeof(x)) - popcnt(x) to s_bcnt0_i32 on AMDGPU (PR #164847)
Patrick Simmons via cfe-commits
cfe-commits at lists.llvm.org
Wed Oct 29 09:39:33 PDT 2025
================
@@ -465,17 +463,16 @@ define amdgpu_ps i32 @bcnt032(i32 inreg %val0) {
define amdgpu_ps i32 @bcnt064(i64 inreg %val0) {
; CHECK-LABEL: bcnt064:
; CHECK: ; %bb.0:
-; CHECK-NEXT: s_bcnt1_i32_b64 s0, s[0:1]
-; CHECK-NEXT: s_sub_u32 s0, 64, s0
-; CHECK-NEXT: s_subb_u32 s1, 0, 0
-; CHECK-NEXT: s_cmp_lg_u64 s[0:1], 0
-; CHECK-NEXT: ;;#ASMSTART
-; CHECK-NEXT: ; use s[0:1]
-; CHECK-NEXT: ;;#ASMEND
-; CHECK-NEXT: s_cselect_b64 s[0:1], -1, 0
-; CHECK-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[0:1]
-; CHECK-NEXT: v_readfirstlane_b32 s0, v0
-; CHECK-NEXT: ; return to shader part epilog
+; CHECK-NEXT: s_bcnt0_i32_b64 s0, s[0:1]
+; CHECK-NEXT: s_mov_b32 s1, 0
+; CHECK-NEXT: s_cmp_lg_u64 s[0:1], 0
----------------
linuxrocks123 wrote:
@jayfoad yes, I noticed this too. I think what we really should do here is reverse the order of the s_mov_b32 and s_bcnt0_i32_b64 instructions. If we did that, we could eliminate the s_cmp_lg_u64 entirely, since the BCNT already updates SCC.
I think that work belongs in a separate PR, as a pass that runs on the Machine IR shortly after ISel.
https://github.com/llvm/llvm-project/pull/164847