[llvm] [AMDGPU] Use `S_BFE_U64` for uniform i1 ext (PR #69703)
Pierre van Houtryve via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 25 00:39:18 PDT 2023
================
@@ -29,10 +29,11 @@ define amdgpu_kernel void @saddo_i64_zext(ptr addrspace(1) %out, i64 %a, i64 %b)
; SI-NEXT: s_mov_b32 s0, s4
; SI-NEXT: s_mov_b32 s1, s5
; SI-NEXT: s_xor_b64 s[4:5], s[6:7], vcc
-; SI-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
-; SI-NEXT: v_mov_b32_e32 v1, s11
-; SI-NEXT: v_add_i32_e32 v0, vcc, s10, v0
-; SI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc
+; SI-NEXT: s_bfe_u64 s[4:5], s[4:5], 0x10000
+; SI-NEXT: s_add_u32 s4, s10, s4
+; SI-NEXT: s_addc_u32 s5, s11, s5
+; SI-NEXT: v_mov_b32_e32 v0, s4
+; SI-NEXT: v_mov_b32_e32 v1, s5
----------------
Pierre-vh wrote:
GISel still gets this one right because it marks a uniform `G_ICMP` as using a VGPR reg bank if it knows it won't use SCC. So for the example above, GISel would use V_CNDMASK + V_READFIRSTLANE, but if the condcode is eq/ne it'll use the S_BFE form.
I'm leaning towards fixing this in SIFoldOperands instead. I'm starting to think this isn't fixable in DAGISel because we have no way to say "this is a VGPR, but it's uniform". What do you think?
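
For readers unfamiliar with the `0x10000` immediate in the new `s_bfe_u64` line: a minimal sketch of what that operand appears to encode, assuming the usual scalar BFE layout (offset in bits [5:0], width in bits [22:16]). `sBfeU64` is a hypothetical helper written for illustration, not an LLVM API, and it does not model every hardware corner case.

```cpp
#include <cstdint>

// Hypothetical model of a scalar unsigned 64-bit bitfield extract.
// Assumption: control bits [5:0] hold the offset, bits [22:16] the width.
constexpr uint64_t sBfeU64(uint64_t src, uint32_t control) {
  const uint32_t offset = control & 0x3F;
  const uint32_t width = (control >> 16) & 0x7F;
  if (width == 0)
    return 0; // an empty field extracts nothing
  const uint64_t shifted = src >> offset;
  if (width >= 64)
    return shifted; // field covers everything above the offset
  return shifted & ((uint64_t{1} << width) - 1);
}

// 0x10000 decodes to offset 0, width 1: bit 0 zero-extended to 64 bits.
static_assert(sBfeU64(0xFFFFFFFFFFFFFFFFull, 0x10000) == 1, "bit 0 set");
static_assert(sBfeU64(0x0ull, 0x10000) == 0, "bit 0 clear");
```

Under that assumption, the instruction turns the 64-bit condition value in s[4:5] into a 0/1 i64 that the following `s_add_u32`/`s_addc_u32` pair can consume directly.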
https://github.com/llvm/llvm-project/pull/69703