[llvm] [AMDGPU] Use `S_BFE_U64` for uniform i1 ext (PR #69703)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 20 05:08:03 PDT 2023


================
@@ -29,10 +29,11 @@ define amdgpu_kernel void @saddo_i64_zext(ptr addrspace(1) %out, i64 %a, i64 %b)
 ; SI-NEXT:    s_mov_b32 s0, s4
 ; SI-NEXT:    s_mov_b32 s1, s5
 ; SI-NEXT:    s_xor_b64 s[4:5], s[6:7], vcc
-; SI-NEXT:    v_cndmask_b32_e64 v0, 0, 1, s[4:5]
-; SI-NEXT:    v_mov_b32_e32 v1, s11
-; SI-NEXT:    v_add_i32_e32 v0, vcc, s10, v0
-; SI-NEXT:    v_addc_u32_e32 v1, vcc, 0, v1, vcc
+; SI-NEXT:    s_bfe_u64 s[4:5], s[4:5], 0x10000
+; SI-NEXT:    s_add_u32 s4, s10, s4
+; SI-NEXT:    s_addc_u32 s5, s11, s5
+; SI-NEXT:    v_mov_b32_e32 v0, s4
+; SI-NEXT:    v_mov_b32_e32 v1, s5
----------------
jayfoad wrote:

No, that's not a good way to think about it. Uniformity is something the compiler can prove statically about the values in active lanes. It says nothing at all about what is happening in inactive lanes - they are outside of its domain.
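
As a rough illustration of that definition (a toy runtime model, not how the compiler's static analysis works), a value can be treated as uniform when every active lane holds the same value, with inactive lanes ignored entirely. The names `is_uniform`, `values` and `exec_mask` are hypothetical and exist only for this sketch:

```python
def is_uniform(values, exec_mask):
    """Return True if all *active* lanes hold the same value.

    values    -- per-lane values, index = lane number (hypothetical model)
    exec_mask -- integer bitmask, bit N set means lane N is active
    """
    # Collect values from active lanes only; inactive lanes are out of scope.
    active = [v for lane, v in enumerate(values) if (exec_mask >> lane) & 1]
    # Zero or one distinct value among the active lanes counts as uniform.
    return len(set(active)) <= 1


# Lane 2 holds a different value, but it is inactive, so it does not matter.
values = [7, 7, 99, 7]
uniform_with_lane2_inactive = is_uniform(values, 0b1011)

# With lane 2 active, the same per-lane values are no longer uniform.
uniform_with_all_active = is_uniform(values, 0b1111)
```

In this model the contents of lane 2 have no bearing on the answer while it is inactive, which is the sense in which inactive lanes are outside the domain of uniformity.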

https://github.com/llvm/llvm-project/pull/69703

