[llvm] [AMDGPU] Use `S_BFE_U64` for uniform i1 ext (PR #69703)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 25 04:44:58 PDT 2023


================
@@ -29,10 +29,11 @@ define amdgpu_kernel void @saddo_i64_zext(ptr addrspace(1) %out, i64 %a, i64 %b)
 ; SI-NEXT:    s_mov_b32 s0, s4
 ; SI-NEXT:    s_mov_b32 s1, s5
 ; SI-NEXT:    s_xor_b64 s[4:5], s[6:7], vcc
-; SI-NEXT:    v_cndmask_b32_e64 v0, 0, 1, s[4:5]
-; SI-NEXT:    v_mov_b32_e32 v1, s11
-; SI-NEXT:    v_add_i32_e32 v0, vcc, s10, v0
-; SI-NEXT:    v_addc_u32_e32 v1, vcc, 0, v1, vcc
+; SI-NEXT:    s_bfe_u64 s[4:5], s[4:5], 0x10000
+; SI-NEXT:    s_add_u32 s4, s10, s4
+; SI-NEXT:    s_addc_u32 s5, s11, s5
+; SI-NEXT:    v_mov_b32_e32 v0, s4
+; SI-NEXT:    v_mov_b32_e32 v1, s5
----------------
jayfoad wrote:

> I'm leaning towards fixing this in SIFoldOperands instead. I'm starting to think this isn't fixable in DAGISel because we have no way to say "this is a VGPR, but it's uniform". What do you think?

I really don't like adding more complexity to SIFoldOperands to fix up poor instruction selection, when it should have been possible to do better selection in the first place. But I'm not sure how to make progress on the current patch either.

https://github.com/llvm/llvm-project/pull/69703
