[clang] [llvm] [AMDGPU] Change CF intrinsics lowering to reconverge on predecessors (PR #108596)

Nicolai Hähnle via cfe-commits cfe-commits at lists.llvm.org
Mon Sep 16 05:50:27 PDT 2024


================
@@ -196,8 +208,10 @@ define amdgpu_kernel void @add_i32_constant(ptr addrspace(1) %out, ptr addrspace
 ; GFX11W32-NEXT:    v_mbcnt_lo_u32_b32 v0, s1, 0
 ; GFX11W32-NEXT:    ; implicit-def: $vgpr1
 ; GFX11W32-NEXT:    s_delay_alu instid0(VALU_DEP_1)
-; GFX11W32-NEXT:    v_cmpx_eq_u32_e32 0, v0
-; GFX11W32-NEXT:    s_cbranch_execz .LBB0_2
+; GFX11W32-NEXT:    v_cmp_eq_u32_e32 vcc_lo, 0, v0
----------------
nhaehnle wrote:

That doesn't actually help much, because now we have an additional stall before the next VALU instruction.

https://github.com/llvm/llvm-project/pull/108596


More information about the cfe-commits mailing list