[clang] [llvm] [AMDGPU] Change CF intrinsics lowering to reconverge on predecessors (PR #108596)
via cfe-commits
cfe-commits at lists.llvm.org
Wed Oct 16 03:20:57 PDT 2024
================
@@ -446,8 +474,10 @@ define amdgpu_kernel void @add_i32_uniform(ptr addrspace(1) %out, ptr addrspace(
; GFX11W64-NEXT: ; implicit-def: $vgpr1
; GFX11W64-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
; GFX11W64-NEXT: v_mbcnt_hi_u32_b32 v0, s5, v0
-; GFX11W64-NEXT: v_cmpx_eq_u32_e32 0, v0
-; GFX11W64-NEXT: s_cbranch_execz .LBB1_2
+; GFX11W64-NEXT: v_cmp_eq_u32_e32 vcc, 0, v0
+; GFX11W64-NEXT: s_cmp_lg_u64 vcc, 0
+; GFX11W64-NEXT: s_cmov_b64 exec, vcc
+; GFX11W64-NEXT: s_cbranch_scc0 .LBB1_2
----------------
alex-t wrote:
No, we cannot. S_AND_SAVEEXEC changes EXEC unconditionally.
The idea is that we change EXEC only if we are going to execute the "Then" block, and leave it unchanged for the Flow block. All further lowering is based on this assumption.
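
To illustrate the difference, here is a minimal Python sketch that models EXEC, VCC and SCC as plain integers and booleans. The function names `s_and_saveexec_b64` and `cmp_cmov_sequence` are hypothetical helpers for this illustration, not LLVM or hardware APIs, and the model assumes VCC already holds zeros for inactive lanes (so it is a subset of EXEC after the `v_cmp`).

```python
def s_and_saveexec_b64(exec_mask: int, src: int):
    """Simplified model: save EXEC, then overwrite it unconditionally."""
    saved = exec_mask
    new_exec = exec_mask & src
    scc = new_exec != 0
    return saved, new_exec, scc


def cmp_cmov_sequence(exec_mask: int, vcc: int):
    """Simplified model of the sequence in the diff above:
    s_cmp_lg_u64 vcc, 0   -> SCC = (VCC != 0)
    s_cmov_b64 exec, vcc  -> EXEC = VCC only if SCC is set
    s_cbranch_scc0 ...    -> caller branches to Flow when SCC is clear
    """
    scc = vcc != 0
    new_exec = vcc if scc else exec_mask
    return new_exec, scc


# No lane takes the "Then" block: the condition mask is empty.
exec_before = 0xF
_, exec_saveexec, _ = s_and_saveexec_b64(exec_before, 0x0)
exec_cmov, scc_cmov = cmp_cmov_sequence(exec_before, 0x0)

# Some lanes take the "Then" block: both approaches narrow EXEC the same way.
_, exec_saveexec_taken, _ = s_and_saveexec_b64(exec_before, 0x1)
exec_cmov_taken, scc_cmov_taken = cmp_cmov_sequence(exec_before, 0x1)
```

In this model, when the condition mask is empty, `s_and_saveexec_b64` leaves EXEC at zero, whereas the compare-and-cmov sequence leaves EXEC untouched and only clears SCC, which is the property the Flow block relies on.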
https://github.com/llvm/llvm-project/pull/108596