[llvm] [AMDGPU] Simplify and improve codegen for llvm.amdgcn.set.inactive (PR #107889)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 9 09:43:38 PDT 2024
================
@@ -816,10 +816,7 @@ define amdgpu_kernel void @global_atomic_fadd_uni_address_div_value_agent_scope_
; GFX9-DPP-NEXT: v_mbcnt_hi_u32_b32 v1, exec_hi, v1
; GFX9-DPP-NEXT: s_or_saveexec_b64 s[0:1], -1
; GFX9-DPP-NEXT: v_bfrev_b32_e32 v3, 1
-; GFX9-DPP-NEXT: v_bfrev_b32_e32 v4, 1
-; GFX9-DPP-NEXT: s_mov_b64 exec, s[0:1]
-; GFX9-DPP-NEXT: v_mov_b32_e32 v4, v0
-; GFX9-DPP-NEXT: s_mov_b64 exec, -1
+; GFX9-DPP-NEXT: v_cndmask_b32_e64 v4, v3, v0, s[0:1]
----------------
jayfoad wrote:
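Nice improvement here: the duplicate v_bfrev_b32 and the two exec writes around the v_mov_b32 collapse into a single v_cndmask_b32 that selects on the saved exec mask in s[0:1].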
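
As a rough per-lane model of why the replacement in this hunk is equivalent, the sketch below simulates both sequences for a 64-lane wave in Python. It is an illustration only, not code from the PR: the function names, the list-of-lanes register model, and the 64-lane assumption are all hypothetical, and only the instructions visible in the hunk are modelled.

```python
WAVE_SIZE = 64  # assumption: wave64, as implied by the 64-bit exec mask


def bfrev32(x):
    """Model of v_bfrev_b32: reverse the bits of a 32-bit value."""
    return int(f"{x & 0xFFFFFFFF:032b}"[::-1], 2)


def old_sequence(v0, saved_exec):
    """Removed lines: fill v4 in all lanes, then overwrite the active lanes."""
    # v_bfrev_b32_e32 v4, 1 (exec is all ones after s_or_saveexec_b64)
    v4 = [bfrev32(1)] * WAVE_SIZE
    # s_mov_b64 exec, s[0:1] ; v_mov_b32_e32 v4, v0
    for lane in range(WAVE_SIZE):
        if (saved_exec >> lane) & 1:
            v4[lane] = v0[lane]
    # s_mov_b64 exec, -1 (no effect on v4 in this model)
    return v4


def new_sequence(v0, saved_exec):
    """Added line: one per-lane select on the saved exec mask."""
    # v_bfrev_b32_e32 v3, 1 (already present before the hunk's change)
    v3 = [bfrev32(1)] * WAVE_SIZE
    # v_cndmask_b32_e64 v4, v3, v0, s[0:1]
    return [
        v0[lane] if (saved_exec >> lane) & 1 else v3[lane]
        for lane in range(WAVE_SIZE)
    ]


if __name__ == "__main__":
    v0 = list(range(WAVE_SIZE))
    mask = 0x00000000FFFF00FF  # arbitrary set of originally active lanes
    print(old_sequence(v0, mask) == new_sequence(v0, mask))
```

In this model the originally active lanes receive v0 and the inactive lanes receive the bit-reversed constant, whichever sequence is used.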
https://github.com/llvm/llvm-project/pull/107889