[llvm] [AMDGPU] Simplify and improve codegen for llvm.amdgcn.set.inactive (PR #107889)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 9 09:43:38 PDT 2024
================
@@ -178,6 +177,9 @@ define amdgpu_cs void @cfg(ptr addrspace(8) inreg %tmp14, i32 %arg) {
; GFX9-O0-NEXT: v_mov_b32_e32 v0, v4
; GFX9-O0-NEXT: s_or_saveexec_b64 s[0:1], -1
; GFX9-O0-NEXT: v_mov_b32_e32 v1, 0
+; GFX9-O0-NEXT: s_mov_b64 exec, s[0:1]
+; GFX9-O0-NEXT: ; implicit-def: $sgpr0_sgpr1
+; GFX9-O0-NEXT: s_or_saveexec_b64 s[0:1], -1
----------------
jayfoad wrote:
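Unfortunate regression here: we now restore exec from s[0:1] and then immediately redo the same s_or_saveexec_b64, even though exec was already all ones at that point. I'm not sure how to fix it. Maybe we need a peephole that removes redundant exec mask changes.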
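
To make the peephole idea concrete, here is a rough sketch of the pattern it would have to match. This is only a toy model over assembly text in Python, not an LLVM MachineFunction pass, and every name in it (`drop_redundant_exec_toggles`, `dest_operand`, the regexes) is hypothetical. It assumes the first operand of an instruction is its only destination, that comment lines are inert, and that nothing reads SCC after the removed `s_or_saveexec_b64` (which also writes SCC); a real implementation would have to check all of that properly.

```python
import re

# Toy patterns for the two instructions in the hunk above (assumed syntax).
SAVE_RE = re.compile(r"^s_or_saveexec_b64\s+(\S+),\s*-1$")
RESTORE_RE = re.compile(r"^s_mov_b64\s+exec,\s*(\S+)$")


def is_inert(line):
    """Blank lines and ';' comments do not change any register."""
    text = line.strip()
    return not text or text.startswith(";")


def dest_operand(line):
    """Return the first operand, treated here as the only destination."""
    parts = line.strip().split(None, 1)
    if len(parts) < 2:
        return None
    return parts[1].split(",")[0].strip()


def drop_redundant_exec_toggles(lines):
    """Remove 'restore exec from R' + 's_or_saveexec_b64 R, -1' pairs when
    exec is already known to be all ones with the old value saved in R."""
    out = []
    saved_reg = None  # register holding the saved exec while exec == -1
    i = 0
    while i < len(lines):
        line = lines[i]
        text = line.strip()
        if is_inert(line):
            out.append(line)
            i += 1
            continue
        restore = RESTORE_RE.match(text)
        if restore and restore.group(1) == saved_reg:
            # Look past comments for an identical re-save.
            j = i + 1
            while j < len(lines) and is_inert(lines[j]):
                j += 1
            if j < len(lines):
                save = SAVE_RE.match(lines[j].strip())
                if save and save.group(1) == saved_reg:
                    # The pair leaves exec and R unchanged: drop both,
                    # keep any comments that sat between them.
                    out.extend(lines[i + 1:j])
                    i = j + 1
                    continue
        save = SAVE_RE.match(text)
        if save:
            saved_reg = save.group(1)
        elif dest_operand(text) in ("exec", saved_reg):
            # exec or the save register was overwritten: forget the state.
            saved_reg = None
        out.append(line)
        i += 1
    return out


if __name__ == "__main__":
    hunk = [
        "v_mov_b32_e32 v0, v4",
        "s_or_saveexec_b64 s[0:1], -1",
        "v_mov_b32_e32 v1, 0",
        "s_mov_b64 exec, s[0:1]",
        "; implicit-def: $sgpr0_sgpr1",
        "s_or_saveexec_b64 s[0:1], -1",
    ]
    print("\n".join(drop_redundant_exec_toggles(hunk)))
```

On the hunk above this would drop the added `s_mov_b64` / `s_or_saveexec_b64` pair and keep the rest. The removal is only valid because the earlier `s_or_saveexec_b64 s[0:1], -1` already left exec at all ones with the old mask in s[0:1]; without that known state the sketch leaves the code alone.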
https://github.com/llvm/llvm-project/pull/107889