[llvm] [AMDGPU] Generate more swaps (PR #184164)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 3 07:45:58 PST 2026
================
@@ -7408,10 +7403,8 @@ define amdgpu_kernel void @sub_i64_varying(ptr addrspace(1) %out, ptr addrspace(
; GFX9_DPP-NEXT: v_mov_b32_e32 v10, v6
; GFX9_DPP-NEXT: v_subrev_co_u32_e32 v8, vcc, s10, v10
; GFX9_DPP-NEXT: v_subb_co_u32_e32 v9, vcc, v11, v0, vcc
-; GFX9_DPP-NEXT: v_mov_b32_e32 v6, v8
-; GFX9_DPP-NEXT: v_mov_b32_e32 v7, v9
-; GFX9_DPP-NEXT: v_mov_b32_e32 v8, v10
-; GFX9_DPP-NEXT: v_mov_b32_e32 v9, v11
+; GFX9_DPP-NEXT: v_swap_b32 v6, v8
+; GFX9_DPP-NEXT: v_swap_b32 v7, v9
----------------
jayfoad wrote:
The remaining movs (such as `v_mov_b32_e32 v10, v6`) are unnecessary now, because the subs could read directly from v6/v7 instead. But there's probably no good way to remove them without teaching the register allocator itself to generate swaps.
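
To illustrate why the two swaps in the hunk above can stand in for the four movs, here is a minimal Python sketch that models the registers as a dictionary. It is not LLVM code. It assumes that v10 and v11 hold copies of v6 and v7 when the sequence starts; the hunk only shows the copy into v10, so the v11 copy is an assumption. The function names are hypothetical.

```python
def run_movs(regs):
    """Old sequence: four v_mov_b32 copies, applied in order."""
    r = dict(regs)
    r["v6"] = r["v8"]
    r["v7"] = r["v9"]
    r["v8"] = r["v10"]  # v10 is assumed to hold the old value of v6
    r["v9"] = r["v11"]  # v11 is assumed to hold the old value of v7
    return r


def run_swaps(regs):
    """New sequence: two v_swap_b32 exchanges."""
    r = dict(regs)
    r["v6"], r["v8"] = r["v8"], r["v6"]
    r["v7"], r["v9"] = r["v9"], r["v7"]
    return r


# v8/v9 stand for the 64-bit subtraction result; the values are arbitrary.
state = {"v6": 1, "v7": 2, "v8": 30, "v9": 40, "v10": 1, "v11": 2}
same = run_movs(state) == run_swaps(state)

# If v10 were not a copy of v6, the two sequences would diverge.
stale = dict(state, v10=99)
differs = run_movs(stale) != run_swaps(stale)
```

Under that assumption both sequences leave v6..v9 in the same state, which is also why the copies into v10/v11 only exist to feed the subs.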
https://github.com/llvm/llvm-project/pull/184164