[llvm] [AMDGPU] Enable GCNRewritePartialRegUses pass by default. (PR #72975)
Petar Avramovic via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 21 05:59:40 PST 2023
================
@@ -50,15 +50,17 @@ define amdgpu_ps i128 @extractelement_vgpr_v4i128_sgpr_idx(ptr addrspace(1) %ptr
; GFX9-NEXT: s_set_gpr_idx_on s2, gpr_idx(SRC0)
; GFX9-NEXT: v_mov_b32_e32 v0, v2
; GFX9-NEXT: v_mov_b32_e32 v1, v3
-; GFX9-NEXT: v_mov_b32_e32 v18, v2
; GFX9-NEXT: s_set_gpr_idx_off
; GFX9-NEXT: v_readfirstlane_b32 s0, v0
+; GFX9-NEXT: s_set_gpr_idx_on s2, gpr_idx(SRC0)
+; GFX9-NEXT: v_mov_b32_e32 v0, v2
+; GFX9-NEXT: s_set_gpr_idx_off
; GFX9-NEXT: v_readfirstlane_b32 s1, v1
; GFX9-NEXT: s_set_gpr_idx_on s2, gpr_idx(SRC0)
-; GFX9-NEXT: v_mov_b32_e32 v3, v3
+; GFX9-NEXT: v_mov_b32_e32 v1, v3
; GFX9-NEXT: s_set_gpr_idx_off
-; GFX9-NEXT: v_readfirstlane_b32 s2, v18
-; GFX9-NEXT: v_readfirstlane_b32 s3, v3
+; GFX9-NEXT: v_readfirstlane_b32 s2, v0
+; GFX9-NEXT: v_readfirstlane_b32 s3, v1
----------------
petar-avramovic wrote:
V_INDIRECT_REG_READ_GPR_IDX_B32_V16 is expanded into a bundle that starts with S_SET_GPR_IDX_ON and ends with S_SET_GPR_IDX_OFF.
If there is nothing in between two such bundles, si-pre-emit-peephole will remove the S_SET_GPR_IDX_OFF/S_SET_GPR_IDX_ON pair.
What happens here is that machine-scheduler schedules the V_READFIRSTLANE_B32 instructions between the V_INDIRECT_REG_READ_GPR_IDX_B32_V16 instructions differently (there are more moves later).
We could get an even better result if there were no readfirstlanes between the V_INDIRECT_REG_READ_GPR_IDX_B32_V16 instructions (one pair of s_set_gpr_idx_on / s_set_gpr_idx_off):
s_set_gpr_idx_on s2, gpr_idx(SRC0)
v_mov_b32_e32 v0, v2
v_mov_b32_e32 v1, v3
v_mov_b32_e32 v18, v2
v_mov_b32_e32 v3, v3
s_set_gpr_idx_off
v_readfirstlane_b32 s0, v0
v_readfirstlane_b32 s1, v1
v_readfirstlane_b32 s2, v18
v_readfirstlane_b32 s3, v3
I see this as a different issue.
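
To make the folding described above concrete, here is a toy model in Python of dropping a back-to-back s_set_gpr_idx_off / s_set_gpr_idx_on pair. This is only a sketch and not the actual si-pre-emit-peephole code: the function name fold_gpr_idx_pairs is hypothetical, instructions are plain strings, and it compares the operand text only, so it ignores details such as the index register being redefined between the two bundles.

```python
def fold_gpr_idx_pairs(insts):
    """Toy model: drop an s_set_gpr_idx_off that is immediately followed by an
    s_set_gpr_idx_on identical to the one currently in effect."""
    out = []
    active_on = None  # text of the s_set_gpr_idx_on currently in effect, if any
    i = 0
    while i < len(insts):
        inst = insts[i]
        nxt = insts[i + 1] if i + 1 < len(insts) else None
        if inst == "s_set_gpr_idx_off" and nxt is not None and nxt == active_on:
            # Redundant off/on pair: skip both, indexing mode simply stays on.
            i += 2
            continue
        if inst.startswith("s_set_gpr_idx_on"):
            active_on = inst
        elif inst == "s_set_gpr_idx_off":
            active_on = None
        out.append(inst)
        i += 1
    return out


# Two bundles with nothing in between: the inner off/on pair is folded away.
adjacent = [
    "s_set_gpr_idx_on s2, gpr_idx(SRC0)",
    "v_mov_b32_e32 v0, v2",
    "s_set_gpr_idx_off",
    "s_set_gpr_idx_on s2, gpr_idx(SRC0)",
    "v_mov_b32_e32 v1, v3",
    "s_set_gpr_idx_off",
]

# A readfirstlane scheduled between the bundles: nothing can be folded.
separated = [
    "s_set_gpr_idx_on s2, gpr_idx(SRC0)",
    "v_mov_b32_e32 v0, v2",
    "s_set_gpr_idx_off",
    "v_readfirstlane_b32 s0, v0",
    "s_set_gpr_idx_on s2, gpr_idx(SRC0)",
    "v_mov_b32_e32 v1, v3",
    "s_set_gpr_idx_off",
]

for name, seq in (("adjacent", adjacent), ("separated", separated)):
    print(name)
    for line in fold_gpr_idx_pairs(seq):
        print("  " + line)
```

In the second sequence the v_readfirstlane_b32 sits between the two bundles, which is the situation in the GFX9 check lines of this diff, so both on/off pairs survive.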
https://github.com/llvm/llvm-project/pull/72975