[llvm] [AMDGPU] Enable GCNRewritePartialRegUses pass by default. (PR #72975)

Petar Avramovic via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 21 05:59:40 PST 2023


================
@@ -50,15 +50,17 @@ define amdgpu_ps i128 @extractelement_vgpr_v4i128_sgpr_idx(ptr addrspace(1) %ptr
 ; GFX9-NEXT:    s_set_gpr_idx_on s2, gpr_idx(SRC0)
 ; GFX9-NEXT:    v_mov_b32_e32 v0, v2
 ; GFX9-NEXT:    v_mov_b32_e32 v1, v3
-; GFX9-NEXT:    v_mov_b32_e32 v18, v2
 ; GFX9-NEXT:    s_set_gpr_idx_off
 ; GFX9-NEXT:    v_readfirstlane_b32 s0, v0
+; GFX9-NEXT:    s_set_gpr_idx_on s2, gpr_idx(SRC0)
+; GFX9-NEXT:    v_mov_b32_e32 v0, v2
+; GFX9-NEXT:    s_set_gpr_idx_off
 ; GFX9-NEXT:    v_readfirstlane_b32 s1, v1
 ; GFX9-NEXT:    s_set_gpr_idx_on s2, gpr_idx(SRC0)
-; GFX9-NEXT:    v_mov_b32_e32 v3, v3
+; GFX9-NEXT:    v_mov_b32_e32 v1, v3
 ; GFX9-NEXT:    s_set_gpr_idx_off
-; GFX9-NEXT:    v_readfirstlane_b32 s2, v18
-; GFX9-NEXT:    v_readfirstlane_b32 s3, v3
+; GFX9-NEXT:    v_readfirstlane_b32 s2, v0
+; GFX9-NEXT:    v_readfirstlane_b32 s3, v1
----------------
petar-avramovic wrote:

V_INDIRECT_REG_READ_GPR_IDX_B32_V16 is expanded into a bundle that starts with S_SET_GPR_IDX_ON and ends with S_SET_GPR_IDX_OFF.
If there is nothing in between two such bundles, si-pre-emit-peephole will remove the S_SET_GPR_IDX_OFF/S_SET_GPR_IDX_ON pair that separates them.
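
As a rough illustration of that rule (not the actual si-pre-emit-peephole code, which operates on MachineInstrs), here is a toy Python model over a list of assembly strings. The function name and the string-based representation are assumptions made for this example, and the model only merges when the off is immediately followed by an on with identical operands.

```python
def merge_adjacent_gpr_idx_regions(insts):
    """Toy model: drop an s_set_gpr_idx_off that is immediately followed by
    an s_set_gpr_idx_on identical to the one that opened the current region."""
    out = []
    current_on = None  # the s_set_gpr_idx_on that opened the active region
    i = 0
    while i < len(insts):
        cur = insts[i]
        nxt = insts[i + 1] if i + 1 < len(insts) else None
        if cur == "s_set_gpr_idx_off" and nxt is not None and nxt == current_on:
            # Nothing sits between the two regions: skip the off/on pair
            # and keep the region open.
            i += 2
            continue
        if cur.startswith("s_set_gpr_idx_on "):
            current_on = cur
        elif cur == "s_set_gpr_idx_off":
            current_on = None
        out.append(cur)
        i += 1
    return out


# Two back-to-back regions collapse into one.
back_to_back = [
    "s_set_gpr_idx_on s2, gpr_idx(SRC0)",
    "v_mov_b32_e32 v0, v2",
    "s_set_gpr_idx_off",
    "s_set_gpr_idx_on s2, gpr_idx(SRC0)",
    "v_mov_b32_e32 v1, v3",
    "s_set_gpr_idx_off",
]
merged = merge_adjacent_gpr_idx_regions(back_to_back)

# A readfirstlane scheduled between the regions blocks the merge.
separated = back_to_back[:3] + ["v_readfirstlane_b32 s0, v0"] + back_to_back[3:]
unmerged = merge_adjacent_gpr_idx_regions(separated)
```

In this model `merged` keeps a single on/off pair around both moves, while `unmerged` is returned unchanged because the readfirstlane separates the off from the next on.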

What happens here is that the machine-scheduler now places the V_READFIRSTLANE_B32 instructions between the V_INDIRECT_REG_READ_GPR_IDX_B32_V16 instructions in a different way (there are more moves later).

We could get an even better result if there were no readfirstlanes between the V_INDIRECT_REG_READ_GPR_IDX_B32_V16 instructions (a single s_set_gpr_idx_on / s_set_gpr_idx_off pair):

	s_set_gpr_idx_on s2, gpr_idx(SRC0)
	v_mov_b32_e32 v0, v2
	v_mov_b32_e32 v1, v3
	v_mov_b32_e32 v18, v2
	v_mov_b32_e32 v3, v3
	s_set_gpr_idx_off
	v_readfirstlane_b32 s0, v0
	v_readfirstlane_b32 s1, v1
	v_readfirstlane_b32 s2, v18
	v_readfirstlane_b32 s3, v3

I see this as a different issue.

https://github.com/llvm/llvm-project/pull/72975

