[llvm] AMDGPU: Replace copy-to-mov-imm folding logic with class compat checks (PR #154501)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 26 04:05:22 PDT 2025


================
@@ -383,14 +384,15 @@ define amdgpu_ps half @v_interp_f16_imm_params(float inreg %i, float inreg %j) #
 ;
 ; GFX12-TRUE16-LABEL: v_interp_f16_imm_params:
 ; GFX12-TRUE16:       ; %bb.0: ; %main_body
-; GFX12-TRUE16-NEXT:    v_mov_b16_e32 v0.l, 0
-; GFX12-TRUE16-NEXT:    v_dual_mov_b32 v1, s0 :: v_dual_mov_b32 v2, 0
+; GFX12-TRUE16-NEXT:    v_dual_mov_b32 v1, 0 :: v_dual_mov_b32 v2, s0
 ; GFX12-TRUE16-NEXT:    v_mov_b32_e32 v3, s1
-; GFX12-TRUE16-NEXT:    s_delay_alu instid0(VALU_DEP_2) | instskip(NEXT) | instid1(VALU_DEP_2)
-; GFX12-TRUE16-NEXT:    v_interp_p10_f16_f32 v1, v0.l, v1, v0.l wait_exp:7
-; GFX12-TRUE16-NEXT:    v_interp_p2_f16_f32 v0.l, v0.l, v3, v2 wait_exp:7
 ; GFX12-TRUE16-NEXT:    s_delay_alu instid0(VALU_DEP_2) | instskip(NEXT) | instid1(VALU_DEP_1)
-; GFX12-TRUE16-NEXT:    v_cvt_f16_f32_e32 v0.h, v1
+; GFX12-TRUE16-NEXT:    v_mov_b16_e32 v0.l, v1.l
+; GFX12-TRUE16-NEXT:    v_interp_p10_f16_f32 v2, v0.l, v2, v0.l wait_exp:7
+; GFX12-TRUE16-NEXT:    s_delay_alu instid0(VALU_DEP_3) | instskip(NEXT) | instid1(VALU_DEP_2)
+; GFX12-TRUE16-NEXT:    v_interp_p2_f16_f32 v0.l, v0.l, v3, v1 wait_exp:7
+; GFX12-TRUE16-NEXT:    v_cvt_f16_f32_e32 v0.h, v2
+; GFX12-TRUE16-NEXT:    s_delay_alu instid0(VALU_DEP_1)
----------------
arsenm wrote:

This is not the concern of immediate folding.

https://github.com/llvm/llvm-project/pull/154501

