[llvm] [AMDGPU] Fold into uses of splat REG_SEQUENCEs through COPYs. (PR #145691)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 26 01:01:26 PDT 2025
================
@@ -2155,11 +2155,8 @@ define amdgpu_kernel void @fadd_fadd_fsub_0(<2 x float> %arg) {
; GFX90A-GISEL-LABEL: fadd_fadd_fsub_0:
; GFX90A-GISEL: ; %bb.0: ; %bb
; GFX90A-GISEL-NEXT: s_load_dwordx2 s[0:1], s[4:5], 0x24
-; GFX90A-GISEL-NEXT: s_mov_b32 s2, 0
-; GFX90A-GISEL-NEXT: s_mov_b32 s3, s2
-; GFX90A-GISEL-NEXT: v_pk_mov_b32 v[0:1], s[2:3], s[2:3] op_sel:[0,1]
; GFX90A-GISEL-NEXT: s_waitcnt lgkmcnt(0)
-; GFX90A-GISEL-NEXT: v_pk_add_f32 v[0:1], s[0:1], v[0:1]
+; GFX90A-GISEL-NEXT: v_pk_add_f32 v[0:1], s[0:1], 0
----------------
arsenm wrote:
The only test change is a single GlobalISel case, which hints that this is just hiding a missed optimization that should have happened earlier in the pipeline. Do you have another example where this is useful?
https://github.com/llvm/llvm-project/pull/145691