[llvm-branch-commits] [llvm] [AMDGPU][GlobalISel] Add COPY_SCC_VCC combine for VCC-SGPR-VGPR pattern (PR #179352)
Petar Avramovic via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Tue Feb 24 02:45:52 PST 2026
================
@@ -5,9 +5,8 @@ define amdgpu_ps void @test_fpclass_zext(float inreg %x, i32 %z, ptr addrspace(1
; CHECK-LABEL: test_fpclass_zext:
; CHECK: ; %bb.0:
; CHECK-NEXT: v_cmp_class_f32_e64 s[0:1], s2, 3
-; CHECK-NEXT: s_cmp_lg_u64 s[0:1], 0
-; CHECK-NEXT: s_cselect_b32 s0, 1, 0
-; CHECK-NEXT: v_add_u32_e32 v0, s0, v0
+; CHECK-NEXT: v_cndmask_b32_e64 v3, 0, 1, s[0:1]
+; CHECK-NEXT: v_add_u32_e32 v0, v3, v0
----------------
petar-avramovic wrote:
Ok, let's start weighing whether this combine is worth it, and in which cases.
For example, here I would say it is questionable.
You trade
2 SALU inst
for
1 VALU inst + 1 extra vgpr
In general, I think we need some attempt to estimate the number of new vgprs needed after the combine.
For example:
eliminating readanylane is -1 vgpr
change to VALU + deleting copy to vgpr is most probably +1 vgpr (most users of vgpr can also use sgpr)
eliminating constant (even though you make extra copy) is +0 vgprs
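
To make the bookkeeping above concrete, here is a minimal standalone sketch of such an estimate. It is not code from the PR or from LLVM: the names `CombineEffect`, `vgprDelta`, `estimateVgprDelta` and `isCombineWorthIt` are hypothetical, the per-effect deltas are just the three rough numbers listed above, and the "no net VGPR increase" policy is an assumption for illustration only.

```cpp
#include <vector>

// Hypothetical categories, one per case listed in the comment above.
enum class CombineEffect {
  EliminateReadAnyLane, // removes a readanylane: -1 VGPR
  ValuPlusDeletedCopy,  // switch to VALU and delete the copy to VGPR: probably +1 VGPR
  EliminateConstant     // removes a constant (even with an extra copy): +0 VGPRs
};

// Rough VGPR delta for a single effect, using the numbers from the comment.
int vgprDelta(CombineEffect E) {
  switch (E) {
  case CombineEffect::EliminateReadAnyLane:
    return -1;
  case CombineEffect::ValuPlusDeletedCopy:
    return 1;
  case CombineEffect::EliminateConstant:
    return 0;
  }
  return 0;
}

// Sum the deltas of everything one application of the combine would do.
int estimateVgprDelta(const std::vector<CombineEffect> &Effects) {
  int Total = 0;
  for (CombineEffect E : Effects)
    Total += vgprDelta(E);
  return Total;
}

// Assumed policy (not from the PR): accept only if VGPR usage does not grow.
bool isCombineWorthIt(const std::vector<CombineEffect> &Effects) {
  return estimateVgprDelta(Effects) <= 0;
}
```

Under these assumptions, the test_fpclass_zext case above is a single `ValuPlusDeletedCopy` effect, so the estimate is +1 vgpr and the sketch would reject the combine there. A real heuristic would presumably also weigh the 2 SALU inst saved against the 1 VALU inst added.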
https://github.com/llvm/llvm-project/pull/179352