[llvm-branch-commits] [llvm] [AMDGPU][GlobalISel] Add COPY_SCC_VCC combine for VCC-SGPR-VGPR pattern (PR #179352)

Petar Avramovic via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Tue Feb 24 02:45:52 PST 2026


================
@@ -5,9 +5,8 @@ define amdgpu_ps void @test_fpclass_zext(float inreg %x, i32 %z, ptr addrspace(1
 ; CHECK-LABEL: test_fpclass_zext:
 ; CHECK:       ; %bb.0:
 ; CHECK-NEXT:    v_cmp_class_f32_e64 s[0:1], s2, 3
-; CHECK-NEXT:    s_cmp_lg_u64 s[0:1], 0
-; CHECK-NEXT:    s_cselect_b32 s0, 1, 0
-; CHECK-NEXT:    v_add_u32_e32 v0, s0, v0
+; CHECK-NEXT:    v_cndmask_b32_e64 v3, 0, 1, s[0:1]
+; CHECK-NEXT:    v_add_u32_e32 v0, v3, v0
----------------
petar-avramovic wrote:

OK, let's start weighing whether this combine is worth it, and in which cases.
For example, here I would say it is questionable.
You trade
2 SALU instructions (s_cmp_lg_u64 + s_cselect_b32)
for
1 VALU instruction (v_cndmask_b32_e64) + 1 extra VGPR (v3)

In general, I think we need some attempt to estimate the number of new VGPRs needed after the combine.

For example:
eliminating a readanylane is -1 VGPR
changing to VALU and deleting the copy to VGPR is most probably +1 VGPR (most users of a VGPR can also use an SGPR)
eliminating a constant (even though you make an extra copy) is +0 VGPRs
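
To make the bookkeeping above concrete, here is a rough sketch of how such an estimate could be tallied. Everything in it is hypothetical: `CombineEffect`, `estimatedVgprDelta`, `CombineEstimate` and `isClearlyProfitable` are illustrative names, not existing LLVM APIs, and the acceptance rule (net instructions saved must exceed the VGPRs added) is only an assumed placeholder, not a policy proposed in this review.

```cpp
// Hypothetical sketch only; none of these names exist in LLVM.

// The three cases listed in the review comment.
enum class CombineEffect {
  EliminateReadAnyLane, // a VGPR is no longer needed
  ValuReplacesVgprCopy, // SALU result now lives in a VGPR
  EliminateConstant,    // extra copy, but no extra VGPR
};

// Estimated change in VGPR count, using the numbers from the comment.
constexpr int estimatedVgprDelta(CombineEffect E) {
  switch (E) {
  case CombineEffect::EliminateReadAnyLane:
    return -1;
  case CombineEffect::ValuReplacesVgprCopy:
    return +1; // "most probably", per the comment
  case CombineEffect::EliminateConstant:
    return 0;
  }
  return 0;
}

// Instruction and register bookkeeping for one candidate combine.
struct CombineEstimate {
  int SaluRemoved;
  int ValuAdded;
  int VgprDelta;
};

// Assumed placeholder policy: the combine is clearly worth it only if the
// net number of instructions saved exceeds the number of VGPRs it adds.
constexpr bool isClearlyProfitable(CombineEstimate E) {
  return (E.SaluRemoved - E.ValuAdded) > E.VgprDelta;
}

// The test_fpclass_zext case: 2 SALU removed, 1 VALU added, +1 VGPR.
constexpr CombineEstimate FpclassZext{
    2, 1, estimatedVgprDelta(CombineEffect::ValuReplacesVgprCopy)};

// Saves one instruction but costs one VGPR, so it is not clearly profitable.
static_assert(!isClearlyProfitable(FpclassZext),
              "test_fpclass_zext should come out as questionable");
```

Under this assumed rule the test_fpclass_zext change comes out as "not clearly profitable", which matches the "questionable" verdict above. A real heuristic would likely also need to take register pressure at that point in the function into account.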

https://github.com/llvm/llvm-project/pull/179352

