[llvm] [AMDGPU] Delete redundant s_or_b32 (PR #165261)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 4 18:50:56 PST 2025
================
@@ -2277,3 +2277,93 @@ body: |
S_ENDPGM 0
...
+
+---
+name: s_cselect_b64_s_or_b32_s_cmp_lg_u32_0x00000000
+body: |
+ ; GCN-LABEL: name: s_cselect_b64_s_or_b32_s_cmp_lg_u32_0x00000000
+ ; GCN: bb.0:
+ ; GCN-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
+ ; GCN-NEXT: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
+ ; GCN-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY [[DEF]]
+ ; GCN-NEXT: S_CMP_LG_U32 [[COPY]], 0, implicit-def $scc
+ ; GCN-NEXT: [[S_CSELECT_B64_:%[0-9]+]]:sreg_64_xexec = S_CSELECT_B64 -1, 0, implicit $scc
+ ; GCN-NEXT: [[COPY1:%[0-9]+]]:sreg_32_xm0_xexec = COPY [[S_CSELECT_B64_]].sub0
+ ; GCN-NEXT: [[COPY2:%[0-9]+]]:sreg_32_xm0_xexec = COPY [[S_CSELECT_B64_]].sub1
+ ; GCN-NEXT: S_CBRANCH_SCC0 %bb.2, implicit $scc
+ ; GCN-NEXT: S_BRANCH %bb.1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: bb.1:
+ ; GCN-NEXT: successors: %bb.2(0x80000000)
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: bb.2:
+ ; GCN-NEXT: S_ENDPGM 0
+ bb.0:
+ successors: %bb.1(0x40000000), %bb.2(0x40000000)
+ liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+ %0:vgpr_32 = IMPLICIT_DEF
+ %2:sreg_32 = COPY %0
+ S_CMP_LG_U32 %2, 0, implicit-def $scc
+ %31:sreg_64_xexec = S_CSELECT_B64 -1, 0, implicit $scc
+ %40:sreg_32_xm0_xexec = COPY %31.sub0:sreg_64_xexec
+ %41:sreg_32_xm0_xexec = COPY %31.sub1:sreg_64_xexec
+ %sgpr4:sreg_32 = S_OR_B32 %40:sreg_32_xm0_xexec, %41:sreg_32_xm0_xexec, implicit-def $scc
+ S_CMP_LG_U32 %sgpr4, 0, implicit-def $scc
+ S_CBRANCH_SCC0 %bb.2, implicit $scc
+ S_BRANCH %bb.1
+
+ bb.1:
+ successors: %bb.2(0x80000000)
+
+ bb.2:
+ S_ENDPGM 0
+
+...
+
+---
+# Do not delete s_or_b32 since both operands are sub1.
+name: s_cselect_b64_s_or_b32_s_cmp_lg_u32_0x00000000_cant_optimize
+body: |
+ ; GCN-LABEL: name: s_cselect_b64_s_or_b32_s_cmp_lg_u32_0x00000000_cant_optimize
+ ; GCN: bb.0:
+ ; GCN-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
+ ; GCN-NEXT: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
+ ; GCN-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY [[DEF]]
+ ; GCN-NEXT: S_CMP_LG_U32 [[COPY]], 0, implicit-def $scc
+ ; GCN-NEXT: [[S_CSELECT_B64_:%[0-9]+]]:sreg_64_xexec = S_CSELECT_B64 1, 0, implicit $scc
+ ; GCN-NEXT: [[COPY1:%[0-9]+]]:sreg_32_xm0_xexec = COPY [[S_CSELECT_B64_]].sub1
+ ; GCN-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_CSELECT_B64_]].sub1
+ ; GCN-NEXT: %sgpr4:sreg_32 = S_OR_B32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GCN-NEXT: S_CBRANCH_SCC0 %bb.2, implicit $scc
+ ; GCN-NEXT: S_BRANCH %bb.1
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: bb.1:
+ ; GCN-NEXT: successors: %bb.2(0x80000000)
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: bb.2:
+ ; GCN-NEXT: S_ENDPGM 0
+ bb.0:
+ successors: %bb.1(0x40000000), %bb.2(0x40000000)
+ liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
+ %0:vgpr_32 = IMPLICIT_DEF
+ %2:sreg_32 = COPY %0
+ S_CMP_LG_U32 %2, 0, implicit-def $scc
+ %31:sreg_64_xexec = S_CSELECT_B64 1, 0, implicit $scc
+ %40:sreg_32_xm0_xexec = COPY %31.sub1:sreg_64_xexec
+ %41:sreg_32 = COPY %31.sub1:sreg_64_xexec
+ %sgpr4:sreg_32 = S_OR_B32 %40:sreg_32_xm0_xexec, %41:sreg_32, implicit-def $scc
+ S_CMP_LG_U32 %sgpr4, 0, implicit-def $scc
+ S_CBRANCH_SCC0 %bb.2, implicit $scc
+ S_BRANCH %bb.1
+
+ bb.1:
+ successors: %bb.2(0x80000000)
+
+ bb.2:
+ S_ENDPGM 0
+
+...
----------------
arsenm wrote:
Can you test an undef use? (I do eventually want to fix the machine verifier to disallow getVRegDef failing on virtual registers in SSA.)
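A minimal sketch of what such a test could look like, following the shape of the tests in this diff (the name, register numbers, and block structure here are illustrative assumptions, not part of the PR): the undef flag on the second S_OR_B32 operand leaves that virtual register with no defining instruction, which is exactly the getVRegDef case mentioned above.

---
# Sketch only: exercise an undef use on one S_OR_B32 operand, so a
# getVRegDef lookup on it finds no defining instruction. Name and
# registers are hypothetical.
name: s_or_b32_undef_use
body: |
  bb.0:
    successors: %bb.1(0x40000000), %bb.2(0x40000000)

    %0:sreg_32 = IMPLICIT_DEF
    ; %2 is only read as undef and is never defined
    %1:sreg_32 = S_OR_B32 %0, undef %2:sreg_32_xm0_xexec, implicit-def $scc
    S_CMP_LG_U32 %1, 0, implicit-def $scc
    S_CBRANCH_SCC0 %bb.2, implicit $scc
    S_BRANCH %bb.1

  bb.1:
    successors: %bb.2(0x80000000)

  bb.2:
    S_ENDPGM 0
...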
https://github.com/llvm/llvm-project/pull/165261