[llvm] [AMDGPU] SIPeepholeSDWA: Handle V_CNDMASK_B32_e64 (PR #137930)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 30 02:25:36 PDT 2025
================
@@ -0,0 +1,48 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc %s -mtriple=amdgcn -mcpu=gfx900 -run-pass=si-peephole-sdwa -o - | FileCheck -check-prefix=gfx9 %s
+
+# Test conversion of V_CNDMASK_B32 to VOPC for enabling further conversion to SDWA.
+# For this, the definition of the src2 carry-in operand must be changed to write
+# to VCC.
+
+---
+name: v_vselect_v2bf16
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $vgpr0, $vgpr1, $vgpr2
+
+ ; gfx9-LABEL: name: v_vselect_v2bf16
+ ; gfx9: liveins: $vgpr0, $vgpr1, $vgpr2
+ ; gfx9-NEXT: {{ $}}
+ ; gfx9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr2
+ ; gfx9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
+ ; gfx9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr0
+ ; gfx9-NEXT: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 1, [[COPY1]], implicit $exec
+ ; gfx9-NEXT: [[V_AND_B32_e64_1:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 1, [[COPY2]], implicit $exec
+ ; gfx9-NEXT: [[V_CMP_EQ_U32_e64_:%[0-9]+]]:sreg_64_xexec = V_CMP_EQ_U32_e64 killed [[V_AND_B32_e64_1]], 1, implicit $exec
+ ; gfx9-NEXT: [[V_CNDMASK_B32_e64_:%[0-9]+]]:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, [[COPY]], killed [[V_CMP_EQ_U32_e64_]], implicit $exec
+ ; gfx9-NEXT: [[V_LSHRREV_B32_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B32_e64 16, [[COPY]], implicit $exec
+ ; gfx9-NEXT: $vcc = V_CMP_EQ_U32_e64 killed [[V_AND_B32_e64_]], 1, implicit $exec
+ ; gfx9-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+ ; gfx9-NEXT: [[V_CNDMASK_B32_sdwa:%[0-9]+]]:vgpr_32 = V_CNDMASK_B32_sdwa 0, [[V_MOV_B32_e32_]], 0, [[COPY]], 0, 6, 0, 6, 5, implicit $vcc, implicit $exec
+ ; gfx9-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 84148480
+ ; gfx9-NEXT: [[V_PERM_B32_e64_:%[0-9]+]]:vgpr_32 = V_PERM_B32_e64 killed [[V_CNDMASK_B32_sdwa]], killed [[V_CNDMASK_B32_e64_]], killed [[S_MOV_B32_]], implicit $exec
+ ; gfx9-NEXT: $vgpr0 = COPY [[V_PERM_B32_e64_]]
+ ; gfx9-NEXT: SI_RETURN implicit $vgpr0
+ %10:vgpr_32 = COPY $vgpr2
+ %9:vgpr_32 = COPY $vgpr1
+ %8:vgpr_32 = COPY $vgpr0
+ %11:vgpr_32 = V_AND_B32_e64 1, %9, implicit $exec
+ %12:sreg_64_xexec = V_CMP_EQ_U32_e64 killed %11, 1, implicit $exec
+ %13:vgpr_32 = V_AND_B32_e64 1, %8, implicit $exec
+ %14:sreg_64_xexec = V_CMP_EQ_U32_e64 killed %13, 1, implicit $exec
+ %17:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, %10, killed %14, implicit $exec
+ %20:vgpr_32 = V_LSHRREV_B32_e64 16, %10, implicit $exec
+ %22:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, %20, killed %12, implicit $exec
+ %24:sreg_32 = S_MOV_B32 84148480
+ %25:vgpr_32 = V_PERM_B32_e64 killed %22, killed %17, killed %24, implicit $exec
----------------
arsenm wrote:
Remove the unrelated instructions
https://github.com/llvm/llvm-project/pull/137930
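
A reduced input for this test might look like the sketch below. It rests on an assumption about which instructions the comment treats as unrelated: it keeps only the chain feeding the second V_CNDMASK_B32_e64 (the one whose src1 comes from V_LSHRREV_B32_e64 and which appears as V_CNDMASK_B32_sdwa in the check lines) and drops the first select, the S_MOV_B32, and the V_PERM_B32_e64. The trailing COPY and SI_RETURN are modeled on the existing check lines, not taken from the quoted input. The check lines themselves are omitted here and would need to be regenerated with utils/update_mir_test_checks.py.

```mir
# Hypothetical reduction of the test input; not taken from the PR.
---
name: v_vselect_v2bf16
tracksRegLiveness: true
body: |
  bb.0:
    liveins: $vgpr1, $vgpr2

    ; Condition: low bit of $vgpr1 compared against 1.
    %10:vgpr_32 = COPY $vgpr2
    %9:vgpr_32 = COPY $vgpr1
    %11:vgpr_32 = V_AND_B32_e64 1, %9, implicit $exec
    %12:sreg_64_xexec = V_CMP_EQ_U32_e64 killed %11, 1, implicit $exec
    ; Select on the high 16 bits of %10; this is the SDWA candidate.
    %20:vgpr_32 = V_LSHRREV_B32_e64 16, %10, implicit $exec
    %22:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, %20, killed %12, implicit $exec
    ; Assumed terminator so that %22 stays live.
    $vgpr0 = COPY %22
    SI_RETURN implicit $vgpr0
...
```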