[llvm] [AMDGPU] SIPeepholeSDWA: Handle V_CNDMASK_B32_e64 (PR #137930)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 30 02:25:36 PDT 2025


================
@@ -0,0 +1,48 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc %s -mtriple=amdgcn -mcpu=gfx900 -run-pass=si-peephole-sdwa -o - | FileCheck -check-prefix=gfx9 %s
+
+# Test conversion of V_CNDMASK_B32 to VOPC for enabling further conversion to SDWA.
+# For this, the definition of the src2 carry-in operand must be changed to write
+# to VCC.
+
+---
+name:            v_vselect_v2bf16
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $vgpr0, $vgpr1, $vgpr2
+
+    ; gfx9-LABEL: name: v_vselect_v2bf16
+    ; gfx9: liveins: $vgpr0, $vgpr1, $vgpr2
+    ; gfx9-NEXT: {{  $}}
+    ; gfx9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr2
+    ; gfx9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
+    ; gfx9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr0
+    ; gfx9-NEXT: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 1, [[COPY1]], implicit $exec
+    ; gfx9-NEXT: [[V_AND_B32_e64_1:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 1, [[COPY2]], implicit $exec
+    ; gfx9-NEXT: [[V_CMP_EQ_U32_e64_:%[0-9]+]]:sreg_64_xexec = V_CMP_EQ_U32_e64 killed [[V_AND_B32_e64_1]], 1, implicit $exec
+    ; gfx9-NEXT: [[V_CNDMASK_B32_e64_:%[0-9]+]]:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, [[COPY]], killed [[V_CMP_EQ_U32_e64_]], implicit $exec
+    ; gfx9-NEXT: [[V_LSHRREV_B32_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B32_e64 16, [[COPY]], implicit $exec
+    ; gfx9-NEXT: $vcc = V_CMP_EQ_U32_e64 killed [[V_AND_B32_e64_]], 1, implicit $exec
+    ; gfx9-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+    ; gfx9-NEXT: [[V_CNDMASK_B32_sdwa:%[0-9]+]]:vgpr_32 = V_CNDMASK_B32_sdwa 0, [[V_MOV_B32_e32_]], 0, [[COPY]], 0, 6, 0, 6, 5, implicit $vcc, implicit $exec
+    ; gfx9-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 84148480
+    ; gfx9-NEXT: [[V_PERM_B32_e64_:%[0-9]+]]:vgpr_32 = V_PERM_B32_e64 killed [[V_CNDMASK_B32_sdwa]], killed [[V_CNDMASK_B32_e64_]], killed [[S_MOV_B32_]], implicit $exec
+    ; gfx9-NEXT: $vgpr0 = COPY [[V_PERM_B32_e64_]]
+    ; gfx9-NEXT: SI_RETURN implicit $vgpr0
+    %10:vgpr_32 = COPY $vgpr2
+    %9:vgpr_32 = COPY $vgpr1
+    %8:vgpr_32 = COPY $vgpr0
+    %11:vgpr_32 = V_AND_B32_e64 1, %9, implicit $exec
+    %12:sreg_64_xexec = V_CMP_EQ_U32_e64 killed %11, 1, implicit $exec
+    %13:vgpr_32 = V_AND_B32_e64 1, %8, implicit $exec
+    %14:sreg_64_xexec = V_CMP_EQ_U32_e64 killed %13, 1, implicit $exec
+    %17:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, %10, killed %14, implicit $exec
+    %20:vgpr_32 = V_LSHRREV_B32_e64 16, %10, implicit $exec
+    %22:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, %20, killed %12, implicit $exec
+    %24:sreg_32 = S_MOV_B32 84148480
+    %25:vgpr_32 = V_PERM_B32_e64 killed %22, killed %17, killed %24, implicit $exec
----------------
arsenm wrote:

Remove the unrelated instructions.

https://github.com/llvm/llvm-project/pull/137930

