[llvm] [AMDGPU] Clear kills for aliasing registers after forming V_CMPX (PR #68675)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 10 01:42:23 PDT 2023


https://github.com/jayfoad created https://github.com/llvm/llvm-project/pull/68675

This fixes machine verification problems when V_CMPX formation
effectively moves a "V_CMP vgprN" instruction past other instructions
that may kill a subregister or superregister of vgprN.

Fixes #68221


>From 0840605903b25957ec7ed651c88572fa03b4fc41 Mon Sep 17 00:00:00 2001
From: Jay Foad <jay.foad at amd.com>
Date: Tue, 10 Oct 2023 09:37:23 +0100
Subject: [PATCH] [AMDGPU] Clear kills for aliasing registers after forming
 V_CMPX

This fixes machine verification problems when V_CMPX formation
effectively moves a "V_CMP vgprN" instruction past other instructions
that may kill a subregister or superregister of vgprN.

Fixes #68221
---
 .../Target/AMDGPU/SIOptimizeExecMasking.cpp   | 12 ++++--
 .../vcmp-saveexec-to-vcmpx-set-kill.mir       | 41 +++++++++++++++++++
 2 files changed, 49 insertions(+), 4 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx-set-kill.mir

diff --git a/llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp b/llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp
index 04c9a6457944c5f..4730cec8d9abceb 100644
--- a/llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp
+++ b/llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp
@@ -594,10 +594,14 @@ bool SIOptimizeExecMasking::optimizeVCMPSaveExecSequence(
   TryAddImmediateValueFromNamedOperand(AMDGPU::OpName::clamp);
 
   // The kill flags may no longer be correct.
-  if (Src0->isReg())
-    MRI->clearKillFlags(Src0->getReg());
-  if (Src1->isReg())
-    MRI->clearKillFlags(Src1->getReg());
+  if (Src0->isReg()) {
+    for (MCRegAliasIterator I(Src0->getReg(), TRI, true); I.isValid(); ++I)
+      MRI->clearKillFlags(*I);
+  }
+  if (Src1->isReg()) {
+    for (MCRegAliasIterator I(Src1->getReg(), TRI, true); I.isValid(); ++I)
+      MRI->clearKillFlags(*I);
+  }
 
   SaveExecInstr.eraseFromParent();
   VCmp.eraseFromParent();
diff --git a/llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx-set-kill.mir b/llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx-set-kill.mir
new file mode 100644
index 000000000000000..0b9f824775cc3c9
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx-set-kill.mir
@@ -0,0 +1,41 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1100 -run-pass si-optimize-exec-masking -verify-machineinstrs %s -o - | FileCheck %s
+
+---
+name: main
+tracksRegLiveness: true
+body:             |
+  ; CHECK-LABEL: name: main
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT:   liveins: $vgpr0_vgpr1, $sgpr0_sgpr1_sgpr2_sgpr3
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   BUFFER_STORE_DWORDX2_OFFSET_exact $vgpr0_vgpr1, killed $sgpr0_sgpr1_sgpr2_sgpr3, 0, 344, 1, 0, implicit $exec
+  ; CHECK-NEXT:   $sgpr0 = S_MOV_B32 $exec_lo
+  ; CHECK-NEXT:   V_CMPX_EQ_U32_nosdst_e64 0, $vgpr0, implicit-def $exec, implicit $exec
+  ; CHECK-NEXT:   S_CBRANCH_EXECZ %bb.2, implicit $exec
+  ; CHECK-NEXT:   S_BRANCH %bb.1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_ENDPGM 0
+  bb.0:
+    successors: %bb.1, %bb.2
+    liveins: $vgpr0_vgpr1, $sgpr0_sgpr1_sgpr2_sgpr3
+
+    $vcc_lo = V_CMP_EQ_U32_e64 0, $vgpr0, implicit $exec
+    BUFFER_STORE_DWORDX2_OFFSET_exact killed $vgpr0_vgpr1, killed $sgpr0_sgpr1_sgpr2_sgpr3, 0, 344, 1, 0, implicit $exec
+    $sgpr0 = COPY $exec_lo, implicit-def $exec_lo
+    $sgpr0 = S_AND_B32 killed $sgpr0, killed $vcc_lo, implicit-def dead $scc
+    $exec_lo = S_MOV_B32_term killed $sgpr0
+    S_CBRANCH_EXECZ %bb.2, implicit $exec
+    S_BRANCH %bb.1
+
+  bb.1:
+
+  bb.2:
+    S_ENDPGM 0
+...



More information about the llvm-commits mailing list