[llvm] [AMDGPU] Clear kills for aliasing registers after forming V_CMPX (PR #68675)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 10 01:42:23 PDT 2023
https://github.com/jayfoad created https://github.com/llvm/llvm-project/pull/68675
This fixes machine verification problems when V_CMPX formation
effectively moves a "V_CMP vgprN" instruction past other instructions
that may kill a subregister or superregister of vgprN.
Fixes #68221
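
As a rough illustration of why clearing kill flags on the exact register is not enough, here is a standalone toy model. It does not use LLVM's API: `RegMask`, `Operand`, `Instr`, `clearKillsExact`, `clearKillsAliasing`, `countKills` and `makeBlock` are hypothetical names invented for this sketch, and registers are modelled simply as bitmasks of the 32-bit units they cover.

```cpp
// Toy model only: every type and function here is hypothetical, not LLVM API.
#include <cassert>
#include <cstdint>
#include <vector>

// A register is a bitmask of the 32-bit units it covers.
using RegMask = std::uint32_t;
constexpr RegMask VGPR0 = 0x1;        // one unit
constexpr RegMask VGPR0_VGPR1 = 0x3;  // superregister covering vgpr0 and vgpr1
constexpr RegMask SGPR0_3 = 0xF00;    // unrelated 4-unit register

struct Operand {
  RegMask Reg;
  bool IsKill;
};

struct Instr {
  std::vector<Operand> Uses;
};

// Two registers alias when they share at least one unit.
inline bool aliases(RegMask A, RegMask B) { return (A & B) != 0; }

// Mimics the old behaviour: only uses of exactly Reg lose their kill flag.
void clearKillsExact(std::vector<Instr> &Block, RegMask Reg) {
  assert(Reg != 0 && "expected a non-empty register");
  for (Instr &I : Block)
    for (Operand &Op : I.Uses)
      if (Op.Reg == Reg)
        Op.IsKill = false;
}

// Mimics the new behaviour: uses of any register aliasing Reg
// (including Reg itself) lose their kill flag.
void clearKillsAliasing(std::vector<Instr> &Block, RegMask Reg) {
  assert(Reg != 0 && "expected a non-empty register");
  for (Instr &I : Block)
    for (Operand &Op : I.Uses)
      if (aliases(Op.Reg, Reg))
        Op.IsKill = false;
}

// Number of operands in the block still marked as killed.
int countKills(const std::vector<Instr> &Block) {
  int N = 0;
  for (const Instr &I : Block)
    for (const Operand &Op : I.Uses)
      N += Op.IsKill ? 1 : 0;
  return N;
}

// One store-like instruction that kills vgpr0_vgpr1 and an SGPR tuple,
// loosely following the shape of the MIR test in this patch.
std::vector<Instr> makeBlock() {
  return {Instr{{Operand{VGPR0_VGPR1, true}, Operand{SGPR0_3, true}}}};
}
```

In this model, calling `clearKillsExact(Block, VGPR0)` on the result of `makeBlock()` leaves the kill of `VGPR0_VGPR1` in place, so a compare reading `VGPR0` that is sunk below the store would read a killed register. `clearKillsAliasing(Block, VGPR0)` drops that kill while keeping the unrelated SGPR kill.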
From 0840605903b25957ec7ed651c88572fa03b4fc41 Mon Sep 17 00:00:00 2001
From: Jay Foad <jay.foad at amd.com>
Date: Tue, 10 Oct 2023 09:37:23 +0100
Subject: [PATCH] [AMDGPU] Clear kills for aliasing registers after forming
V_CMPX
This fixes machine verification problems when V_CMPX formation
effectively moves a "V_CMP vgprN" instruction past other instructions
that may kill a subregister or superregister of vgprN.
Fixes #68221
---
.../Target/AMDGPU/SIOptimizeExecMasking.cpp | 12 ++++--
.../vcmp-saveexec-to-vcmpx-set-kill.mir | 41 +++++++++++++++++++
2 files changed, 49 insertions(+), 4 deletions(-)
create mode 100644 llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx-set-kill.mir
diff --git a/llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp b/llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp
index 04c9a6457944c5f..4730cec8d9abceb 100644
--- a/llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp
+++ b/llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp
@@ -594,10 +594,14 @@ bool SIOptimizeExecMasking::optimizeVCMPSaveExecSequence(
TryAddImmediateValueFromNamedOperand(AMDGPU::OpName::clamp);
// The kill flags may no longer be correct.
- if (Src0->isReg())
- MRI->clearKillFlags(Src0->getReg());
- if (Src1->isReg())
- MRI->clearKillFlags(Src1->getReg());
+ if (Src0->isReg()) {
+ for (MCRegAliasIterator I(Src0->getReg(), TRI, true); I.isValid(); ++I)
+ MRI->clearKillFlags(*I);
+ }
+ if (Src1->isReg()) {
+ for (MCRegAliasIterator I(Src1->getReg(), TRI, true); I.isValid(); ++I)
+ MRI->clearKillFlags(*I);
+ }
SaveExecInstr.eraseFromParent();
VCmp.eraseFromParent();
diff --git a/llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx-set-kill.mir b/llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx-set-kill.mir
new file mode 100644
index 000000000000000..0b9f824775cc3c9
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx-set-kill.mir
@@ -0,0 +1,41 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1100 -run-pass si-optimize-exec-masking -verify-machineinstrs %s -o - | FileCheck %s
+
+---
+name: main
+tracksRegLiveness: true
+body: |
+ ; CHECK-LABEL: name: main
+ ; CHECK: bb.0:
+ ; CHECK-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
+ ; CHECK-NEXT: liveins: $vgpr0_vgpr1, $sgpr0_sgpr1_sgpr2_sgpr3
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: BUFFER_STORE_DWORDX2_OFFSET_exact $vgpr0_vgpr1, killed $sgpr0_sgpr1_sgpr2_sgpr3, 0, 344, 1, 0, implicit $exec
+ ; CHECK-NEXT: $sgpr0 = S_MOV_B32 $exec_lo
+ ; CHECK-NEXT: V_CMPX_EQ_U32_nosdst_e64 0, $vgpr0, implicit-def $exec, implicit $exec
+ ; CHECK-NEXT: S_CBRANCH_EXECZ %bb.2, implicit $exec
+ ; CHECK-NEXT: S_BRANCH %bb.1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1:
+ ; CHECK-NEXT: successors: %bb.2(0x80000000)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.2:
+ ; CHECK-NEXT: S_ENDPGM 0
+ bb.0:
+ successors: %bb.1, %bb.2
+ liveins: $vgpr0_vgpr1, $sgpr0_sgpr1_sgpr2_sgpr3
+
+ $vcc_lo = V_CMP_EQ_U32_e64 0, $vgpr0, implicit $exec
+ BUFFER_STORE_DWORDX2_OFFSET_exact killed $vgpr0_vgpr1, killed $sgpr0_sgpr1_sgpr2_sgpr3, 0, 344, 1, 0, implicit $exec
+ $sgpr0 = COPY $exec_lo, implicit-def $exec_lo
+ $sgpr0 = S_AND_B32 killed $sgpr0, killed $vcc_lo, implicit-def dead $scc
+ $exec_lo = S_MOV_B32_term killed $sgpr0
+ S_CBRANCH_EXECZ %bb.2, implicit $exec
+ S_BRANCH %bb.1
+
+ bb.1:
+
+ bb.2:
+ S_ENDPGM 0
+...