[llvm] a746594 - [AMDGPU] Skip terminators when forcing emit zero flag (#112116)

via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 14 08:46:21 PDT 2024


Author: Shilei Tian
Date: 2024-10-14T11:46:18-04:00
New Revision: a74659445dc33a495345c2418e8d01811a1627c7

URL: https://github.com/llvm/llvm-project/commit/a74659445dc33a495345c2418e8d01811a1627c7
DIFF: https://github.com/llvm/llvm-project/commit/a74659445dc33a495345c2418e8d01811a1627c7.diff

LOG: [AMDGPU] Skip terminators when forcing emit zero flag (#112116)

When the force-emit-zero flag is set, the forced zero wait must not be
applied to the terminators of an MBB; otherwise a waitcnt would be emitted
between terminators and break the terminator sequence of the MBB.
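
As an illustration only, the invariant at stake can be sketched with a toy
model in plain C++. Nothing here is LLVM API: the Inst struct and the
functions terminatorsAreContiguous and forceZeroWaits are hypothetical names,
and the model covers only the forced-zero path, not waits that the pass
computes as actually required.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine instruction (not llvm::MachineInstr).
struct Inst {
  std::string Name;
  bool IsTerminator;
};

// A block is well formed when no non-terminator follows a terminator,
// i.e. all terminators form one contiguous run at the end of the block.
bool terminatorsAreContiguous(const std::vector<Inst> &Block) {
  bool SeenTerminator = false;
  for (const Inst &I : Block) {
    if (I.IsTerminator)
      SeenTerminator = true;
    else if (SeenTerminator)
      return false; // a non-terminator landed after a terminator
  }
  return true;
}

// Toy version of the forced-zero path: place a zero wait before every
// instruction, optionally skipping terminators as the patch does.
std::vector<Inst> forceZeroWaits(const std::vector<Inst> &Block,
                                 bool SkipTerminators) {
  std::vector<Inst> Out;
  for (const Inst &I : Block) {
    if (!(SkipTerminators && I.IsTerminator))
      Out.push_back({"S_WAITCNT 0", false});
    Out.push_back(I);
  }
  return Out;
}
```

In this model, a block holding only S_CBRANCH_SCC1 followed by S_BRANCH (like
bb.0 in the added test) fails the contiguity check when SkipTerminators is
false and passes it, unchanged, when SkipTerminators is true.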

Added: 
    llvm/test/CodeGen/AMDGPU/waitcnt-debug-non-first-terminators.mir

Modified: 
    llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index 1070c14f7a987f..8f1757db8a85f5 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -1824,7 +1824,9 @@ bool SIInsertWaitcnts::generateWaitcntInstBefore(MachineInstr &MI,
   // Verify that the wait is actually needed.
   ScoreBrackets.simplifyWaitcnt(Wait);
 
-  if (ForceEmitZeroFlag)
+  // When forcing emit, we need to skip terminators because that would break the
+  // terminators of the MBB if we emit a waitcnt between terminators.
+  if (ForceEmitZeroFlag && !MI.isTerminator())
     Wait = WCG->getAllZeroWaitcnt(/*IncludeVSCnt=*/false);
 
   if (ForceEmitWaitcnt[LOAD_CNT])

diff  --git a/llvm/test/CodeGen/AMDGPU/waitcnt-debug-non-first-terminators.mir b/llvm/test/CodeGen/AMDGPU/waitcnt-debug-non-first-terminators.mir
new file mode 100644
index 00000000000000..bccf266117a7d2
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/waitcnt-debug-non-first-terminators.mir
@@ -0,0 +1,33 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -run-pass si-insert-waitcnts -amdgpu-waitcnt-forcezero=1 %s -o - | FileCheck %s
+
+---
+name: waitcnt-debug-non-first-terminators
+liveins:
+machineFunctionInfo:
+  isEntryFunction: true
+body:             |
+  ; CHECK-LABEL: name: waitcnt-debug-non-first-terminators
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_SCC1 %bb.1, implicit $scc
+  ; CHECK-NEXT:   S_BRANCH %bb.2, implicit $scc
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_WAITCNT 0
+  ; CHECK-NEXT:   S_NOP 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   S_WAITCNT 0
+  ; CHECK-NEXT:   S_NOP 0
+  bb.0:
+    S_CBRANCH_SCC1 %bb.1, implicit $scc
+    S_BRANCH %bb.2, implicit $scc
+  bb.1:
+    S_NOP 0
+  bb.2:
+    S_NOP 0
+...


        

