[llvm] [AMDGPU] Skip terminators when forcing emit zero flag (PR #112116)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 14 06:55:21 PDT 2024


================
@@ -0,0 +1,33 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -run-pass si-insert-waitcnts -amdgpu-waitcnt-forcezero=1 %s -o - | FileCheck %s
+
+---
+name: waitcnt-debug-non-first-terminators
+liveins:
+machineFunctionInfo:
+  isEntryFunction: true
+body:             |
+  ; CHECK-LABEL: name: waitcnt-debug-non-first-terminators
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_CBRANCH_SCC1 %bb.1, implicit $scc
+  ; CHECK-NEXT:   S_BRANCH %bb.2, implicit $scc
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_WAITCNT 0
----------------
arsenm wrote:

I see this version moves the waitcnt from the end of the block to the start of the successor blocks.

https://github.com/llvm/llvm-project/pull/112116

