[llvm] [AMDGPU] Add another test for missing S_WAIT_XCNT (PR #161838)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 8 01:02:02 PDT 2025
================
@@ -945,6 +945,46 @@ body: |
$vgpr0 = V_MOV_B32_e32 0, implicit $exec
...
+# FIXME: Missing S_WAIT_XCNT before overwriting vgpr0.
+---
+name: wait_kmcnt_with_outstanding_vmem_2
+tracksRegLiveness: true
+machineFunctionInfo:
+ isEntryFunction: true
+body: |
+ ; GCN-LABEL: name: wait_kmcnt_with_outstanding_vmem_2
+ ; GCN: bb.0:
+ ; GCN-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000)
+ ; GCN-NEXT: liveins: $vgpr0_vgpr1, $sgpr0_sgpr1, $scc
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: $sgpr2 = S_LOAD_DWORD_IMM $sgpr0_sgpr1, 0, 0
+ ; GCN-NEXT: S_CBRANCH_SCC1 %bb.2, implicit $scc
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: bb.1:
+ ; GCN-NEXT: successors: %bb.2(0x80000000)
+ ; GCN-NEXT: liveins: $vgpr0_vgpr1, $sgpr2
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: $vgpr2 = GLOBAL_LOAD_DWORD $vgpr0_vgpr1, 0, 0, implicit $exec
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: bb.2:
+ ; GCN-NEXT: liveins: $sgpr2
+ ; GCN-NEXT: {{ $}}
+ ; GCN-NEXT: S_WAIT_KMCNT 0
+ ; GCN-NEXT: $sgpr2 = S_MOV_B32 $sgpr2
+ ; GCN-NEXT: $vgpr0 = V_MOV_B32_e32 0, implicit $exec
+ bb.0:
+ liveins: $vgpr0_vgpr1, $sgpr0_sgpr1, $scc
+ $sgpr2 = S_LOAD_DWORD_IMM $sgpr0_sgpr1, 0, 0
+ S_CBRANCH_SCC1 %bb.2, implicit $scc
+ bb.1:
+ liveins: $vgpr0_vgpr1, $sgpr2
+ $vgpr2 = GLOBAL_LOAD_DWORD $vgpr0_vgpr1, 0, 0, implicit $exec
+ bb.2:
----------------
easyonaadit wrote:
Unrelated to this particular test case,
Does the hardware insert an implicit Xcnt between VMEM and SMEM which span basic blocks?
In this case, does the `GLOBAL_LOAD` imply an implicit XCNT for the `S_LOAD` has been inserted?
What about cases where control flow doesn't end up in `bb.1`? Should the compiler be conservative and insert an Xcnt at the start of a basic block with outstanding VMEM/SMEM events?
https://github.com/llvm/llvm-project/pull/161838
More information about the llvm-commits
mailing list