[llvm] [AMDGPU] Insert waitcnt for non-global fence release in GFX12 (PR #159282)

Fabian Ritter via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 19 06:13:55 PDT 2025


================
@@ -810,14 +827,18 @@ define amdgpu_kernel void @agent_seq_cst_fence() {
 ;
 ; GFX12-WGP-LABEL: agent_seq_cst_fence:
 ; GFX12-WGP:       ; %bb.0: ; %entry
+; GFX12-WGP-NEXT:    s_wait_dscnt 0x0
 ; GFX12-WGP-NEXT:    s_endpgm
 ;
 ; GFX12-CU-LABEL: agent_seq_cst_fence:
 ; GFX12-CU:       ; %bb.0: ; %entry
+; GFX12-CU-NEXT:    s_wait_dscnt 0x0
 ; GFX12-CU-NEXT:    s_endpgm
 ;
 ; GFX1250-LABEL: agent_seq_cst_fence:
 ; GFX1250:       ; %bb.0: ; %entry
+; GFX1250-NEXT:    global_wb scope:SCOPE_DEV
----------------
ritter-x2a wrote:

Updated the PR so that no superfluous writebacks are introduced.

https://github.com/llvm/llvm-project/pull/159282

