[llvm] [AMDGPU] always emit a soft wait even if it is trivially ~0 (PR #147257)
Sameer Sahasrabuddhe via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 9 01:24:17 PDT 2025
================
@@ -669,6 +679,7 @@ define amdgpu_kernel void @global_volatile_store_1(
; GFX12-WGP-NEXT: s_wait_kmcnt 0x0
; GFX12-WGP-NEXT: s_wait_storecnt 0x0
; GFX12-WGP-NEXT: global_store_b32 v0, v1, s[0:1] scope:SCOPE_SYS
+; GFX12-WGP-NEXT: s_wait_loadcnt 0x3f
----------------
ssahasra wrote:
This is now cleaned up where possible. SIInsertWaitcnts will clean up any wait count that is at its maximum value, even with optimizations enabled. The limitation is that we can't simply assume the legalizer meant such a count to be ~0, so instead we rely on simplifyWaitcnts() to do the correct thing by comparing scores.
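
To illustrate what "comparing scores" means here, the following is a minimal standalone sketch, not the actual SIInsertWaitcnts code. The names `ScoreBracket`, `pending()`, `simplifyWait()` and `NoWait` are hypothetical and exist only for this example. The assumption is that each counter tracks a lower and an upper score bound, and that a wait is redundant when its count is at least the number of events that can still be outstanding.

```cpp
#include <cassert>

// Hypothetical, simplified model of the pending-event bracket for one counter.
struct ScoreBracket {
  unsigned LowerBound; // score up to which all events are known to be complete
  unsigned UpperBound; // score of the most recently issued event

  // Number of events that may still be in flight.
  unsigned pending() const {
    assert(UpperBound >= LowerBound && "bracket bounds out of order");
    return UpperBound - LowerBound;
  }
};

// Sentinel meaning "no wait is required" (the ~0 discussed above).
constexpr unsigned NoWait = ~0u;

// Returns the count to keep, or NoWait when the wait can never block because
// no more than Count events can be outstanding.
inline unsigned simplifyWait(const ScoreBracket &B, unsigned Count) {
  return Count >= B.pending() ? NoWait : Count;
}
```

In this model, a soft wait carrying the maximum count (such as the 0x3f in the hunk above) becomes NoWait whenever the bracket shows no more than that many pending events. The decision comes from the tracked scores and not from the encoded value itself.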
https://github.com/llvm/llvm-project/pull/147257