[llvm] [AMDGPU][SIInsertWaitCnts] Gfx12.5 - Refactor xcnt optimization (PR #164357)

Ryan Mitchell via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 6 06:56:19 PST 2025


RyanRio wrote:

Is an xcnt necessary here?

```
; GFX1250-LABEL: flat_atomic_fadd_f32_noret_pat:
; GFX1250:       ; %bb.0:
; GFX1250-NEXT:    s_load_b64 s[0:1], s[4:5], 0x24
; GFX1250-NEXT:    v_dual_mov_b32 v0, 0 :: v_dual_mov_b32 v1, 4.0
; GFX1250-NEXT:    global_wb scope:SCOPE_SYS
; GFX1250-NEXT:    s_wait_storecnt 0x0
; GFX1250-NEXT:    s_wait_xcnt 0x0
; GFX1250-NEXT:    s_wait_kmcnt 0x0
; GFX1250-NEXT:    flat_atomic_add_f32 v0, v1, s[0:1] scope:SCOPE_SYS
; GFX1250-NEXT:    s_wait_storecnt_dscnt 0x0
; GFX1250-NEXT:    global_inv scope:SCOPE_SYS
; GFX1250-NEXT:    s_endpgm
```

https://github.com/llvm/llvm-project/pull/164357

