[llvm] [AMDGPU][SIInsertWaitcnts] Do not add s_waitcnt when the counters are known to be 0 already (PR #72830)
Pierre van Houtryve via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 20 06:43:01 PST 2023
Juan Manuel MARTINEZ CAAMAÑO <juamarti at amd.com>, pvanhout
<pierre.vanhoutryve at amd.com>
Message-ID:
In-Reply-To: <llvm.org/llvm/llvm-project/pull/72830 at github.com>
================
@@ -45,6 +45,7 @@ define void @back_off_barrier_no_fence(ptr %in, ptr %out) #0 {
; GFX11-BACKOFF-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; GFX11-BACKOFF-NEXT: flat_load_b32 v0, v[0:1]
; GFX11-BACKOFF-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX11-BACKOFF-NEXT: s_waitcnt_vscnt null, 0x0
----------------
Pierre-vh wrote:
This extra `s_waitcnt_vscnt` is due to `setNonKernelFunctionInitialState`.
Without this patch, the counters start at:
```
*** Block0 ***
VM_CNT(0):
LGKM_CNT(0):
EXP_CNT(0):
VS_CNT(0):
```
With the patch, they start at:
```
*** Block0 ***
VM_CNT(63):
LGKM_CNT(63):
EXP_CNT(7):
VS_CNT(63):
```
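To make the effect of the two entry states concrete, here is a minimal sketch of my reading of the dumps: a wait can only be skipped when the tracked pending count for that counter is already 0. This is a toy model, not the actual `SIInsertWaitcnts` code; `CounterState`, `knownZeroEntryState`, `conservativeEntryState`, `needsWaitForZero`, and `applyWaitForZero` are hypothetical names, and the maximum values are simply copied from the second dump.

```cpp
#include <array>
#include <cstddef>

// Hypothetical toy model; these are not LLVM's SIInsertWaitcnts types.
enum Counter : std::size_t { VM_CNT, LGKM_CNT, EXP_CNT, VS_CNT, NUM_COUNTERS };

struct CounterState {
  // Upper bound on operations that may still be outstanding, per counter.
  std::array<unsigned, NUM_COUNTERS> Pending{};
};

// Values copied from the "with the patch" dump: 63, 63, 7, 63.
constexpr std::array<unsigned, NUM_COUNTERS> MaxPending = {63, 63, 7, 63};

// Entry state as in the first dump: every counter is known to be 0.
CounterState knownZeroEntryState() { return CounterState{}; }

// Entry state as in the second dump: assume the maximum may be pending.
CounterState conservativeEntryState() {
  CounterState S;
  S.Pending = MaxPending;
  return S;
}

// A wait for counter C to reach 0 is redundant only if nothing can be pending.
bool needsWaitForZero(const CounterState &S, Counter C) {
  return S.Pending[C] != 0;
}

// Models emitting the wait: afterwards the counter is known to be 0.
void applyWaitForZero(CounterState &S, Counter C) { S.Pending[C] = 0; }
```

In this model, `needsWaitForZero(knownZeroEntryState(), VS_CNT)` is false, while the same query on `conservativeEntryState()` is true until a wait on `VS_CNT` has been applied, which would match the extra `s_waitcnt_vscnt null, 0x0` in the test.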
I didn't follow the full discussion around the patch, so I don't know the context behind this change. I will try to dig deeper this week and provide more meaningful feedback.
https://github.com/llvm/llvm-project/pull/72830