[PATCH] D120544: [AMDGPU] Omit unnecessary waitcnt before barriers

Fri Feb 25 11:51:16 PST 2022

kerbowa added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:731
+  "true",
+  "Hardware supports backing off s_barrier if an exception occurs"
+>;
----------------
arsenm wrote:
> I don't understand what "backing off" means here
Yeah, I can't think of a great name for this. Suggestions are welcome.

The SPG talks about it in these terms, but also uses "early barrier exit".

The idea is that HW may need to "leave but not satisfy the barrier" to handle address-watch or MEMVIOL returned by in-flight memory ops.

================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1143
   if (MI.getOpcode() == AMDGPU::S_BARRIER &&
-      !ST->hasAutoWaitcntBeforeBarrier()) {
+      !ST->hasAutoWaitcntBeforeBarrier() && !ST->supportsBackOffBarrier()) {
     Wait = Wait.combined(AMDGPU::Waitcnt::allZero(ST->hasVscnt()));
----------------
arsenm wrote:
> Is this really a distinct feature if it's the same check as auto waitcnt?
It's not the same as auto waitcnt; different subtargets support each feature. The check only says that an explicit waitcnt before a barrier is never needed when either feature is present, i.e. when HW waits implicitly or can back off the barrier.

We don't actually use the auto waitcnt subtarget feature since it can be configured dynamically by HW. I was debating removing it entirely.
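
To make the combined condition easier to reason about, here is a minimal standalone sketch of the predicate. `MockSubtarget` and `needsExplicitWaitBeforeBarrier` are hypothetical names used only for illustration; they are not part of the patch, and the real code queries `GCNSubtarget` and also checks the opcode for `S_BARRIER`.

```cpp
// Hypothetical stand-in for the subtarget; only the two queries that
// appear in the patched condition are modeled here.
struct MockSubtarget {
  bool AutoWaitcntBeforeBarrier; // HW waits implicitly before s_barrier
  bool BackOffBarrier;           // HW can leave the barrier without satisfying it

  constexpr bool hasAutoWaitcntBeforeBarrier() const {
    return AutoWaitcntBeforeBarrier;
  }
  constexpr bool supportsBackOffBarrier() const { return BackOffBarrier; }
};

// Mirrors the subtarget part of the patched condition (the S_BARRIER
// opcode check is omitted): an explicit all-zero waitcnt is requested
// only when neither feature is available.
constexpr bool needsExplicitWaitBeforeBarrier(const MockSubtarget &ST) {
  return !ST.hasAutoWaitcntBeforeBarrier() && !ST.supportsBackOffBarrier();
}

// Compile-time check of the two cases that differ from the old condition's
// intent: either feature alone is enough to skip the explicit wait.
static_assert(needsExplicitWaitBeforeBarrier(MockSubtarget{false, false}),
              "no feature: explicit waitcnt is still required");
static_assert(!needsExplicitWaitBeforeBarrier(MockSubtarget{false, true}),
              "back-off barrier alone removes the explicit waitcnt");
```

The two features stay independent in this sketch, which matches the point above: they are set on different subtargets and only happen to feed the same check.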

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120544/new/

https://reviews.llvm.org/D120544