[PATCH] D120544: [AMDGPU] Omit unnecessary waitcnt before barriers

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 25 11:58:51 PST 2022


rampitec added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1143
   if (MI.getOpcode() == AMDGPU::S_BARRIER &&
-      !ST->hasAutoWaitcntBeforeBarrier()) {
+      !ST->hasAutoWaitcntBeforeBarrier() && !ST->supportsBackOffBarrier()) {
     Wait = Wait.combined(AMDGPU::Waitcnt::allZero(ST->hasVscnt()));
----------------
kerbowa wrote:
> arsenm wrote:
> > Is this really a distinct feature if it's the same check as auto waitcnt?
> It's not the same as auto waitcnt. There are distinct subtargets that support each feature. This check is just saying we don't need an explicit waitcnt before barriers under any circumstances if there is an implicit wait by HW.
> 
> We don't actually use the auto waitcnt subtarget feature since it can be configured dynamically by HW. I was debating removing it entirely.
But then is a waitcnt still needed after the barrier? I.e., this does not mean that the barrier waits for all outstanding memory operations, right?
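
For reference, the patched condition in the hunk above can be modeled as a small predicate. This is a minimal sketch, assuming a hypothetical `SubtargetModel` struct with two boolean fields standing in for `hasAutoWaitcntBeforeBarrier()` and `supportsBackOffBarrier()`; it is not the real subtarget class, and it only covers the decision to insert a wait before the barrier, not whether one is needed after it.

```cpp
// Hypothetical stand-in for the two subtarget queries named in the diff.
// This is NOT the real AMDGPU subtarget class, only an illustration.
struct SubtargetModel {
  bool AutoWaitcntBeforeBarrier; // models hasAutoWaitcntBeforeBarrier()
  bool BackOffBarrier;           // models supportsBackOffBarrier()
};

// Mirrors the patched condition: an explicit all-zero waitcnt is forced
// before S_BARRIER only when the subtarget has neither feature.
constexpr bool needsExplicitWaitBeforeBarrier(const SubtargetModel &ST) {
  return !ST.AutoWaitcntBeforeBarrier && !ST.BackOffBarrier;
}
```

With this model, a subtarget that sets either field skips the forced wait before the barrier; only a subtarget with both fields false still gets it.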


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120544/new/

https://reviews.llvm.org/D120544


