[PATCH] D89618: [AMDGPU] Optimize waitcnt insertion for flat memory operations

Tony Tye via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 19 18:48:50 PDT 2020


t-tye marked an inline comment as not done.
t-tye added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1207
+    unsigned AS = Memop->getAddrSpace();
+    if (AS != AMDGPUAS::LOCAL_ADDRESS)
+      return true;
----------------
rampitec wrote:
> It should check for local or flat here.
We do not want to test for local and flat as this method is checking to VMEM not LDS. Instead, we want to check for all the address spaces that are legal for a flat operation and involve VMEM. Looking at the enumeration of address spaces there are quite a few that are valid for flat that involve VMEM. Forxample, global, flat, contant, private, the ones involving buffers, etc. Since flat only supports LDS, FLAT, and the address spaces that involve VMEM the clearest test here is to find address spaces that are not LDS. They are the ones that may be VMEM.

I considered asserting if any address space was found that was not legal for a flat operation. For example region (GDS) is not valid. But is there an existing predicate to answer that question?


================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1248
+
+    if (TII->usesVM_CNT(Inst) && mayAccessVMEMThroughFlat(Inst)) {
+      ++FlatASCount;
----------------
rampitec wrote:
> nhaehnle wrote:
> > arsenm wrote:
> > > When is usesVM_CNT false and isFLAT true?
> > I'd like to know as well. It may be better to make this an `assert(TII->usesVM_CNT(Inst))` instead.
> Second to that, turn the check into assert. There are no such instructions at least so far.
This was in the original code but I agree it makes little sense. So moved it as an assert into mayAccessVMEMThroughFlat().


================
Comment at: llvm/test/CodeGen/AMDGPU/waitcnt.mir:66
 # CHECK: FLAT_LOAD_DWORD
-# GFX89: S_WAITCNT 112
+# GFX89: S_WAITCNT 3952
 # CHECK: FLAT_LOAD_DWORDX4
----------------
rampitec wrote:
> That one was not supposed to change? The pointer is flat here.
Yes. Previously it was "s_waitcnt vmcnt(0) lgkmcnt(0)". Now it is "s_waitcnt vmcnt(0)" as the address space of global16 is 1 which is GLOBAL. Therefore there is no need to wait on LGKM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89618/new/

https://reviews.llvm.org/D89618



More information about the llvm-commits mailing list