[PATCH] D89618: [AMDGPU] Optimize waitcnt insertion for flat memory operations

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 20 10:40:44 PDT 2020


rampitec added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1207
+    unsigned AS = Memop->getAddrSpace();
+    if (AS != AMDGPUAS::LOCAL_ADDRESS)
+      return true;
----------------
t-tye wrote:
> rampitec wrote:
> > It should check for local or flat here.
> We do not want to test for local and flat, as this method is checking for VMEM, not LDS. Instead, we want to check for all the address spaces that are legal for a flat operation and involve VMEM. Looking at the enumeration of address spaces, there are quite a few that are valid for flat and involve VMEM: for example, global, flat, constant, private, the ones involving buffers, etc. Since flat only supports LDS, FLAT, and the address spaces that involve VMEM, the clearest test here is to find address spaces that are not LDS. They are the ones that may be VMEM.
> 
> I considered asserting if any address space was found that was not legal for a flat operation. For example, region (GDS) is not valid. But is there an existing predicate to answer that question?
Hm... Right, it should return true for generic here. It seems the test (waitcnt.mir) itself does not test what is expected, so we need a test with a real flat pointer, a load, and a full wait.
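
To make the reasoning above concrete, here is a minimal standalone sketch of the "anything that is not LDS may be VMEM" test. It is not the code from the patch: the `AS` enum is a stand-in for AMDGPUAS with values assumed for illustration, `MemOperand`, `mayTouchVMEM` and `isLegalForFlat` are hypothetical names, and treating an instruction without memory operands as possibly VMEM is an assumption made to stay conservative.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the AMDGPUAS enumeration; values are assumed for illustration.
namespace AS {
enum : unsigned { FLAT = 0, GLOBAL = 1, REGION = 2, LOCAL = 3, CONSTANT = 4, PRIVATE = 5 };
}

// Hypothetical stand-in for a machine memory operand.
struct MemOperand {
  unsigned AddrSpace;
};

// Hypothetical legality predicate: region (GDS) is not valid for a flat operation.
bool isLegalForFlat(unsigned AddrSpace) { return AddrSpace != AS::REGION; }

// Returns true if a flat operation with these memory operands may access VMEM.
bool mayTouchVMEM(const std::vector<MemOperand> &Memops) {
  // Assumption: with no memory operands nothing is known, so be conservative.
  if (Memops.empty())
    return true;
  for (const MemOperand &M : Memops) {
    // The assert discussed in the thread: reject address spaces illegal for flat.
    assert(isLegalForFlat(M.AddrSpace) && "address space not legal for flat");
    // Every legal address space other than LDS may resolve to VMEM.
    // This includes generic (flat) pointers.
    if (M.AddrSpace != AS::LOCAL)
      return true;
  }
  // All memory operands are known to be LDS.
  return false;
}
```

With this shape, a generic (flat) pointer yields true, which is the case the requested test with a real flat pointer would need to cover.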


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89618/new/

https://reviews.llvm.org/D89618


