[PATCH] D96643: [AMDGPU] Limit memory scope for scratch, LDS and GDS

Tony Tye via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Feb 14 21:54:30 PST 2021


t-tye added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/memory-legalizer-local-agent.ll:726
 ; GFX10-WGP-NEXT:    s_waitcnt lgkmcnt(0)
 ; GFX10-WGP-NEXT:    s_waitcnt_vscnt null, 0x0
 ; GFX10-WGP-NEXT:    buffer_gl0_inv
----------------
t-tye wrote:
> rampitec wrote:
> > Looks like it is still insufficient.
> In WGP mode it is still necessary to synchronize across CUs, hence the need to wait/invalidate for VMEM when at workgroup scope. Basically, the scope limited to workgroup has to do the equivalent of agent scope.
> 
> In CU mode this is not necessary, so the s_waitcnt_vscnt / buffer_gl0_inv pair is not present, as seen in lines 738-739.
This does demonstrate, though, that simply treating workgroup scope in WGP mode as agent scope is conservatively correct but not optimal. I will look to see if it can be improved further.
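
To make the reasoning above concrete, here is a minimal standalone sketch of the two decisions being discussed: first limiting the requested scope to what the address space can observe, then deciding whether cross-CU synchronization is still needed. This is not the code from the patch. The enum and function names (Scope, AddrSpace, limitScope, needsCrossCUSync) are hypothetical, and the per-address-space limits are an assumption based on the patch title (scratch limited to a single thread, LDS to the workgroup, GDS to the agent).

```cpp
#include <algorithm>

// Hypothetical model of synchronization scopes, ordered narrowest to widest.
enum class Scope { SingleThread, Wavefront, Workgroup, Agent, System };

// Hypothetical model of the address spaces mentioned in the patch title.
enum class AddrSpace { Scratch, LDS, GDS, Global };

// Assumption: clamp the requested scope to the widest scope at which the
// address space is actually shared.
Scope limitScope(AddrSpace AS, Scope Requested) {
  Scope Max = Scope::System;
  switch (AS) {
  case AddrSpace::Scratch: Max = Scope::SingleThread; break;
  case AddrSpace::LDS:     Max = Scope::Workgroup;    break;
  case AddrSpace::GDS:     Max = Scope::Agent;        break;
  case AddrSpace::Global:  Max = Scope::System;       break;
  }
  return std::min(Requested, Max);
}

// Following the comment above: in WGP mode a workgroup can span CUs, so
// workgroup scope conservatively does the equivalent of agent scope. In CU
// mode, workgroup scope needs no cross-CU wait/invalidate.
bool needsCrossCUSync(Scope S, bool WGPMode) {
  if (S >= Scope::Agent)
    return true;
  if (S == Scope::Workgroup)
    return WGPMode;
  return false;
}
```

Under these assumptions, an agent-scope operation on LDS is limited to workgroup scope, yet needsCrossCUSync still returns true in WGP mode and false in CU mode, which matches the difference between the GFX10-WGP and GFX10-CU check lines.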


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96643/new/

https://reviews.llvm.org/D96643


