[PATCH] D96643: [AMDGPU] Limit memory scope for scratch, LDS and GDS

Tony Tye via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Feb 14 21:48:13 PST 2021


t-tye added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/memory-legalizer-local-agent.ll:726
 ; GFX10-WGP-NEXT:    s_waitcnt lgkmcnt(0)
 ; GFX10-WGP-NEXT:    s_waitcnt_vscnt null, 0x0
 ; GFX10-WGP-NEXT:    buffer_gl0_inv
----------------
rampitec wrote:
> Looks like it is still insufficient.
In WGP mode the waves of a workgroup can run on either CU of the WGP, so it is still necessary to synchronize across CUs; hence the need to wait for and invalidate VMEM at workgroup scope. In effect, a scope that has been limited to workgroup still has to do the equivalent of agent scope.

In CU mode this is not necessary, so the `s_waitcnt_vscnt` and `buffer_gl0_inv` are not present, as seen in lines 738-739.
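
To make the rule above concrete, here is a minimal sketch of the decision for a workgroup-scope synchronization. It is not the code in the patch: `WorkgroupSync` and `workgroupSyncSketch` are hypothetical names used only for illustration, and the sketch models nothing beyond the two VMEM-related instructions discussed in this comment.

```cpp
// Hypothetical model of what a workgroup-scope sync needs for VMEM on GFX10.
// The names here are illustrative and do not appear in LLVM.
struct WorkgroupSync {
  bool waitVscnt;     // emit "s_waitcnt_vscnt null, 0x0"
  bool invalidateGL0; // emit "buffer_gl0_inv"
};

// cuMode == true  : the workgroup stays on one CU, so no cross-CU VMEM sync.
// cuMode == false : WGP mode, the workgroup may span CUs, so VMEM must be
//                   waited on and GL0 invalidated even at workgroup scope.
constexpr WorkgroupSync workgroupSyncSketch(bool cuMode) {
  return cuMode ? WorkgroupSync{false, false} : WorkgroupSync{true, true};
}

// Compile-time checks mirroring the two test prefixes discussed above.
static_assert(workgroupSyncSketch(false).waitVscnt, "WGP mode waits on VMEM stores");
static_assert(workgroupSyncSketch(false).invalidateGL0, "WGP mode invalidates GL0");
static_assert(!workgroupSyncSketch(true).waitVscnt, "CU mode needs no vscnt wait");
static_assert(!workgroupSyncSketch(true).invalidateGL0, "CU mode needs no GL0 invalidate");
```

Under this model, the GFX10-WGP check lines correspond to `workgroupSyncSketch(false)` and the CU-mode check lines to `workgroupSyncSketch(true)`.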


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96643/new/

https://reviews.llvm.org/D96643


