[PATCH] D96643: [AMDGPU] Limit memory scope for scratch, LDS and GDS
Tony Tye via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Feb 14 21:48:13 PST 2021
t-tye added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/memory-legalizer-local-agent.ll:726
; GFX10-WGP-NEXT: s_waitcnt lgkmcnt(0)
; GFX10-WGP-NEXT: s_waitcnt_vscnt null, 0x0
; GFX10-WGP-NEXT: buffer_gl0_inv
----------------
rampitec wrote:
> Looks like it is still insufficient.
In WGP mode it is still necessary to synchronize across CUs, hence the need to wait on and invalidate for VMEM at workgroup scope. In effect, even when the scope is limited to workgroup, the code has to do the equivalent of agent scope.
In CU mode this is not necessary, so the s_waitcnt_vscnt / buffer_gl0_inv pair is not present, as seen in lines 738-739.
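To make the rule concrete, here is a rough sketch of the decision as described above. It is only an illustrative model: the names Scope, Mode and needsVMEMSync are hypothetical and are not taken from SIMemoryLegalizer, and it assumes the scope passed in has already been limited as this patch does.

```cpp
// Illustrative model only: these names are hypothetical and are not the
// SIMemoryLegalizer API.
enum class Scope { Wavefront, Workgroup, Agent };
enum class Mode { CU, WGP };

// Returns true when a synchronizing operation at the given (already limited)
// scope must also wait on VMEM and invalidate GL0.
// Assumption: in WGP mode the waves of a workgroup may run on different CUs,
// so workgroup scope needs the same VMEM handling as agent scope.
constexpr bool needsVMEMSync(Scope S, Mode M) {
  return S == Scope::Agent || (S == Scope::Workgroup && M == Mode::WGP);
}
```

Under this model, needsVMEMSync(Scope::Workgroup, Mode::WGP) is true, which corresponds to the s_waitcnt_vscnt / buffer_gl0_inv in the GFX10-WGP checks, while needsVMEMSync(Scope::Workgroup, Mode::CU) is false, which corresponds to their absence in CU mode.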
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D96643/new/
https://reviews.llvm.org/D96643