[PATCH] D96643: [AMDGPU] Limit memory scope for scratch, LDS and GDS
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 15 11:00:17 PST 2021
rampitec added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/memory-legalizer-local-agent.ll:726
; GFX10-WGP-NEXT: s_waitcnt lgkmcnt(0)
; GFX10-WGP-NEXT: s_waitcnt_vscnt null, 0x0
; GFX10-WGP-NEXT: buffer_gl0_inv
----------------
t-tye wrote:
> t-tye wrote:
> > rampitec wrote:
> > > Looks like it is still insufficient.
> > In WGP mode it is still necessary to synchronize across CUs, hence the need to wait/invalidate for VMEM when at workgroup scope. Basically, the scope limited to workgroup has to do the equivalent of agent scope.
> >
> > In CU mode this is not necessary, so the waitcnt vscnt / gl0 invalidate is not present, as seen in lines 738-739.
> But this does demonstrate that simply treating WGP mode as agent scope is conservatively correct but not optimal. I will look to see if it can be improved further.
I do not see this in the memory model for GFX10. Nothing there says that the vmem counter is needed in WGP mode.
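For reference, the rule as t-tye states it above could be sketched roughly as follows. This is only an illustration of the quoted explanation, not code from the patch or from the memory legalizer; all names (ExecMode, SyncScope, needsVMemSync) are hypothetical. It deliberately ignores the address space, which is the point in question for this LDS-only test.

```cpp
// Hypothetical sketch; none of these names come from LLVM.
enum class ExecMode { CU, WGP };
enum class SyncScope { Wavefront, Workgroup };

// Returns true when, per the explanation quoted above, the VMEM wait
// (s_waitcnt_vscnt) and the GL0 invalidate (buffer_gl0_inv) would be emitted.
// Assumption: address space is not considered here at all.
constexpr bool needsVMemSync(ExecMode Mode, SyncScope Scope) {
  // In WGP mode a workgroup can span two CUs, so workgroup scope has to
  // synchronize across CUs. In CU mode the workgroup stays on one CU.
  return Scope == SyncScope::Workgroup && Mode == ExecMode::WGP;
}
```

Under this reading, needsVMemSync(ExecMode::WGP, SyncScope::Workgroup) is true and the CU-mode equivalent is false, which matches the difference between the GFX10-WGP and GFX10-CU check lines in the test.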
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D96643/new/
https://reviews.llvm.org/D96643