[PATCH] D96643: [AMDGPU] Limit memory scope for scratch, LDS and GDS
    Tony Tye via Phabricator via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Sun Feb 14 21:54:30 PST 2021
    
    
  
t-tye added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/memory-legalizer-local-agent.ll:726
 ; GFX10-WGP-NEXT:    s_waitcnt lgkmcnt(0)
 ; GFX10-WGP-NEXT:    s_waitcnt_vscnt null, 0x0
 ; GFX10-WGP-NEXT:    buffer_gl0_inv
----------------
t-tye wrote:
> rampitec wrote:
> > Looks like it is still insufficient.
> In WGP mode it is still necessary to synchronize across CUs, hence the need to wait/invalidate for VMEM when at workgroup scope. Basically, the scope limited to workgroup has to do the equivalent of agent scope.
> 
> In CU mode this is not necessary, so the s_waitcnt_vscnt/buffer_gl0_inv pair is not present, as seen in lines 738-739.
This does demonstrate, though, that simply treating WGP mode as agent scope is conservatively correct but not optimal. I will look to see if it can be improved further.
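
As a rough illustration of the rule discussed above, here is a toy model of the scope widening. It is only a sketch: `Scope`, `ExecMode`, and `effectiveVmemScope` are hypothetical names made up for this example, not the memory legalizer's actual API, and it captures only the "workgroup scope in WGP mode behaves like agent scope for VMEM" behaviour described in this comment.

```cpp
// Toy model only: these names are hypothetical and do not come from LLVM.
enum class Scope { SingleThread, Wavefront, Workgroup, Agent, System };
enum class ExecMode { CU, WGP };

// Returns the scope at which VMEM must be synchronized for a request made
// at scope S. In WGP mode a workgroup may span more than one CU, so
// workgroup scope is conservatively widened to agent scope. In CU mode the
// requested scope is used unchanged.
constexpr Scope effectiveVmemScope(ExecMode Mode, Scope S) {
  if (Mode == ExecMode::WGP && S == Scope::Workgroup)
    return Scope::Agent;
  return S;
}
```

Under this model the GFX10-WGP check lines get the vscnt wait and gl0 invalidate because the effective scope is agent, while the CU-mode lines do not. The widening is the conservative part: it is correct, but it may emit more synchronization than is strictly required.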
Repository:
  rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96643/new/
https://reviews.llvm.org/D96643
    
    