[PATCH] D19203: AMDGPU/SI: Add llvm.amdgcn.s.waitcnt.all intrinsic

Nicolai Hähnle via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 18 08:45:00 PDT 2016


nhaehnle added a comment.

Yes, we need consistency across all shader invocations, which can span every CU and SE on the chip. GLSL graphics shaders have no real notion of workgroups. Basically, the instruction needs to ensure that all past memory writes by the shader (actually, only the 'coherent' and 'volatile' ones) are visible to all other shaders. I'm not sure what OpenCL needs.

With this patch, the idea is to implement this by setting glc=1 on the coherent/volatile writes and following them with a wait. I believe (but have not tried) that an alternative would be to always use glc=0, then wait and explicitly request an L1 cache flush at the memory barrier.

Tom, do you want the numeric counts as input, or just bits that indicate whether to wait for vm/exp/lgkm?
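
To make the two options concrete, here is a rough sketch of how each kind of operand could map onto the s_waitcnt immediate. It is not code from the patch. It assumes the SI-era field layout (vmcnt in bits [3:0], expcnt in bits [6:4], lgkmcnt in bits [11:8]), and the names WaitFlags, encodeWaitcnt and encodeWaitcntFromFlags are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits for the "just bits" variant of the operand.
enum WaitFlags : uint32_t {
  WAIT_VM   = 1u << 0,
  WAIT_EXP  = 1u << 1,
  WAIT_LGKM = 1u << 2,
};

// Assumed s_waitcnt immediate layout (SI-era); not taken from the patch.
constexpr uint32_t VM_SHIFT = 0,   VM_MAX = 0xF;   // vmcnt,   bits [3:0]
constexpr uint32_t EXP_SHIFT = 4,  EXP_MAX = 0x7;  // expcnt,  bits [6:4]
constexpr uint32_t LGKM_SHIFT = 8, LGKM_MAX = 0xF; // lgkmcnt, bits [11:8]

// Option A: the caller passes numeric counts, one per counter.
inline uint32_t encodeWaitcnt(uint32_t Vm, uint32_t Exp, uint32_t Lgkm) {
  // Each count must fit in its (assumed) field width.
  assert(Vm <= VM_MAX && Exp <= EXP_MAX && Lgkm <= LGKM_MAX);
  return (Vm << VM_SHIFT) | (Exp << EXP_SHIFT) | (Lgkm << LGKM_SHIFT);
}

// Option B: the caller passes only flags. A selected counter is waited
// down to zero; an unselected one is left at its maximum, i.e. no wait.
inline uint32_t encodeWaitcntFromFlags(uint32_t Flags) {
  return encodeWaitcnt((Flags & WAIT_VM) ? 0 : VM_MAX,
                       (Flags & WAIT_EXP) ? 0 : EXP_MAX,
                       (Flags & WAIT_LGKM) ? 0 : LGKM_MAX);
}
```

With this layout, waiting on everything encodes as 0, which is what a "wait for all" barrier would want. The flags variant cannot express a partial wait such as vmcnt(1), whereas the numeric variant exposes the field widths to the frontend.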


http://reviews.llvm.org/D19203




