[PATCH] D39060: AMDGPU: Lower buffer store and atomic intrinsics manually

Mon Oct 23 20:44:46 PDT 2017

t-tye added inline comments.

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4566-4568
+      MachineMemOperand::MOStore |
+      MachineMemOperand::MODereferenceable |
+      MachineMemOperand::MOVolatile,
----------------
nhaehnle wrote:
> arsenm wrote:
> > nhaehnle wrote:
> > > Unfortunately, there's not a lot of documentation on MOVolatile, but I suspect this should not be set at least when GLC == SLC == 0. And I image that that would fix the issue with D39012 as well... (which means the order of patches should be reversed).
> > You should not be setting MOVolatile out of nowhere. Adding that defeats what you are trying to accomplish. I also think we aren't setting volatile directly to GLC and the memory legalizer pass is supposed to set GLC.
> You're right, MOVolatile should be unnecessary even with GLC. I was thinking of GLSL writes to coherent buffer objects, but those still need memoryBarrier()s for guaranteed ordering. So I agree, buffer stores should never be MOVolatile.
MMO now have atomic memory ordering and memory scope that convey how atomics are required to be coherent. The memory legalizer pass uses this information to set glc bit, generate appropriate watcnt, and cache invalidate instructions.

These are separate from the volatile property which has a different purpose.

So if the goal is to request atomic coherence (release/acquire memory model semantics) shouldn't the MMO memory ordering/scope be set correctly?

https://reviews.llvm.org/D39060