[PATCH] D37985: [AMDGPU] add LDS f32 intrinsics

Mon Sep 18 14:27:36 PDT 2017

arsenm added inline comments.

================
Comment at: include/llvm/IR/IntrinsicsAMDGPU.td:303
+class AMDGPUAtomicF32IntrinNORET : Intrinsic<[],
+    [LLVMQualPointerType<llvm_float_ty, 3>, llvm_float_ty],
+    [IntrArgMemOnly, NoCapture<0>, IntrNoReturn]
----------------
t-tye wrote:
> arsenm wrote:
> > dfukalov wrote:
> > > arsenm wrote:
> > > > Should this have an operand added for the ordering?
> > > No, these intrinsics were added on request, solely to make it possible to generate ds_{add|min|max}[_rtn]_f32 for OpenCL local memory atomics. They only work for pointers to floats located in addrspace 3.
> > That doesn't change the ordering. It also needs an operand for volatile.
> How are the memory ordering, memory scope and volatile carried through so that those fields can be set in the Machine Memory Operand? All these properties are needed to generate the correct waitcnt in the memory legalizer (see AMDGPUUsage.rst section on memory model).
Those are supposed to be handled by the intrinsic callbacks I mentioned, which still need to be implemented (at least for volatile). I'm not sure anything correctly considers the atomic scope for intrinsics that may be atomic.

The most similar case we have is for amdgcn_atomic_inc/dec.
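
Roughly what I have in mind, as a sketch only: the class name below is hypothetical, and the trailing ordering/scope/volatile operands are assumed to follow the same layout as the inc/dec intrinsics. This is the returning variant, not the definition from this patch.

```
// Hypothetical sketch, not part of D37985.
class AMDGPUAtomicF32IntrinSketch : Intrinsic<[llvm_float_ty],
    [LLVMQualPointerType<llvm_float_ty, 3>,  // pointer to float in addrspace 3
     llvm_float_ty,                          // value operand
     llvm_i32_ty,                            // ordering (assumed layout)
     llvm_i32_ty,                            // scope (assumed layout)
     llvm_i1_ty],                            // isVolatile (assumed layout)
    [IntrArgMemOnly, NoCapture<0>]
>;
```

The intrinsic callback would then read those trailing operands to fill in the memory operand.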

https://reviews.llvm.org/D37985