[libc-commits] [libc] [libc] Add memory fence utility to the GPU utilities (PR #91756)

Joseph Huber via libc-commits libc-commits at lists.llvm.org
Fri May 10 09:59:13 PDT 2024


================
@@ -140,6 +140,12 @@ LIBC_INLINE uint32_t get_lane_size() {
   __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "workgroup");
 }
 
+/// Waits for all pending memory operations to complete in program order.
+[[clang::convergent]] LIBC_INLINE void memory_fence() {
+  __builtin_amdgcn_s_waitcnt(0);
+  __builtin_amdgcn_fence(__ATOMIC_ACQ_REL, "");
----------------
jhuber6 wrote:

This is what I was thinking myself, though I was somewhat unsure how a system- or agent-scope fence works in practice. We may want to measure the latency of an RPC call, which in practice is a write from the GPU to memory that resides on the host, over PCI. I wasn't sure whether a non-system-scope fence would ensure that the read or write had actually finished.
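
To make the measurement pattern concrete, here is a minimal host-side sketch of "write, fence, then read the clock". It is an analogy only: `mailbox`, `memory_fence_analogy`, and `timed_write` are hypothetical names that do not appear in the patch, and `std::atomic_thread_fence` stands in for the `__builtin_amdgcn_fence` call in the diff. A host fence only orders memory operations, so this cannot show whether a GPU write over PCI has completed, which is the open question above.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for a buffer shared between the GPU and the host.
static std::atomic<uint32_t> mailbox{0};

// Host analogy for the fence in the patch (assumption: acquire-release
// ordering, mirroring the __ATOMIC_ACQ_REL argument in the diff).
inline void memory_fence_analogy() {
  std::atomic_thread_fence(std::memory_order_acq_rel);
}

// Times a relaxed store followed by a fence and returns the elapsed
// nanoseconds. The fence keeps the store ordered before any memory
// operation issued after it by this thread.
inline int64_t timed_write(uint32_t value) {
  auto start = std::chrono::steady_clock::now();
  mailbox.store(value, std::memory_order_relaxed);
  memory_fence_analogy();
  auto end = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start)
      .count();
}
```

Calling `timed_write(42)` returns a non-negative duration and leaves 42 in `mailbox`. On the GPU, the fence would be the `memory_fence()` from the diff and the timestamps would come from a device clock, neither of which is shown here.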

https://github.com/llvm/llvm-project/pull/91756

