[libc-commits] [libc] [libc] Add memory fence utility to the GPU utilities (PR #91756)
Joseph Huber via libc-commits
libc-commits at lists.llvm.org
Fri May 10 09:59:13 PDT 2024
================
@@ -140,6 +140,12 @@ LIBC_INLINE uint32_t get_lane_size() {
__builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "workgroup");
}
+/// Waits for all pending memory operations to complete in program order.
+[[clang::convergent]] LIBC_INLINE void memory_fence() {
+ __builtin_amdgcn_s_waitcnt(0);
+ __builtin_amdgcn_fence(__ATOMIC_ACQ_REL, "");
----------------
jhuber6 wrote:
This is what I was thinking as well, though I was unsure how a system- or agent-scope fence works in practice. We may want to measure the latency of an RPC call, which in practice is a write from the GPU to memory that resides on the host over PCI. I wasn't sure whether a non-system-scope fence would ensure that the read / write had finished.
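
To make the scope question concrete, here is a minimal sketch of the two fences being compared. It is not code from the patch: `fence_agent`, `fence_system`, and `publish` are hypothetical names, and the host fallback exists only so the sketch compiles off-target. On AMDGPU the empty scope string passed to `__builtin_amdgcn_fence` selects system scope, while `"agent"` restricts the fence to the device. The sketch only shows ordering between a payload write and a flag write; it does not by itself answer whether the write has reached host memory by the time the fence returns.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: acquire-release fence at agent (device-wide) scope.
inline void fence_agent() {
#if defined(__AMDGPU__)
  __builtin_amdgcn_fence(__ATOMIC_ACQ_REL, "agent");
#else
  // Host fallback: standard C++ has no notion of synchronization scopes.
  std::atomic_thread_fence(std::memory_order_acq_rel);
#endif
}

// Hypothetical helper: acquire-release fence at system scope, which also
// covers other agents such as the host CPU.
inline void fence_system() {
#if defined(__AMDGPU__)
  __builtin_amdgcn_fence(__ATOMIC_ACQ_REL, ""); // empty string is system scope
#else
  std::atomic_thread_fence(std::memory_order_acq_rel);
#endif
}

// Hypothetical RPC-style publish: write the payload, fence, then raise a
// flag that the other side polls. The fence orders the payload write
// before the flag write for observers within the fence's scope.
inline void publish(std::uint32_t *payload, std::atomic<std::uint32_t> *flag,
                    std::uint32_t value) {
  *payload = value;
  fence_system();
  flag->store(1u, std::memory_order_relaxed);
}
```

If the polling side is the host, swapping `fence_system` for `fence_agent` in `publish` would no longer order the two writes from the host's point of view, which is the case I was unsure about.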
https://github.com/llvm/llvm-project/pull/91756