[libclc] r315235 - Implement mem_fence on ptx

Jeroen Ketema via cfe-commits cfe-commits at lists.llvm.org
Mon Oct 9 12:43:04 PDT 2017


Author: jketema
Date: Mon Oct  9 12:43:04 2017
New Revision: 315235

URL: http://llvm.org/viewvc/llvm-project?rev=315235&view=rev
Log:
Implement mem_fence on ptx

PTX does not differentiate between read and write fences. Hence, these are
lowered to a mem_fence call. The mem_fence function compiles to the
“membar.cta” instruction, which commits all outstanding reads and writes
of a thread such that these become visible to all other threads in the same
CTA (i.e., work-group). The instruction does not differentiate between
global and local memory. Hence, the flags parameter is ignored, except
for deciding whether a “membar.cta” instruction should be issued at all.

Reviewed-by: Jan Vesely <jan.vesely at rutgers.edu>

Added:
    libclc/trunk/ptx-nvidiacl/lib/mem_fence/
    libclc/trunk/ptx-nvidiacl/lib/mem_fence/fence.cl
Modified:
    libclc/trunk/ptx-nvidiacl/lib/SOURCES

Modified: libclc/trunk/ptx-nvidiacl/lib/SOURCES
URL: http://llvm.org/viewvc/llvm-project/libclc/trunk/ptx-nvidiacl/lib/SOURCES?rev=315235&r1=315234&r2=315235&view=diff
==============================================================================
--- libclc/trunk/ptx-nvidiacl/lib/SOURCES (original)
+++ libclc/trunk/ptx-nvidiacl/lib/SOURCES Mon Oct  9 12:43:04 2017
@@ -1,3 +1,4 @@
+mem_fence/fence.cl
 synchronization/barrier.cl
 workitem/get_global_id.cl
 workitem/get_group_id.cl

Added: libclc/trunk/ptx-nvidiacl/lib/mem_fence/fence.cl
URL: http://llvm.org/viewvc/llvm-project/libclc/trunk/ptx-nvidiacl/lib/mem_fence/fence.cl?rev=315235&view=auto
==============================================================================
--- libclc/trunk/ptx-nvidiacl/lib/mem_fence/fence.cl (added)
+++ libclc/trunk/ptx-nvidiacl/lib/mem_fence/fence.cl Mon Oct  9 12:43:04 2017
@@ -0,0 +1,15 @@
+#include <clc/clc.h>
+
+_CLC_DEF void mem_fence(cl_mem_fence_flags flags) {
+   if (flags & (CLK_GLOBAL_MEM_FENCE | CLK_LOCAL_MEM_FENCE))
+     __nvvm_membar_cta();
+}
+
+// We do not have separate mechanism for read and write fences.
+_CLC_DEF void read_mem_fence(cl_mem_fence_flags flags) {
+  mem_fence(flags);
+}
+
+_CLC_DEF void write_mem_fence(cl_mem_fence_flags flags) {
+  mem_fence(flags);
+}
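
The flag handling described in the log can be sketched as a small host-side
model in plain C. This is only an illustration, not the libclc code itself:
the flag values, the counter, and every model_* name are assumptions made for
the sketch, and model_membar_cta() merely stands in for the
__nvvm_membar_cta() builtin by counting how often it would be issued.

```c
/* Host-side model only. The typedef and flag values below are stand-ins
 * chosen for this sketch; they are not taken from the libclc headers. */
typedef unsigned int cl_mem_fence_flags;
#define CLK_LOCAL_MEM_FENCE  0x1u
#define CLK_GLOBAL_MEM_FENCE 0x2u

/* Number of times the modeled fence instruction has been "issued". */
static unsigned membar_cta_issued = 0;

/* Hypothetical stand-in for the __nvvm_membar_cta() builtin. */
static void model_membar_cta(void) {
    membar_cta_issued++;
}

/* Mirrors the patch: the flags only decide whether a fence is issued,
 * not which address space it applies to. */
static void model_mem_fence(cl_mem_fence_flags flags) {
    if (flags & (CLK_GLOBAL_MEM_FENCE | CLK_LOCAL_MEM_FENCE))
        model_membar_cta();
}

/* Read and write fences both forward to the full fence. */
static void model_read_mem_fence(cl_mem_fence_flags flags) {
    model_mem_fence(flags);
}

static void model_write_mem_fence(cl_mem_fence_flags flags) {
    model_mem_fence(flags);
}
```

In this model, a call with flags equal to 0 issues nothing, while a local
flag, a global flag, or both together each issue exactly one fence per call.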



