[libclc] r315228 - Make ptx barrier work irrespective of the cl_mem_fence_flags

Jeroen Ketema via cfe-commits cfe-commits at lists.llvm.org
Mon Oct 9 11:36:49 PDT 2017


Author: jketema
Date: Mon Oct  9 11:36:48 2017
New Revision: 315228

URL: http://llvm.org/viewvc/llvm-project?rev=315228&view=rev
Log:
Make ptx barrier work irrespective of the cl_mem_fence_flags

Calling __syncthreads() unconditionally generates a "bar.sync 0"
instruction, which not only causes the threads to wait, but also acts
as a memory fence, as required by
OpenCL. The fence does not differentiate between local and global
memory. Unfortunately, there is no similar instruction which does
not include a memory fence. Hence, we cannot optimize the case
where neither CLK_LOCAL_MEM_FENCE nor CLK_GLOBAL_MEM_FENCE is
passed.
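
For illustration only, the behavioural difference can be sketched as a
host-side model in plain C. This is not the libclc source: the flag
values below are assumed, and fake_syncthreads() is a hypothetical
counting stub standing in for the real __syncthreads(), which only
exists on the device.

```c
/* Host-side model of the change; nothing here runs on a GPU. */

typedef unsigned int cl_mem_fence_flags;

/* Assumed flag values, chosen for illustration. */
#define CLK_LOCAL_MEM_FENCE  (1u << 0)
#define CLK_GLOBAL_MEM_FENCE (1u << 1)

/* Hypothetical stand-in for __syncthreads(): it only counts calls. */
static int sync_calls = 0;
static void fake_syncthreads(void) { sync_calls++; }

/* Before r315228: synchronize only if the local fence flag is set. */
static void barrier_before(cl_mem_fence_flags flags) {
  if (flags & CLK_LOCAL_MEM_FENCE) {
    fake_syncthreads();
  }
}

/* After r315228: synchronize regardless of the flags. */
static void barrier_after(cl_mem_fence_flags flags) {
  (void)flags; /* the flags no longer influence the result */
  fake_syncthreads();
}

/* Returns how many times the stub ran during one barrier call. */
static int syncs_issued(void (*barrier_fn)(cl_mem_fence_flags),
                        cl_mem_fence_flags flags) {
  sync_calls = 0;
  barrier_fn(flags);
  return sync_calls;
}
```

In this model, syncs_issued(barrier_before, CLK_GLOBAL_MEM_FENCE) is 0,
meaning the work-items would not have waited at all, while
syncs_issued(barrier_after, CLK_GLOBAL_MEM_FENCE) is 1.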

Modified:
    libclc/trunk/ptx-nvidiacl/lib/synchronization/barrier.cl

Modified: libclc/trunk/ptx-nvidiacl/lib/synchronization/barrier.cl
URL: http://llvm.org/viewvc/llvm-project/libclc/trunk/ptx-nvidiacl/lib/synchronization/barrier.cl?rev=315228&r1=315227&r2=315228&view=diff
==============================================================================
--- libclc/trunk/ptx-nvidiacl/lib/synchronization/barrier.cl (original)
+++ libclc/trunk/ptx-nvidiacl/lib/synchronization/barrier.cl Mon Oct  9 11:36:48 2017
@@ -1,8 +1,6 @@
 #include <clc/clc.h>
 
 _CLC_DEF void barrier(cl_mem_fence_flags flags) {
-  if (flags & CLK_LOCAL_MEM_FENCE) {
-    __syncthreads();
-  }
+  __syncthreads();
 }
 



