[libclc] r315228 - Make ptx barrier work irrespective of the cl_mem_fence_flags
Jeroen Ketema via cfe-commits
cfe-commits at lists.llvm.org
Mon Oct 9 11:36:49 PDT 2017
Author: jketema
Date: Mon Oct 9 11:36:48 2017
New Revision: 315228
URL: http://llvm.org/viewvc/llvm-project?rev=315228&view=rev
Log:
Make ptx barrier work irrespective of the cl_mem_fence_flags
This generates a "bar.sync 0" instruction, which not only causes the
threads to wait, but also acts as a memory fence, as required by
OpenCL. The fence does not differentiate between local and global
memory. Unfortunately, there is no similar instruction which does
not include a memory fence. Hence, we cannot optimize the case
where neither CLK_LOCAL_MEM_FENCE nor CLK_GLOBAL_MEM_FENCE is
passed.
Modified:
libclc/trunk/ptx-nvidiacl/lib/synchronization/barrier.cl
Modified: libclc/trunk/ptx-nvidiacl/lib/synchronization/barrier.cl
URL: http://llvm.org/viewvc/llvm-project/libclc/trunk/ptx-nvidiacl/lib/synchronization/barrier.cl?rev=315228&r1=315227&r2=315228&view=diff
==============================================================================
--- libclc/trunk/ptx-nvidiacl/lib/synchronization/barrier.cl (original)
+++ libclc/trunk/ptx-nvidiacl/lib/synchronization/barrier.cl Mon Oct 9 11:36:48 2017
@@ -1,8 +1,6 @@
 #include <clc/clc.h>
 _CLC_DEF void barrier(cl_mem_fence_flags flags) {
-  if (flags & CLK_LOCAL_MEM_FENCE) {
-    __syncthreads();
-  }
+  __syncthreads();
 }
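
For illustration only (not part of the commit): a minimal sketch of a kernel that depends on the behavior described in the log message. The kernel name and its buffer layout are hypothetical, and it assumes a one-dimensional launch with no global offset whose global size is a multiple of the work-group size. With the old code, barrier(CLK_GLOBAL_MEM_FENCE) would not have emitted any synchronization on ptx, so the reads and writes below could race.

```opencl
// Hypothetical example kernel: each work-group reverses its own
// slice of `data` in place, using only global memory.
__kernel void reverse_in_group(__global int *data) {
    size_t lid  = get_local_id(0);
    size_t lsz  = get_local_size(0);
    size_t base = get_group_id(0) * lsz;  // start of this group's slice

    // Every work-item reads its element first ...
    int v = data[base + lid];

    // ... and the whole group must finish reading before anyone writes.
    // Only CLK_GLOBAL_MEM_FENCE is passed; CLK_LOCAL_MEM_FENCE is not.
    barrier(CLK_GLOBAL_MEM_FENCE);

    // Write the value to the mirrored position within the slice.
    data[base + (lsz - 1 - lid)] = v;
}
```

Because barrier() only synchronizes work-items within one work-group, the sketch keeps each group confined to its own slice of the buffer.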