[Libclc-dev] [PATCH] Implement mem_fence on ptx
Jan Vesely via Libclc-dev
libclc-dev at lists.llvm.org
Mon Oct 9 12:36:59 PDT 2017
On Mon, 2017-10-09 at 21:21 +0200, Jeroen Ketema via Libclc-dev wrote:
> Updated version of the patch, which does not emit a fence if neither
> CLK_GLOBAL_MEM_FENCE nor CLK_LOCAL_MEM_FENCE is passed via the flags
> parameter.
Can you include the explanation/description from v1 in the commit
message?
Reviewed-by: Jan Vesely <jan.vesely at rutgers.edu>
Jan
>
> Index: ptx-nvidiacl/lib/SOURCES
> ===================================================================
> --- ptx-nvidiacl/lib/SOURCES (revision 315193)
> +++ ptx-nvidiacl/lib/SOURCES (working copy)
> @@ -1,3 +1,4 @@
> +mem_fence/fence.cl
> synchronization/barrier.cl
> workitem/get_global_id.cl
> workitem/get_group_id.cl
> Index: ptx-nvidiacl/lib/mem_fence/fence.cl
> ===================================================================
> --- ptx-nvidiacl/lib/mem_fence/fence.cl (nonexistent)
> +++ ptx-nvidiacl/lib/mem_fence/fence.cl (working copy)
> @@ -0,0 +1,15 @@
> +#include <clc/clc.h>
> +
> +_CLC_DEF void mem_fence(cl_mem_fence_flags flags) {
> +  if (flags & (CLK_GLOBAL_MEM_FENCE | CLK_LOCAL_MEM_FENCE))
> +    __nvvm_membar_cta();
> +}
> +
> +// We do not have separate mechanisms for read and write fences.
> +_CLC_DEF void read_mem_fence(cl_mem_fence_flags flags) {
> +  mem_fence(flags);
> +}
> +
> +_CLC_DEF void write_mem_fence(cl_mem_fence_flags flags) {
> +  mem_fence(flags);
> +}
>
> > On 8 Oct 2017, at 20:23, Jeroen Ketema via Libclc-dev <libclc-dev at lists.llvm.org> wrote:
> >
> > PTX does not differentiate between read and write fences. Hence, these are
> > lowered to a mem_fence call. The mem_fence function compiles to the
> > "membar.cta" instruction, which commits all outstanding reads and writes
> > of a thread such that these become visible to all other threads in the same
> > CTA (i.e., work-group). The instruction does not differentiate between
> > global and local memory. Hence, the flags parameter is ignored.
> >
> > Index: ptx-nvidiacl/lib/SOURCES
> > ===================================================================
> > --- ptx-nvidiacl/lib/SOURCES (revision 315170)
> > +++ ptx-nvidiacl/lib/SOURCES (working copy)
> > @@ -1,3 +1,4 @@
> > +mem_fence/fence.cl
> > synchronization/barrier.cl
> > workitem/get_global_id.cl
> > workitem/get_group_id.cl
> > Index: ptx-nvidiacl/lib/mem_fence/fence.cl
> > ===================================================================
> > --- ptx-nvidiacl/lib/mem_fence/fence.cl (nonexistent)
> > +++ ptx-nvidiacl/lib/mem_fence/fence.cl (working copy)
> > @@ -0,0 +1,14 @@
> > +#include <clc/clc.h>
> > +
> > +_CLC_DEF void mem_fence(cl_mem_fence_flags flags) {
> > +  __nvvm_membar_cta();
> > +}
> > +
> > +// We do not have separate mechanisms for read and write fences.
> > +_CLC_DEF void read_mem_fence(cl_mem_fence_flags flags) {
> > +  mem_fence(flags);
> > +}
> > +
> > +_CLC_DEF void write_mem_fence(cl_mem_fence_flags flags) {
> > +  mem_fence(flags);
> > +}
> >
> > _______________________________________________
> > Libclc-dev mailing list
> > Libclc-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/libclc-dev
>
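For anyone who wants to check the flag handling without a PTX toolchain, the check in the updated patch above can be modeled in plain host C. This is only a sketch under assumptions: the typedef and the two flag values stand in for what <clc/clc.h> provides, and fake_membar_cta, mem_fence_model and fences_emitted are hypothetical names that replace the real __nvvm_membar_cta() builtin with a counter.

```c
#include <assert.h>

/* Assumed stand-ins for the OpenCL C definitions that <clc/clc.h> provides. */
typedef unsigned int cl_mem_fence_flags;
#define CLK_LOCAL_MEM_FENCE  0x1u  /* assumed value, for illustration only */
#define CLK_GLOBAL_MEM_FENCE 0x2u  /* assumed value, for illustration only */

/* Hypothetical counter that replaces the real __nvvm_membar_cta() builtin. */
static int membar_count = 0;
static void fake_membar_cta(void) { ++membar_count; }

/* Same flag check as the updated patch, with the builtin swapped for the stub. */
static void mem_fence_model(cl_mem_fence_flags flags) {
  if (flags & (CLK_GLOBAL_MEM_FENCE | CLK_LOCAL_MEM_FENCE))
    fake_membar_cta();
}

/* Hypothetical helper: number of fences a single call emits for these flags. */
static int fences_emitted(cl_mem_fence_flags flags) {
  membar_count = 0;
  mem_fence_model(flags);
  return membar_count;
}
```

With this model, fences_emitted(0) reports no fence, while either flag alone or both flags together report exactly one, which is the behaviour the updated patch describes.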