[Libclc-dev] [PATCH 1/2] atomic: Add generic bitcode implementation of atomic_max

Aaron Watry awatry at gmail.com
Mon Sep 8 16:53:35 PDT 2014


On Mon, Sep 8, 2014 at 2:44 PM, Matt Arsenault <arsenm2 at gmail.com> wrote:
>
> On Sep 8, 2014, at 2:13 PM, Aaron Watry <awatry at gmail.com> wrote:
>
> On Mon, Sep 8, 2014 at 12:12 PM, Matt Arsenault <arsenm2 at gmail.com> wrote:
>
>
> On Sep 8, 2014, at 12:52 PM, Aaron Watry <awatry at gmail.com> wrote:
>
> From CL v1.2, Section 6.12.11:
> int atomic_max (volatile __global int *p, int val)
> unsigned int atomic_max (volatile __global unsigned int *p, unsigned int
> val)
> int atomic_max (volatile __local int *p, int val)
> unsigned int atomic_max (volatile __local unsigned int *p, unsigned int val)
>
> I'm assuming that the pointer declaration as volatile should translate
> directly to the statement in the bitcode.  Let me know if you think I
> mis-understood that.  If that's the case, then we also need to fix the
> atomic_add and atomic_sub functions.
>
>
> Yes, that’s how I would interpret it.
>
>
> And yes, there's a note that the atom_* functions from CL 1.0 are
> still supported. Those functions do NOT define the pointers as
> volatile.
>
>
> Yes, the sets of atomic functions are separate. It looks like currently
> atom_add is defined to be the same as atomic_add
>
>
> I'd tend to agree with that statement.
>
> Do you want any changes to this patch, or are we good?
>
>
> It’s fine, although it would make sense to add both the atomic_ and atom_
> versions at the same time
>
>

I'll do that...  Also, I'll be sending a v2 because while I was
attempting to get this working on evergreen, I noticed that this
treated all atomic_max as signed comparisons...  And the existing
piglit tests don't test that correctly... *facepalm*

I'll be updating the piglit tests, sending out a v2 of this, and I
just sent a patch to the llvm-commits list to add the
LDS_MAX_[U]INT[_RET] instructions for evergreen so that this will work
on more than just SI.  I've made sure to CC you and Tom on that one.

If someone has the ability to test/fix atomic_max for cayman (if
fixing is needed), that would probably also be a good thing.

--Aaron

>
> I'm currently trying to get atomic_max done because cppamp-driver-ng
> requires it, but eventually I'll probably get around to adding the
> other missing atomic_* functions and then the wrappers from CL1.0
> atom_* functions.
>
> --Aaron
>
>
>
>
> --Aaron
>
> --Aaron
>
>
> On Mon, Sep 8, 2014 at 11:41 AM, Matt Arsenault <arsenm2 at gmail.com> wrote:
>
>
> On Sep 8, 2014, at 12:31 PM, Aaron Watry <awatry at gmail.com> wrote:
>
> Not used yet...
>
> Signed-off-by: Aaron Watry <awatry at gmail.com>
> ---
> generic/include/clc/atomic/atomic_max.h |  3 +++
> generic/include/clc/clc.h               |  1 +
> generic/lib/atomic/atomic_impl.ll       | 12 ++++++++++++
> 3 files changed, 16 insertions(+)
> create mode 100644 generic/include/clc/atomic/atomic_max.h
>
> diff --git a/generic/include/clc/atomic/atomic_max.h
> b/generic/include/clc/atomic/atomic_max.h
> new file mode 100644
> index 0000000..30dc180
> --- /dev/null
> +++ b/generic/include/clc/atomic/atomic_max.h
> @@ -0,0 +1,3 @@
> +#define __CLC_FUNCTION atomic_max
> +#include <clc/atomic/atomic_decl.inc>
> +#undef __CLC_FUNCTION
> diff --git a/generic/include/clc/clc.h b/generic/include/clc/clc.h
> index b8c1cb9..b492c54 100644
> --- a/generic/include/clc/clc.h
> +++ b/generic/include/clc/clc.h
> @@ -143,6 +143,7 @@
> #include <clc/atomic/atomic_add.h>
> #include <clc/atomic/atomic_dec.h>
> #include <clc/atomic/atomic_inc.h>
> +#include <clc/atomic/atomic_max.h>
> #include <clc/atomic/atomic_sub.h>
>
> /* cl_khr_global_int32_base_atomics Extension Functions */
> diff --git a/generic/lib/atomic/atomic_impl.ll
> b/generic/lib/atomic/atomic_impl.ll
> index 9df5b9f..4e228e8 100644
> --- a/generic/lib/atomic/atomic_impl.ll
> +++ b/generic/lib/atomic/atomic_impl.ll
> @@ -10,6 +10,18 @@ entry:
> ret i32 %0
> }
>
> +define i32 @__clc_atomic_max_addr1(i32 addrspace(1)* nocapture %ptr, i32
> %value) nounwind alwaysinline {
> +entry:
> +  %0 = atomicrmw volatile max i32 addrspace(1)* %ptr, i32 %value seq_cst
> +  ret i32 %0
> +}
>
>
> Why is it necessary to mark these with volatile (though I never remember
> whether it’s the atom_* or atomic_* functions which is supposed to be
> volatile)
>
>
> +
> +define i32 @__clc_atomic_max_addr3(i32 addrspace(3)* nocapture %ptr, i32
> %value) nounwind alwaysinline {
> +entry:
> +  %0 = atomicrmw volatile max i32 addrspace(3)* %ptr, i32 %value seq_cst
> +  ret i32 %0
> +}
> +
> define i32 @__clc_atomic_sub_addr1(i32 addrspace(1)* nocapture %ptr, i32
> %value) nounwind alwaysinline {
> entry:
> %0 = atomicrmw volatile sub i32 addrspace(1)* %ptr, i32 %value seq_cst
> --
> 1.9.1
>
>
> _______________________________________________
> Libclc-dev mailing list
> Libclc-dev at pcc.me.uk
> http://www.pcc.me.uk/cgi-bin/mailman/listinfo/libclc-dev
>
>




More information about the Libclc-dev mailing list