[Libclc-dev] [PATCH 3/3] amdgcn, popcount: Workaround broken llvm.ctpop intrinsic on some GCN ASICs
Aaron Watry via Libclc-dev
libclc-dev at lists.llvm.org
Wed Mar 7 18:56:37 PST 2018
This series looks good to me.
Also tested successfully on Polaris with the latest LLVM 7.0.
--Aaron
On Sat, Mar 3, 2018 at 3:44 PM, Jan Vesely via Libclc-dev
<libclc-dev at lists.llvm.org> wrote:
> This is only really needed for VI+ ASICs. However, LLVM would cast the value to
> i32 for older ASICs anyway. The proper fix is in LLVM 7 (r326535).
> Fixes CTS popcount on carrizo.
>
> Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> ---
> amdgcn/lib/SOURCES | 1 +
> amdgcn/lib/integer/popcount.cl | 6 ++++++
> amdgcn/lib/integer/popcount.inc | 17 +++++++++++++++++
> 3 files changed, 24 insertions(+)
> create mode 100644 amdgcn/lib/integer/popcount.cl
> create mode 100644 amdgcn/lib/integer/popcount.inc
>
> diff --git a/amdgcn/lib/SOURCES b/amdgcn/lib/SOURCES
> index 8e14ce2..6a5ce00 100644
> --- a/amdgcn/lib/SOURCES
> +++ b/amdgcn/lib/SOURCES
> @@ -1,4 +1,5 @@
> cl_khr_int64_extended_atomics/minmax_helpers.ll
> +integer/popcount.cl
> math/ldexp.cl
> mem_fence/fence.cl
> synchronization/barrier.cl
> diff --git a/amdgcn/lib/integer/popcount.cl b/amdgcn/lib/integer/popcount.cl
> new file mode 100644
> index 0000000..ebd167d
> --- /dev/null
> +++ b/amdgcn/lib/integer/popcount.cl
> @@ -0,0 +1,6 @@
> +#include <clc/clc.h>
> +#include <utils.h>
> +#include <integer/popcount.h>
> +
> +#define __CLC_BODY "popcount.inc"
> +#include <clc/integer/gentype.inc>
> diff --git a/amdgcn/lib/integer/popcount.inc b/amdgcn/lib/integer/popcount.inc
> new file mode 100644
> index 0000000..402ddb7
> --- /dev/null
> +++ b/amdgcn/lib/integer/popcount.inc
> @@ -0,0 +1,17 @@
> +_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE popcount(__CLC_GENTYPE x) {
> +/* LLVM-4+ implements i16 ops for VI+ ASICs. However, ctpop implementation
> + * is missing until r326535. Therefore we have to convert sub i32 types to uint
> + * as a workaround. */
> +#if __clang_major__ < 7 && __clang_major__ > 3 && __CLC_GENSIZE < 32
> + /* Prevent sign extension on uint conversion */
> + const __CLC_U_GENTYPE y = __CLC_XCONCAT(as_, __CLC_U_GENTYPE)(x);
> + /* Convert to uintX */
> + const __CLC_XCONCAT(uint, __CLC_VECSIZE) z = __CLC_XCONCAT(convert_uint, __CLC_VECSIZE)(y);
> + /* Call popcount on uintX type */
> + const __CLC_XCONCAT(uint, __CLC_VECSIZE) res = __clc_native_popcount(z);
> + /* Convert the result back to gentype. */
> + return __CLC_XCONCAT(convert_, __CLC_GENTYPE)(res);
> +#else
> + return __clc_native_popcount(x);
> +#endif
> +}
> --
> 2.14.3
>
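For readers following the workaround in popcount.inc, here is a minimal host-side sketch in plain C (not OpenCL C, and not part of the patch) of why the value is reinterpreted as unsigned before it is widened to 32 bits. The helper names popcount32, popcount_i8_via_u32 and popcount_i8_sign_extended are hypothetical and exist only for this illustration; the loop-based popcount32 merely stands in for __clc_native_popcount on a 32-bit value.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit native popcount: counts set bits
 * by clearing the lowest set bit until none remain. */
static uint32_t popcount32(uint32_t v) {
    uint32_t count = 0;
    while (v != 0) {
        v &= v - 1; /* clear the lowest set bit */
        ++count;
    }
    return count;
}

/* Mirrors the steps in the patch for an 8-bit signed input:
 * reinterpret as unsigned, zero-extend, count, convert back. */
static int8_t popcount_i8_via_u32(int8_t x) {
    const uint8_t y = (uint8_t)x;          /* prevent sign extension */
    const uint32_t z = (uint32_t)y;        /* widen to 32 bits */
    const uint32_t res = popcount32(z);    /* count on the 32-bit type */
    return (int8_t)res;                    /* convert back to the input type */
}

/* What goes wrong without the unsigned step: the sign bit is copied
 * into the upper 24 bits and gets counted as well. */
static int8_t popcount_i8_sign_extended(int8_t x) {
    const uint32_t z = (uint32_t)(int32_t)x; /* sign-extends negative values */
    return (int8_t)popcount32(z);
}
```

Under these assumptions, popcount_i8_via_u32(-1) yields 8, while popcount_i8_sign_extended(-1) yields 32, which is why the patch converts through the unsigned gentype first.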