[libclc] r327044 - amdgcn, popcount: Workaround broken llvm.ctpop intrinsic on some GCN ASICs

Jan Vesely via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 8 10:58:08 PST 2018


Author: jvesely
Date: Thu Mar  8 10:58:07 2018
New Revision: 327044

URL: http://llvm.org/viewvc/llvm-project?rev=327044&view=rev
Log:
amdgcn,popcount: Workaround broken llvm.ctpop intrinsic on some GCN ASICs

This is only really needed for VI+ ASICs. However, LLVM would cast the value to
i32 for older ASICs anyway. The proper fix is in LLVM-7 (r326535).
Fixes CTS popcount on carrizo.

Reviewer: Aaron Watry <awatry at gmail.com>
Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>

Added:
    libclc/trunk/amdgcn/lib/integer/
    libclc/trunk/amdgcn/lib/integer/popcount.cl
    libclc/trunk/amdgcn/lib/integer/popcount.inc
Modified:
    libclc/trunk/amdgcn/lib/SOURCES

Modified: libclc/trunk/amdgcn/lib/SOURCES
URL: http://llvm.org/viewvc/llvm-project/libclc/trunk/amdgcn/lib/SOURCES?rev=327044&r1=327043&r2=327044&view=diff
==============================================================================
--- libclc/trunk/amdgcn/lib/SOURCES (original)
+++ libclc/trunk/amdgcn/lib/SOURCES Thu Mar  8 10:58:07 2018
@@ -1,4 +1,5 @@
 cl_khr_int64_extended_atomics/minmax_helpers.ll
+integer/popcount.cl
 math/ldexp.cl
 mem_fence/fence.cl
 synchronization/barrier.cl

Added: libclc/trunk/amdgcn/lib/integer/popcount.cl
URL: http://llvm.org/viewvc/llvm-project/libclc/trunk/amdgcn/lib/integer/popcount.cl?rev=327044&view=auto
==============================================================================
--- libclc/trunk/amdgcn/lib/integer/popcount.cl (added)
+++ libclc/trunk/amdgcn/lib/integer/popcount.cl Thu Mar  8 10:58:07 2018
@@ -0,0 +1,6 @@
+#include <clc/clc.h>
+#include <utils.h>
+#include <integer/popcount.h>
+
+#define __CLC_BODY "popcount.inc"
+#include <clc/integer/gentype.inc>

Added: libclc/trunk/amdgcn/lib/integer/popcount.inc
URL: http://llvm.org/viewvc/llvm-project/libclc/trunk/amdgcn/lib/integer/popcount.inc?rev=327044&view=auto
==============================================================================
--- libclc/trunk/amdgcn/lib/integer/popcount.inc (added)
+++ libclc/trunk/amdgcn/lib/integer/popcount.inc Thu Mar  8 10:58:07 2018
@@ -0,0 +1,17 @@
+_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE popcount(__CLC_GENTYPE x) {
+/* LLVM-4+ implements i16 ops for VI+ ASICs. However, ctpop implementation
+ * is missing until r326535. Therefore we have to convert sub i32 types to uint
+ * as a workaround. */
+#if __clang_major__ < 7 && __clang_major__ > 3 && __CLC_GENSIZE < 32
+	/* Prevent sign extension on uint conversion */
+	const __CLC_U_GENTYPE y = __CLC_XCONCAT(as_, __CLC_U_GENTYPE)(x);
+	/* Convert to uintX */
+	const __CLC_XCONCAT(uint, __CLC_VECSIZE) z = __CLC_XCONCAT(convert_uint, __CLC_VECSIZE)(y);
+	/* Call popcount on uintX type */
+	const __CLC_XCONCAT(uint, __CLC_VECSIZE) res = __clc_native_popcount(z);
+	/* Convert the result back to gentype. */
+	return __CLC_XCONCAT(convert_, __CLC_GENTYPE)(res);
+#else
+	return __clc_native_popcount(x);
+#endif
+}
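
For readers unfamiliar with the libclc macros, the workaround path above can be sketched in plain C for the scalar `char` case. This is only an illustration under stated assumptions: `native_popcount32` is a hypothetical stand-in for `__clc_native_popcount` on a 32-bit value, and the C casts approximate `as_uchar` / `convert_uint` rather than reproduce them exactly. It also shows why the value is reinterpreted as unsigned before widening.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for __clc_native_popcount on a 32-bit value. */
static uint32_t native_popcount32(uint32_t v) {
    uint32_t n = 0;
    while (v) {
        v &= v - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Mirrors the workaround path: reinterpret as unsigned, zero-extend to
 * 32 bits, count bits there, then convert back to the narrow type. */
static int8_t popcount_char_widened(int8_t x) {
    const uint8_t y = (uint8_t)x;    /* like as_uchar(x): same bit pattern */
    const uint32_t z = (uint32_t)y;  /* like convert_uint(y): zero-extends */
    const uint32_t res = native_popcount32(z);
    assert(res <= 8);                /* an 8-bit value has at most 8 set bits */
    return (int8_t)res;
}

/* What would happen without the unsigned reinterpretation: the signed
 * value is sign-extended, so the upper 24 bits are counted as well. */
static int8_t popcount_char_sign_extended(int8_t x) {
    const uint32_t z = (uint32_t)(int32_t)x;
    return (int8_t)native_popcount32(z);
}
```

With these helpers, `popcount_char_widened(-1)` gives 8, whereas `popcount_char_sign_extended(-1)` gives 32, which is the error that the "Prevent sign extension" step in the patch avoids.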



