[Openmp-commits] [PATCH] D95313: [WIP] Move part of nvptx devicertl under clang
Jon Chesterfield via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Sun Jan 24 10:41:09 PST 2021
JonChesterfield added inline comments.
================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1204
+ {
+ auto *CTC = static_cast<const toolchains::CudaToolChain *>(
+ C.getSingleOffloadToolChain<Action::OFK_Cuda>());
----------------
Logic very like this could pick out a second, small devicertl bitcode library
================
Comment at: clang/lib/Headers/openmp_wrappers/__clang_openmp_devicertl_cuda_ge90.h:18
+#ifdef _OPENMP
+#define DEVICE __attribute__((used))
+#else
----------------
linkonce and linkonce_odr can both be discarded, but these symbols need to survive until devicertl is linked
================
Comment at: clang/lib/Headers/openmp_wrappers/__clang_openmp_devicertl_cuda_ge90.h:39
+ int WARPSIZE = 32;
+ int tmp = ((WARPSIZE - Width) << 8) | 0x1f;
+ return __nvvm_shfl_sync_down_i32(Mask, Var, Delta, tmp);
----------------
calling into intrinsics would remove this messy expression, if we can work out how to reliably compile parts of the cuda sdk as openmp
these functions probably can't be instantiated on the host, so when changing the devicertl build over to openmp we will also need to guard these with variant / macro
================
Comment at: openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip:49
// Warp vote function
-DEVICE __kmpc_impl_lanemask_t __kmpc_impl_activemask() {
+EXTERN __kmpc_impl_lanemask_t __kmpc_impl_activemask() {
return __builtin_amdgcn_read_exec();
----------------
Changing from c++ to c name mangling seems fine here. We don't consistently use one or the other in the devicertl. Using c mangling is simpler if the implementation is in a header.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D95313/new/
https://reviews.llvm.org/D95313
More information about the Openmp-commits
mailing list