[Openmp-commits] [PATCH] D95058: [OpenMP][NVPTX] Added forward declaration to pave the way for building deviceRTLs with OpenMP

Shilei Tian via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Wed Jan 20 09:36:42 PST 2021


tianshilei1992 created this revision.
tianshilei1992 added reviewers: jdoerfert, JonChesterfield.
Herald added subscribers: jfb, guansong, yaxunl.
tianshilei1992 requested review of this revision.
Herald added subscribers: openmp-commits, sstefan1.
Herald added a project: OpenMP.

Once we switch to build deviceRTLs with OpenMP, primitives and CUDA
intrinsics cannot be used directly anymore because `__device__` is not recognized
by OpenMP compiler. To avoid involving all CUDA internal headers we had in `clang`,
we forward declared these functions. Eventually they will be transformed into
right LLVM instrinsics.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D95058

Files:
  openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu


Index: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
===================================================================
--- openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
+++ openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
@@ -17,6 +17,23 @@
 
 #include <cuda.h>
 
+// Forward declaration of CUDA primitives which will be evetually transformed
+// into LLVM intrinsics.
+extern "C" {
+unsigned int __activemask();
+unsigned int __ballot(unsigned);
+// The default argument here is based on NVIDIA's website
+// https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/
+int __shfl_sync(unsigned mask, int val, int src_line, int width = WARPSIZE);
+int __shfl(int val, int src_line, int width = WARPSIZE);
+int __shfl_down(int var, unsigned detla, int width);
+int __shfl_down_sync(unsigned mask, int var, unsigned detla, int width);
+void __syncwarp(int mask);
+void __threadfence();
+void __threadfence_block();
+void __threadfence_system();
+}
+
 DEVICE void __kmpc_impl_unpack(uint64_t val, uint32_t &lo, uint32_t &hi) {
   asm volatile("mov.b64 {%0,%1}, %2;" : "=r"(lo), "=r"(hi) : "l"(val));
 }


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D95058.317906.patch
Type: text/x-patch
Size: 1138 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/openmp-commits/attachments/20210120/8af24eed/attachment.bin>


More information about the Openmp-commits mailing list