[Openmp-commits] [PATCH] D94745: [OpenMP][WIP] Build the deviceRTLs with OpenMP instead of target dependent language

Tue Jan 19 16:39:59 PST 2021

JonChesterfield added inline comments.

================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu:21
+
+// FIXME: Forward declaration
+extern "C" {
----------------
JonChesterfield wrote:
> tianshilei1992 wrote:
> > JonChesterfield wrote:
> > > shouldn't these be in the cuda header above, and also in the clang-injected cuda headers?
> > All functions that can be called by CUDA are declared as `__device__`. In `declare target`, we cannot call those functions. Instead, we need them to be declared in OpenMP form, so those in `cuda.h` cannot be used. If it were not for the CUDA version macros, we could drop the header.
> I think the right answer to the CUDA version macros is to compile this file in the deviceRTL twice, once for < 9000 and once for >= 9000. It seems reasonable to have a different implementation for the CUDA API change. Clang knows what version it is compiling applications for, so it could pick the matching deviceRTL.bc.
> 
> That would let us totally decouple from cuda with some slightly ugly stuff like
> `return __nvvm_shfl_down_i32(Var, Delta, ((WARPSIZE - Width) << 8) | 0x1f);`
> as typeset in https://reviews.llvm.org/D94731?vs=316809&id=316820#toc
It's been pointed out to me that we already include ~4k of source at the top of source files that are compiled as OpenMP, even if they `#include` no header files. Mostly bits of libm. I'm not pleased to discover that, but it does mean that adding an implementation of `__kmpc_impl_activemask` etc. to a new header won't change the status quo. Let's do that.
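
For reference, the third operand in the `__nvvm_shfl_down_i32` expression quoted above can be checked on the host. This is only a sketch: `shflDownOperand` is a hypothetical helper that is not in the patch, and `WARPSIZE` is assumed to be 32. As I understand the encoding, the low five bits (`0x1f`) are the lane clamp and `WARPSIZE - Width`, shifted left by 8, is the segment mask.

```cpp
#include <cstdint>

// Assumption: 32 lanes per warp.
constexpr uint32_t WARPSIZE = 32;

// Hypothetical helper, not part of the patch. Packs the operand exactly as
// in the quoted expression: ((WARPSIZE - Width) << 8) | 0x1f.
constexpr uint32_t shflDownOperand(uint32_t Width) {
  return ((WARPSIZE - Width) << 8) | 0x1f;
}

// Compile-time checks for a few power-of-two widths.
static_assert(shflDownOperand(32) == 0x001fu, "full warp: segment mask is 0");
static_assert(shflDownOperand(16) == 0x101fu, "half warp");
static_assert(shflDownOperand(8) == 0x181fu, "quarter warp");
```

With a helper like this in a new header, the call site would read `__nvvm_shfl_down_i32(Var, Delta, shflDownOperand(Width))`, which needs nothing from `cuda.h`.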

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94745/new/

https://reviews.llvm.org/D94745