[Openmp-commits] [PATCH] D94745: [OpenMP][WIP] Build the deviceRTLs with OpenMP instead of target dependent language - NOT FOR REVIEW

Mon Jan 25 15:26:54 PST 2021

JonChesterfield added a comment.

Couple of minor suggestions inline, but overall this looks pretty good. Hopefully device-only compilation works already, the rest could be left for after this patch.

================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt:100
+                 -fopenmp -Xclang -fopenmp-is-device
+                 -D__CUDACC__
                  -I${devicertl_base_directory}
----------------
jdoerfert wrote:
> JonChesterfield wrote:
> > tianshilei1992 wrote:
> > > JonChesterfield wrote:
> > > > This is suspect - why does openmp want to claim to be cuda?
> > > To maintain minimal change. There is an include wrapped into a macro in `interface.h`. For AMD GPU, it includes one header in AMD implementation, and for CUDA device, it includes a header in NVPTX implementation.
> > Ah, that's probably my fault. May as well leave it for now.
> > 
> > I think we should expose a macro for openmp that indicates whether we're doing offloading to nvptx, or offloading to amdgpu, or just compiling for the host. Or, I think equivalently, replace some `#if` with variant.
> Please don't use defines if we have `begin/end declare variant` for it.
Variant sounds good. Should also be able to use `#ifdef __CUDA_ARCH__`, as amdgpu shouldn't be setting that

================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt:16
+# TODO: This part can also be removed if we can change the clang driver to make
+# it support device only compilation.
+if(CMAKE_HOST_SYSTEM_PROCESSOR MATCHES "x86_64")
----------------
>From @jdoerfert,

```-Xclang -fopenmp-is-device  -Xclang -emit-llvm-bc``` 
should do device-only

================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt:115

-    # Generate a Bitcode library for all the compute capabilities the user requested.
+    # This correlation is from clang/lib/Driver/ToolChains/Cuda.cpp.
+    # The last element is the default case.
----------------
s/correlation/correspondence, or maybe mapping

================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h:25
+#define NOINLINE __attribute__((noinline))
+#define ALIGN(N) __attribute__((aligned(N)))

----------------
Suggest we drop the DEVICE annotation and change ALIGN to alignas() or similar, but in a later patch. This one is already quite noisy.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94745/new/

https://reviews.llvm.org/D94745