[PATCH] D14254: [OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs.

Fri Jan 12 09:52:45 PST 2018

tra added inline comments.

================
Comment at: libomptarget/deviceRTLs/nvptx/CMakeLists.txt:158
+      set(CUDA_ARCH "")
+      set(CUDA_ARCH --cuda-gpu-arch=sm_${LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITY})
+
----------------
gtbercea wrote:
> guansong wrote:
> > For CUDA bc files, a CUDA install will have bc files for different arches, such as
> > 
> > /usr/local/cuda-8.0/nvvm/libdevice/libdevice.compute_35.10.bc
> > /usr/local/cuda-8.0/nvvm/libdevice/libdevice.compute_30.10.bc
> > /usr/local/cuda-8.0/nvvm/libdevice/libdevice.compute_50.10.bc
> > /usr/local/cuda-8.0/nvvm/libdevice/libdevice.compute_20.10.bc
> > 
> > Should we consider building different bc files for the end user?
> > 
> What do you mean by that? 
> 
> Does this patch do what you mean: https://reviews.llvm.org/D41724 ?
That's no longer true for CUDA 9 -- it ships a single libdevice bitcode file for all architectures.

Is the OMP runtime for GPUs going to provide anything that a) needs to have the same API for all GPUs, and b) has to be heavily GPU-specific under the hood? If not, then one common library would probably suffice.
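
For comparison, here is a minimal sketch of what the per-architecture alternative could look like in this CMakeLists.txt. The list variable LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES (plural), its default value, and the output naming scheme are assumptions made for illustration; none of them is defined by this patch, and the actual compile step is left out.

```cmake
# Hypothetical sketch: derive one --cuda-gpu-arch flag and one output name
# per requested compute capability. The plural list variable below is an
# assumption; this patch only defines the singular
# LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITY.
if(NOT DEFINED LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES)
  set(LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES 35 60)  # assumed default
endif()

foreach(sm ${LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES})
  # Same flag shape as the single-architecture case in the patch.
  set(CUDA_ARCH --cuda-gpu-arch=sm_${sm})
  # Assumed naming scheme for the per-architecture bitcode library.
  set(bc_name libomptarget-nvptx-sm_${sm}.bc)
  message(STATUS "Would build ${bc_name} with ${CUDA_ARCH}")
endforeach()
```

If one common library suffices, the loop collapses back to the single set(CUDA_ARCH ...) call shown in the patch.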

Repository:
  rL LLVM

https://reviews.llvm.org/D14254