[Openmp-dev] Target construct not offloading to GPU
Jonas Hahnfeld via Openmp-dev
openmp-dev at lists.llvm.org
Fri Oct 5 07:42:23 PDT 2018
Yes, you now need to pass
-DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=35,60,70 (or whichever compute
capabilities you want) to CMake to get the runtime libraries.
On 2018-10-05 16:33, Cristobal Ortega wrote:
> I compiled clang with the following line:
> cmake .. -DCMAKE_C_COMPILER=${HOST_GCC}/bin/gcc
> -DCMAKE_CXX_COMPILER=${HOST_GCC}/bin/g++
> -DGCC_INSTALL_PREFIX=${HOST_GCC}
> -DCMAKE_CXX_LINK_FLAGS="-L${HOST_GCC}/lib64
> -Wl,-rpath,${HOST_GCC}/lib64"
> -DCMAKE_INSTALL_PREFIX=/gpfs/projects/bsc18/bsc18833/pkg/clang/7.0.0
> -DGCC_INSTALL_PREFIX=${HOST_GCC}
>
> Indeed, output with verbose confirms that clang is trying to compile
> with march=sm_35 (output is attached).
> Also, trying to compile the program with
> "-Xopenmp-target -march=sm_70"
> fails with
> "clang-7: error: nvlink command failed with exit code 255 (use -v to
> see invocation)" because of several undefined references (details in
> the attached file).
>
> So, I'm trying to re-compile clang with
> CLANG_OPENMP_NVPTX_DEFAULT_ARCH but, still, clang is not generating
> the library 'libomptarget-nvptx-sm_70.bc'. Therefore, compilation
> doesn't complete.
> Where should this library be? I have one bc file in
> "clang_src/test/Driver/Inputs/libomptarget/" but it's for sm_20
> (libomptarget-nvptx-sm_20.bc).
That's an empty file for testing. To get the bitcode libraries, you need
to compile the OpenMP project using Clang.
(I've started putting together step-by-step instructions on how to build
LLVM/Clang 7.0 for OpenMP offloading; I'll send a link to the mailing
list once they are ready.)
> This is how I'm trying to compile clang:
> cmake .. -DCMAKE_C_COMPILER=${HOST_GCC}/bin/gcc
> -DCMAKE_CXX_COMPILER=${HOST_GCC}/bin/g++
> -DGCC_INSTALL_PREFIX=${HOST_GCC} -DCLANG_OPENMP_NVPTX_DEFAULT_ARCH=70
Nit: Should this be -DCLANG_OPENMP_NVPTX_DEFAULT_ARCH=sm_70?
> Yet, in the compilation process, clang complains about the missing
> library for sm_70.
>
> Do I need to pass some flag to LLVM too?
>
> Best,
> -Cristobal
>
>
>
> On 10/05/2018 03:22 PM, Jonas Hahnfeld wrote:
>> Hi,
>>
>> How did you build your compiler? If you didn't specify
>> CLANG_OPENMP_NVPTX_DEFAULT_ARCH, Clang will default to sm_35, which
>> doesn't run on Volta (sm_70).
>> Can you post the output of
>>> clang -v -o openmp_offload openmp_offload.c -O3 -fopenmp=libomp
>>> -fopenmp-targets="nvptx64-nvidia-cuda"
>>
>> If it's indeed compiling for sm_35, can you try adding -Xopenmp-target
>> -march=sm_70?
>>
>> Regards,
>> Jonas
>>
>> On 2018-10-05 15:09, Cristobal Ortega via Openmp-dev wrote:
>>> Hello,
>>>
>>> I've been trying to compile a program (source code is attached) that
>>> offloads to an NVIDIA V100 GPU with LLVM 7.0 and clang 7.0.
>>>
>>> The program appears to compile successfully, yet nvprof reports
>>> "No kernels were profiled".
>>> The application seems to be running on the CPU (the "top" command
>>> reports high CPU usage).
>>>
>>> Compilation line that I used:
>>> clang -v -o openmp_offload openmp_offload.c -O3 -fopenmp=libomp
>>> -fopenmp-targets="nvptx64-nvidia-cuda"
>>>
>>> Output after executing the binary:
>>> ==74802== NVPROF is profiling process 74802, command: ./openmp_offload 10 10 10000 1
>>> Number of processors: 160
>>> Number of devices: 4
>>> Default device: 0
>>> Is initial device: 1
>>> ==74802== Profiling application: ./openmp_offload 10 10 10000 1
>>> ==74802== Profiling result:
>>> No kernels were profiled.
>>>       Type  Time(%)      Time  Calls       Avg       Min       Max  Name
>>> API calls:   99.99%  311.50ms      1  311.50ms  311.50ms  311.50ms  cuCtxCreate
>>>               0.00%  11.462us      4  2.8650us  1.1450us  6.2010us  cuDeviceGetPCIBusId
>>>               0.00%  5.4850us      5  1.0970us     387ns  3.7770us  cuDeviceGet
>>>               0.00%  4.8070us     12     400ns     232ns  1.0350us  cuDeviceGetAttribute
>>>               0.00%  1.4360us      3     478ns     384ns     640ns  cuDeviceGetCount
>>>
>>>
>>>
>>> When compiled with GCC, the application does the offloading to the
>>> GPU.
>>>
>>> clang information:
>>> $ clang -v
>>> Version 6
>>> Version >= 90 selected
>>> libdevice.10.bc exists
>>> clang version 7.0.0 (tags/RELEASE_700/final)
>>> Target: powerpc64le-unknown-linux-gnu
>>> Thread model: posix
>>> InstalledDir: /gpfs/projects/bsc18/bsc18833/pkg/clang/7.0.0/bin
>>> Found candidate GCC installation:
>>> /home/user/pkg/gcc/8.2.0/lib/gcc/powerpc64le-unknown-linux-gnu/8.2.0
>>> Selected GCC installation:
>>> /home/user/pkg/gcc/8.2.0/lib/gcc/powerpc64le-unknown-linux-gnu/8.2.0
>>> Candidate multilib: .;@m64
>>> Selected multilib: .;@m64
>>> Found CUDA installation: /usr/local/cuda-9.2, version 9.2
>>>
>>>
>>> Hopefully somebody has an idea of what's going on here.
>>> If you need any more information to find the issue, let me know.
>>> Thank you.
>>>
>>> Best,
>>> -Cristobal
>>>
>>>
>>> http://bsc.es/disclaimer
>>> _______________________________________________
>>> Openmp-dev mailing list
>>> Openmp-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev