[llvm-dev] Executing OpenMP 4.0 code on Nvidia's GPU

Ahmed ElTantawy via llvm-dev llvm-dev at lists.llvm.org
Thu Jan 21 01:49:00 PST 2016


Thanks Arpith.

I was doing it in almost the same way, but with nvcc (or apc-llc
<https://github.com/apc-llc/nvcc-llvm-ir>), and of course I had to make the
produced LLVM IR match my version of LLVM.

But I would imagine it would be less messy if I could compile the CUDA GPU OMP
runtime with Clang directly. I found that there is a patch that was
committed recently to enable compiling CUDA with Clang (
http://llvm.org/docs/CompileCudaWithLLVM.html).

Do you know if there are any restrictions on the CUDA version for the
compilation of CUDA with Clang to work?
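For reference, the Clang CUDA support described in that document could be invoked roughly as below. File names and paths are hypothetical, and the documentation at the time targeted specific CUDA 7.x releases, so newer toolkits may not work:

```shell
# Hypothetical invocation, following http://llvm.org/docs/CompileCudaWithLLVM.html.
# -save-temps keeps the intermediate device-side bitcode around for inspection.
clang++ -x cuda axpy.cu -o axpy \
    --cuda-gpu-arch=sm_35 \
    -L/usr/local/cuda/lib64 -lcudart_static -ldl -lrt -pthread \
    -save-temps
```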

Thanks a lot

On Wed, Jan 20, 2016 at 7:07 AM, Arpith C Jacob <acjacob at us.ibm.com> wrote:

> Hi Ahmed,
>
> I am experimenting with LTO, but as you said, it's still *very* hacky.
>
> Here's what I did. First, compile the CUDA GPU OMP runtime with Clang
> (rather than nvcc) to bitcode. When I looked at Clang-CUDA a couple of
> weeks ago, I could only get device-side bitcode by using the temporary files
> generated after passing -save-temps to Clang. The OMP-GPU version of LLVM
> that you are using is not up to date with trunk, so I had to do a bit of
> massaging on the generated IR.
>
> I then had to manually link the various device-side bitcode files, call opt,
> llc, and ptxas, and finally link the result with the host object file.
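The manual pipeline described above could be sketched as follows. All file names, the runtime bitcode name, and the GPU architecture are hypothetical stand-ins, not the exact commands used:

```shell
# 1. Link the device-side bitcode with the GPU OMP runtime bitcode.
llvm-link kernel.bc omp-runtime.bc -o linked.bc

# 2. Optimize the combined module.
opt -O3 linked.bc -o linked.opt.bc

# 3. Lower the optimized module to PTX for the target GPU.
llc -march=nvptx64 -mcpu=sm_35 linked.opt.bc -o kernel.ptx

# 4. Assemble the PTX into a GPU object, then link with the host object file.
ptxas -arch=sm_35 kernel.ptx -o kernel.cubin
```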
>
> We don't have support for this in the driver as yet but once we move to
> trunk I will look into streamlining this.
>
> Thanks,
> Arpith
>
>
> From: Ahmed ElTantawy <ahmede at ece.ubc.ca>
> To: Arpith C Jacob/Watson/IBM at IBMUS
> Cc: llvm-dev at lists.llvm.org, "Bataev, Alexey" <alexey.bataev at intel.com>
> Date: 01/20/2016 08:44 AM
> Subject: Re: Executing OpenMP 4.0 code on Nvidia's GPU
> Sent by: ahmed.mohammed.eltantawy at gmail.com
> ------------------------------
>
>
>
> Hi,
>
> I see now that the linking happens at the binary level. I was wondering
> whether it is possible to link to the OpenMP runtime library at the LLVM IR
> level (to enable LTO optimizations for the code after library calls have
> been replaced).
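The IR-level linking being asked about could be sketched like this; the file names are hypothetical, with `omp-runtime.bc` standing in for whatever bitcode the runtime implementation compiles to:

```shell
# Link the device module against the runtime's bitcode, then optimize the
# combined module so calls into the runtime can be inlined and optimized
# (the LTO-style benefit described above).
llvm-link device.bc omp-runtime.bc -o combined.bc
opt -O3 combined.bc -o combined.opt.bc
```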
>
> I have done this before by linking to the bitcode of a file that contains
> a compiled CUDA implementation of the OpenMP runtime library. But it was
> a bit hacky, and offloading was not supported yet. Is there a
> cleaner/standard way to do this?
>
> Thanks.
>
> On Wed, Jan 20, 2016 at 5:09 AM, Ahmed ElTantawy <*ahmede at ece.ubc.ca*
> <ahmede at ece.ubc.ca>> wrote:
>
>    Hi Arpith,
>
>    That is exactly what it is :).
>
>    My bad; I thought I had copied the libraries to where LIBRARY_PATH
>    points, but apparently they were copied to the wrong destination.
>
>    Thanks a lot.
>
>    On Wed, Jan 20, 2016 at 4:51 AM, Arpith C Jacob <*acjacob at us.ibm.com*
>    <acjacob at us.ibm.com>> wrote:
>    Hi Ahmed,
>
>    nvlink is unable to find the GPU OMP runtime library in its path. Does
>    LIBRARY_PATH point to the right location? You could try passing the "-v"
>    option to clang to get more information.
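A quick way to check both of those suggestions (the runtime library directory below is a hypothetical placeholder):

```shell
# Confirm the GPU OMP runtime library actually sits in a directory that
# LIBRARY_PATH points to (path is hypothetical).
export LIBRARY_PATH=/opt/omp-gpu/lib
ls "$LIBRARY_PATH"

# Re-running the compile with -v makes clang print the sub-commands it runs,
# including the nvlink invocation and the -L paths it actually passes.
clang -v <original compile options>
```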
>
>    Regards,
>    Arpith
>
>
>
>
>

