[cfe-dev] linking relocatable device object code with clang

Tue Jun 26 15:39:21 PDT 2018

Dear clang developers,
thanks to the recent work by Alexey Bataev, Jonas Hahnfeld, and others, the
trunk version of clang includes support for compiling device code into
relocatable object files [1].

These object files can be linked with nvlink (once per GPU architecture),
combined with fatbin, embedded in a host object file, and linked with the
other host code by the host linker.
Usually nvcc takes care of this part, but it refuses to do so with host
compilers it does not support (gcc 8, clang 6).
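
For reference, the flow that nvcc normally drives, as described in [1], can
be sketched roughly as below. This is only an illustration under assumptions:
a.cu and b.cu are hypothetical source files, sm_35 is an example architecture,
the CUDA runtime library is assumed to live under /usr/local/cuda/lib64, and
the run helper with its DRY_RUN switch is just scaffolding so that the commands
can be printed without a CUDA installation.

```shell
#!/bin/sh
# Sketch of the nvcc-driven separate compilation flow from [1].
# Assumptions (not taken from the example repository): the sources a.cu and
# b.cu, the sm_35 architecture, and the CUDA library path are placeholders.

DRY_RUN="${DRY_RUN:-1}"                        # 1: only print the commands
ARCH="${ARCH:-sm_35}"                          # example GPU architecture
CUDA_LIB="${CUDA_LIB:-/usr/local/cuda/lib64}"  # assumed install location

# Print a command, and execute it unless DRY_RUN is 1.
run() {
    echo "$*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

build() {
    # 1. compile each source into an object with relocatable device code
    run nvcc -arch="$ARCH" -dc a.cu -o a.o &&
    run nvcc -arch="$ARCH" -dc b.cu -o b.o &&
    # 2. "device link" step: link the device code of all the objects
    run nvcc -arch="$ARCH" -dlink a.o b.o -o dlink.o &&
    # 3. link everything together with the host linker
    run g++ a.o b.o dlink.o -L"$CUDA_LIB" -lcudart -o app
}

build
```

With the default DRY_RUN=1 the script only prints the four commands; setting
DRY_RUN=0 runs them. Step 2 is the part that nvcc declines to do with an
unsupported host compiler.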

It would be great if support for this "device link" step could be added to
the clang driver.
I would be interested in working on it myself, but I would need some guidance
on how to start.

In the meantime, to show each step and validate that different approaches
are equivalent, I have adapted the original example by NVIDIA and set up an
example on GitHub at https://github.com/fwyzard/cuda-linking/ :

# clone the repository
git clone git@github.com:fwyzard/cuda-linking.git
cd cuda-linking

# build and link with nvcc
make clean nvcc
./app

# build with nvcc, link explicitly with nvlink/fatbin
make clean nvlink
./app

# build with clang, link explicitly with nvlink/fatbin
make clean clang
./app

Best regards,
.Andrea

[1] Separate Compilation and Linking of CUDA C++ Device Code,
https://devblogs.nvidia.com/separate-compilation-linking-cuda-device-code/