[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

Sun Jun 3 11:09:15 PDT 2018

Hahnfeld added a comment.

In https://reviews.llvm.org/D47394#1119489, @gtbercea wrote:

> In https://reviews.llvm.org/D47394#1119056, @Hahnfeld wrote:
>
> > Hmm, maybe the scope is much larger: I just tried linking an executable that references a `declare target` function in a shared library. My assumption was that this already works, given that `libomptarget`'s registration functions can be called multiple times. Am I doing something wrong?
>
>
> I believe this is a limitation coming from the CUDA toolchain. Not even nvcc supports this case: https://stackoverflow.com/questions/35897002/cuda-nvcc-building-chain-of-libraries

You are absolutely right, thanks for the link. Maybe we should also document somewhere that OpenMP offloading to NVPTX doesn't support this case either?

I think this basically renders my approach useless, as I meant to compile each device object file for offloading targets directly to a shared library. Those could have been put together at runtime by just loading (and registering) them in the right order. That way we would have been able to keep `clang-offload-bundler` in its current target-agnostic form and would not have needed to appease proprietary tools such as `nvlink`.

With that knowledge I see no other way than what this patch proposes. (I still don't particularly like it because it requires each toolchain to implement its own magic.) Sorry for the delay and for my disagreement, which was based on wrong assumptions that I wasn't able to verify as soon as I'd have liked to.

Repository:
  rC Clang

https://reviews.llvm.org/D47394