[PATCH] D128914: [HIP] Add support for handling HIP in the linker wrapper

Joseph Huber via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jul 11 17:40:45 PDT 2022


jhuber6 added a comment.

In D128914#3643802 <https://reviews.llvm.org/D128914#3643802>, @tra wrote:

> For what it's worth, NCCL <https://developer.nvidia.com/nccl> is the only nontrivial library that needs RDC compilation that I'm aware of.
> It's also self-contained: for RDC purposes, we only need to use RDC on the library TUs and do not need to propagate it to all CUDA TUs in the build.
>
> I believe such 'constrained' RDC compilation will likely be the reasonable practical trade-off. It may not become the default compilation mode, but we should be able to control where the "fully linked GPU executable" boundary is and it's not necessarily going to match the fully-linked host executable.

Theoretically we could do this with a relocatable link using the linker-wrapper. The only problem with this approach is the `__start/__stop` linker-defined variables that we use to iterate over the globals to be registered, as these are tied to one specific section. Potentially, we could move the entries to a unique section so they don't interfere with anything. So it would be something like this:

  clang-linker-wrapper -r a.o b.o c.o -o registered.o # Contains RTL calls to register all globals at section 'cuda_offloading_entries_<ID>'
  llvm-strip --remove-section .llvm.offloading registered.o # Remove embedded IR so no other files will link against it
  llvm-objcopy --rename-section cuda_offloading_entries=cuda_offloading_entries_<ID> registered.o # Change the registration section to something unique

Think this would work?
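
For readers unfamiliar with the `__start/__stop` mechanism mentioned above, here is a minimal sketch of how section-bounded iteration works in general. It assumes an ELF target, GCC or Clang attributes, and a linker such as GNU ld or LLD that defines `__start_<name>` and `__stop_<name>` for sections whose names are valid C identifiers. The section name `demo_entries`, the struct `demo_entry`, and the macro `DEMO_REGISTER` are hypothetical names chosen for illustration; this is not the actual offloading entry layout or registration code.

```c
#include <stddef.h>

/* Hypothetical entry layout for illustration only. */
struct demo_entry {
  const char *name; /* symbol name of the registered global */
  void *addr;       /* address of the registered global */
};

/* Emit one entry into the section named "demo_entries" (hypothetical name).
   The explicit alignment keeps entries packed back to back. */
#define DEMO_REGISTER(var)                                              \
  static const struct demo_entry demo_entry_##var                       \
      __attribute__((section("demo_entries"), used,                     \
                     aligned(__alignof__(struct demo_entry)))) = {      \
          #var, &var}

/* Defined by the linker: they bracket the merged "demo_entries" section. */
extern const struct demo_entry __start_demo_entries[];
extern const struct demo_entry __stop_demo_entries[];

int global_a;
int global_b;
DEMO_REGISTER(global_a);
DEMO_REGISTER(global_b);

/* Walk every entry that any object file contributed to the section. */
size_t demo_count_entries(void) {
  size_t n = 0;
  for (const struct demo_entry *e = __start_demo_entries;
       e != __stop_demo_entries; ++e)
    ++n;
  return n;
}
```

Because the bounds symbols are derived from the section name, every object that contributes to the same section name ends up in the same iteration range at final link time, which is why a unique section name per relocatable link would be needed to keep the registrations separate.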


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128914/new/

https://reviews.llvm.org/D128914


