[PATCH] D108291: [clang-nvlink-wrapper] Wrapper around nvlink for archive files

Tue Aug 24 00:35:54 PDT 2021

saiislam marked an inline comment as done.
saiislam added inline comments.

================
Comment at: clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp:19
+/// Such an archive is then passed to this tool to extract cubin files before
+/// passing to nvlink.
+///
----------------
ye-luo wrote:
> Right now clang-offload-bundler is only used to create an object file for the host and a cubin file for the device.
> Then cubin files are passed to the nvlink.
> This is different from what you described
> ```
> clang-offload-bundler creates a device specific archive of cubin files.
> Such an archive is then passed to this tool to extract cubin files before passing to nvlink.
> ```
> Is this caused by changes in https://reviews.llvm.org/D105191?
> Do you have any reading materials which documents the whole linking flow of D105191?
Yes, this patch is required for D105191 to work correctly on nvptx.
Once this patch lands, I will update D105191 to call "clang-nvlink-wrapper" instead of "nvlink" in clang/lib/Driver/ToolChains/Cuda.cpp::NVPTX::OpenMPLinker::ConstructJob().

Greg Rodgers [[ https://www.youtube.com/watch?v=3FsYwEhtCaM | gave a talk on static device libraries ]] at last year's LLVM-CTH Workshop.

In summary, the clang driver generates the following commands to handle heterogeneous device libraries:
1. device-specific-archive.a <== clang-offload-bundler(heterogeneous-device-archive.a, current-device)
2. If (amdgpu)
           linked-output <== llvm-link(device-specific-archive.a)
3. If (nvptx)
           extracted-cubins.cubin <== nvlink-wrapper(device-specific-archive.a)
           linked-output <== nvlink(extracted-cubins.cubin)
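
To make the ordering concrete, here is a minimal sketch of the same flow as a small planner. It is only an illustration, not the driver's actual code: `Step` and `plan_device_link` are hypothetical names, the file names are the placeholders from the list above, and no real command-line flags are modelled.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    """One tool invocation in the flow (hypothetical model, no real flags)."""
    tool: str    # tool name as written in the summary above
    input: str   # placeholder name of the file the step consumes
    output: str  # placeholder name of the file the step produces


def plan_device_link(target: str) -> List[Step]:
    """Return the ordered steps for one device target: "amdgpu" or "nvptx"."""
    # Step 1 is common to both targets: unbundle the heterogeneous archive
    # into an archive for the current device only.
    steps = [
        Step("clang-offload-bundler",
             "heterogeneous-device-archive.a",
             "device-specific-archive.a"),
    ]
    if target == "amdgpu":
        # Step 2: the device-specific archive is linked directly.
        steps.append(Step("llvm-link",
                          "device-specific-archive.a",
                          "linked-output"))
    elif target == "nvptx":
        # Step 3: the wrapper first extracts the cubin files from the
        # archive, then nvlink links the extracted cubins.
        steps.append(Step("nvlink-wrapper",
                          "device-specific-archive.a",
                          "extracted-cubins.cubin"))
        steps.append(Step("nvlink",
                          "extracted-cubins.cubin",
                          "linked-output"))
    else:
        raise ValueError(f"unsupported target: {target}")
    return steps
```

Calling `plan_device_link("nvptx")` yields three steps, with the wrapper sitting between the bundler and nvlink; `plan_device_link("amdgpu")` yields two.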

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108291/new/

https://reviews.llvm.org/D108291