[all-commits] [llvm/llvm-project] 33a6ce: [HIP] Allow partial linking for `-fgpu-rdc` (#81700)

Yaxun (Sam) Liu via All-commits all-commits at lists.llvm.org
Thu Feb 22 10:51:42 PST 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 33a6ce18373ffd1457ebd54e930b6f02fe4c39c1
      https://github.com/llvm/llvm-project/commit/33a6ce18373ffd1457ebd54e930b6f02fe4c39c1
  Author: Yaxun (Sam) Liu <yaxun.liu at amd.com>
  Date:   2024-02-22 (Thu, 22 Feb 2024)

  Changed paths:
    M clang/lib/CodeGen/CGCUDANV.cpp
    M clang/lib/CodeGen/CodeGenModule.cpp
    M clang/lib/Driver/OffloadBundler.cpp
    M clang/lib/Driver/ToolChains/HIPUtility.cpp
    M clang/test/CMakeLists.txt
    M clang/test/CodeGenCUDA/device-stub.cu
    M clang/test/CodeGenCUDA/host-used-device-var.cu
    A clang/test/Driver/Inputs/hip.h
    M clang/test/Driver/clang-offload-bundler.c
    A clang/test/Driver/hip-partial-link.hip
    M clang/test/Driver/hip-toolchain-rdc.hip

  Log Message:
  -----------
  [HIP] Allow partial linking for `-fgpu-rdc` (#81700)

`-fgpu-rdc` mode allows device functions call device functions in
different TU. However, currently all device objects have to be linked
together since only one fat binary is supported. This is time consuming
for AMDGPU backend since it only supports LTO.

There are use cases that objects can be divided into groups in which
device functions are self-contained but host functions are not. It is
desirable to link/optimize/codegen the device code and generate a fatbin
for each group, whereas partially link the host code with `ld -r` or
generate a static library by using the `--emit-static-lib` option of
clang. This avoids linking all device code together, therefore decreases
the linking time for `-fgpu-rdc`.

Previously, clang emits an external symbol `__hip_fatbin` for all
objects for `-fgpu-rdc`. With this patch, clang emits an unique external
symbol `__hip_fatbin_{cuid}` for the fat binary for each object. When a
group of objects are linked together to generate a fatbin, the symbols
are merged by alias and point to the same fat binary. Each group has its
own fat binary. One executable or shared library can have multiple fat
binaries. Device linking is done for undefined fab binary symbols only
to avoid repeated linking. `__hip_gpubin_handle` is also uniquefied and
merged to avoid repeated registering. Symbol `__hip_cuid_{cuid}` is
introduced to facilitate debugging and tooling.

Fixes: https://github.com/llvm/llvm-project/issues/77018



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list