[all-commits] [llvm/llvm-project] dc6a0b: [HIP] Align device binary

Yaxun (Sam) Liu via All-commits all-commits at lists.llvm.org
Fri Oct 2 15:12:19 PDT 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: dc6a0b0ec7e3d72a4cc849af4e4aa6c6a29a53d2
      https://github.com/llvm/llvm-project/commit/dc6a0b0ec7e3d72a4cc849af4e4aa6c6a29a53d2
  Author: Yaxun (Sam) Liu <yaxun.liu at amd.com>
  Date:   2020-10-02 (Fri, 02 Oct 2020)

  Changed paths:
    M clang/lib/CodeGen/CGCUDANV.cpp
    M clang/lib/Driver/ToolChains/HIP.cpp
    M clang/test/CodeGenCUDA/device-stub.cu
    M clang/test/Driver/clang-offload-bundler.c
    M clang/test/Driver/hip-toolchain-no-rdc.hip
    M clang/test/Driver/hip-toolchain-rdc.hip
    M clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp

  Log Message:
  -----------
  [HIP] Align device binary

To load device binaries faster and share them among processes, the HIP runtime
prefers that they be aligned at 4096 bytes. The HIP runtime can still load
unaligned device binaries, but aligning them at 4096 bytes results in faster
loading and less shared memory usage.

This patch adds an option -bundle-align to clang-offload-bundler, which allows
bundles to be aligned at a specified alignment. The default is 1, which is NFC
(no functional change) compared to the existing format.
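
As a rough illustration of what aligning bundles means, the sketch below
rounds each bundle's start offset up to a multiple of the requested alignment.
It is not the code from this patch: roundUpTo and bundleOffsets are
hypothetical names, and the header size and bundle sizes are made-up inputs.
With an alignment of 1 the offsets are simply packed back to back, which is
why the default leaves the existing layout unchanged.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: round Offset up to the next multiple of Alignment.
static uint64_t roundUpTo(uint64_t Offset, uint64_t Alignment) {
  assert(Alignment > 0 && "alignment must be nonzero");
  return (Offset + Alignment - 1) / Alignment * Alignment;
}

// Hypothetical helper: given the size of a leading header and the size of
// each bundle, compute the start offset of each bundle so that every start
// is a multiple of Alignment. Padding is assumed to fill the gaps.
static std::vector<uint64_t> bundleOffsets(uint64_t HeaderSize,
                                           const std::vector<uint64_t> &Sizes,
                                           uint64_t Alignment) {
  std::vector<uint64_t> Offsets;
  uint64_t Cur = HeaderSize;
  for (uint64_t Size : Sizes) {
    Cur = roundUpTo(Cur, Alignment); // skip padding up to the boundary
    Offsets.push_back(Cur);
    Cur += Size;                     // next bundle starts after this one
  }
  return Offsets;
}
```

For example, with a 100-byte header and bundles of 5000 and 10 bytes, an
alignment of 4096 places the bundles at offsets 4096 and 12288, while an
alignment of 1 places them at 100 and 5100. Each bundle adds at most
Alignment - 1 bytes of padding.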

This patch then aligns the embedded fat binary and the device binary inside the
fat binary at 4096 bytes.

It has been verified that this change does not cause a significant overall file
size increase for typical HIP applications (less than 1%).

Differential Revision: https://reviews.llvm.org/D88734



