[PATCH] D133726: [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU

Joseph Huber via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Sep 12 16:22:25 PDT 2022


jhuber6 added a comment.

In D133726#3785040 <https://reviews.llvm.org/D133726#3785040>, @JonChesterfield wrote:

> We can do this, but we should expect an increase in code size from having multiple internalised copies of the same function. There may be an incidental benefit if we can specialise some functions to the call site without additional cloning. The addresses of the same function taken in different TUs will be unequal, which is wrong, but probably doesn't matter in practice.
>
> It does have the major advantage that `-mlink-builtin-bitcode` patches up the invalid IR on the fly, which is likely easier than changing the device libs or making the IR mcpu-agnostic.

It will probably decrease code size in the final executable, since this now forcefully internalizes all the `protected` functions in `ocml.bc`. Those were sticking around because LTO couldn't remove them due to their visibility.
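
As a rough illustration of the trade-off being discussed (not code from the patch), the sketch below models per-TU internalization in a single C++ file. The namespaces `tu_a` and `tu_b` and the function `device_lib_fn` are hypothetical stand-ins for two translation units that each received their own internal copy of the same device library function. Each copy behaves identically, but the two copies are distinct functions, so their addresses compare unequal.

```cpp
#include <cassert>

using Fn = double (*)(double);

// Hypothetical stand-in for the first TU: it holds its own internal-linkage
// copy of the library function, as if the bitcode had been linked and
// internalized into this TU.
namespace tu_a {
static double device_lib_fn(double x) { return 2.0 * x; }
Fn get() { return &device_lib_fn; }
} // namespace tu_a

// Hypothetical stand-in for the second TU with a separate internal copy.
namespace tu_b {
static double device_lib_fn(double x) { return 2.0 * x; }
Fn get() { return &device_lib_fn; }
} // namespace tu_b

// True only if both "TUs" hand out the same function address.
bool same_address() { return tu_a::get() == tu_b::get(); }

// True if both copies compute the same value for the given input.
bool same_result(double x) { return tu_a::get()(x) == tu_b::get()(x); }

// The copies agree on behaviour but are distinct functions.
void check_internalized_copies() {
  assert(same_result(3.0));
  assert(!same_address());
}
```

Calling `check_internalized_copies()` from a `main` exercises both properties. Because each copy is internal, an optimizer is free to drop or specialise whichever copies a given TU does not need, which is the code-size effect described above.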


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133726/new/

https://reviews.llvm.org/D133726
