[PATCH] D133726: [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU
Joseph Huber via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Sep 12 16:22:25 PDT 2022
jhuber6 added a comment.
In D133726#3785040 <https://reviews.llvm.org/D133726#3785040>, @JonChesterfield wrote:
> We can do this but should expect an increase in code size from having multiple internalised copies of the same function. There may be an incidental benefit if we can specialise some functions to the call site without additional cloning. Address of the same functions from different TUs will be inequal, which is wrong, but probably doesn't matter in practice.
>
> It does have the major advantage that mlink-builtin-bitcode patches up the invalid IR on the fly, which is likely easier than changing the device libs or making IR mcpu-agnostic.
It will probably decrease code size in the final executable, since this will forcefully internalize all the `protected` functions in `ocml.bc` that were sticking around because LTO couldn't remove them due to their visibility.
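As a rough illustration of the address-inequality point in the quoted comment, here is a minimal single-file sketch. It is not taken from the patch: the namespaces `tu_a` and `tu_b` are hypothetical stand-ins for two translation units, and `lib_scale` is a made-up placeholder for a device library function that each TU ends up with as its own internal copy.

```cpp
#include <cassert>

// Function pointer type for the hypothetical library function.
using ScaleFn = double (*)(double);

// Stand-in for the first TU: it holds its own internal-linkage copy of the
// function, roughly what per-TU linking plus internalization would produce.
namespace tu_a {
static double lib_scale(double x) { return 2.0 * x; }
ScaleFn address_of_lib_scale() { return &lib_scale; }
} // namespace tu_a

// Stand-in for the second TU, with an identical but separate internal copy.
namespace tu_b {
static double lib_scale(double x) { return 2.0 * x; }
ScaleFn address_of_lib_scale() { return &lib_scale; }
} // namespace tu_b

// True only if both "TUs" hand out the same function address.
bool same_address() {
  return tu_a::address_of_lib_scale() == tu_b::address_of_lib_scale();
}

// True if both copies compute the same result for the given input.
bool same_behaviour(double x) {
  return tu_a::address_of_lib_scale()(x) == tu_b::address_of_lib_scale()(x);
}
```

Both copies behave identically, but they are distinct functions, so comparing their addresses should yield false. Once a copy is internal and unreferenced it can simply be dropped, which is the code size effect described above.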
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D133726/new/
https://reviews.llvm.org/D133726