[PATCH] D133726: [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU
Jon Chesterfield via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Sep 12 15:39:28 PDT 2022
JonChesterfield added a comment.
We can do this, but we should expect an increase in code size from having multiple internalised copies of the same function. There may be an incidental benefit if we can specialise some functions to their call sites without additional cloning. The address of the same function taken in different TUs will compare unequal, which is wrong, but probably doesn't matter in practice.
It does have the major advantage that -mlink-builtin-bitcode patches up the invalid IR on the fly, which is likely easier than changing the device libs or making the IR mcpu-agnostic.
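To illustrate the address-inequality point, here is a minimal single-file sketch. It assumes that per-TU linking leaves each translation unit with its own internal-linkage copy of a library function; the namespaces tu_a and tu_b stand in for two translation units, and device_helper, helper_address and addresses_differ are hypothetical names, not part of the ROCm device libraries or of this patch.

```cpp
// Hypothetical sketch: two internalised copies of one library function.
using HelperFn = int (*)(int);

// Stand-in for translation unit A after the library bitcode has been
// linked in and its functions given internal linkage.
namespace tu_a {
static int device_helper(int x) { return x + 1; }
HelperFn helper_address() { return &device_helper; }
}  // namespace tu_a

// Stand-in for translation unit B, holding its own identical copy.
namespace tu_b {
static int device_helper(int x) { return x + 1; }
HelperFn helper_address() { return &device_helper; }
}  // namespace tu_b

// The two copies are distinct functions, so their addresses compare
// unequal even though the source-level function is "the same".
bool addresses_differ() {
  return tu_a::helper_address() != tu_b::helper_address();
}

// Both copies still compute the same result.
bool results_agree(int x) {
  return tu_a::helper_address()(x) == tu_b::helper_address()(x);
}
```

Code that only calls the function is unaffected, since both copies behave identically. Only code that compares or keys on the function's address across TUs would observe the difference, and linkers that fold identical code could hide it again.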
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D133726/new/
https://reviews.llvm.org/D133726