[PATCH] D112492: [CUDA][HIP] Allow comdat for kernels
Reid Kleckner via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Nov 9 15:27:32 PST 2021
rnk added a comment.
In D112492#3119892 <https://reviews.llvm.org/D112492#3119892>, @tra wrote:
> Yes, we do need to merge identical functions with **identical names** for templates.
>
> The comdat-folding issue is different. IIUIC, it allows merging two functions with identical code and **different names** into one function with two names. That will break CUDA, as we need each stub to have a unique address, which we use to find the matching GPU-side kernel.
Well, yes, ICF breaks function pointer identity. There's no way around that, and it is documented:
https://docs.microsoft.com/en-us/cpp/build/reference/opt-optimizations?view=msvc-170
CUDA users will have to remove /OPT:ICF from their linker flags.
Maybe you could make this work by embedding an ICF-breaking device into all the stubs. Something like a volatile asm blob that takes the current function as an argument and puts it in a register.
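A rough sketch of what such an ICF-breaking device might look like, written as plain host C++ rather than real CUDA stub code. The names `stub_a`, `stub_b`, and `stubs_have_distinct_addresses` are hypothetical, and the asm uses GCC/Clang extended asm syntax. Whether this actually defeats a given linker's ICF is untested here: some ICF implementations compare relocation targets by equivalence class, so they might still fold two self-referencing bodies that are otherwise identical.

```cpp
// Hypothetical stand-ins for two host-side kernel stubs whose bodies would
// otherwise be identical. Each stub feeds its own address into an empty
// volatile asm as a register input, so the compiler has to materialize the
// address and each body ends up referencing a different symbol.
extern "C" void stub_a() {
  __asm__ volatile("" : : "r"(&stub_a));  // empty asm; the operand is the current function
}

extern "C" void stub_b() {
  __asm__ volatile("" : : "r"(&stub_b));  // same shape, different symbol
}

// The property the CUDA runtime depends on: each stub has a unique address
// that can serve as a key for finding the matching GPU-side kernel.
bool stubs_have_distinct_addresses() {
  return &stub_a != &stub_b;
}
```

Without ICF the two addresses differ anyway, so the check only becomes interesting when the binary is linked with /OPT:ICF (or an equivalent option) enabled.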
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D112492/new/
https://reviews.llvm.org/D112492