[PATCH] D112492: [CUDA][HIP] Allow comdat for kernels

Reid Kleckner via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Nov 9 15:27:32 PST 2021


rnk added a comment.

In D112492#3119892 <https://reviews.llvm.org/D112492#3119892>, @tra wrote:

> Yes, we do need to merge identical functions with **identical names** for templates.
>
> The comdat-folding issue is different. IIUC, it allows merging two functions with identical code and **different names** into one function with two names. That will break CUDA, as we need each stub to have a unique address, which we use to find the matching GPU-side kernel.

Well, yes, ICF breaks function pointer identity. There's no way around that, and it is documented:
https://docs.microsoft.com/en-us/cpp/build/reference/opt-optimizations?view=msvc-170
CUDA users will have to remove /OPT:ICF from their linker flags.
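
To make the failure mode concrete, here is a minimal sketch of an address-keyed lookup. It is not the actual CUDA runtime code: the stub names, `stub_registry`, `register_stub`, and `lookup_kernel` are all hypothetical and only model the idea that the host stub's address is the key for finding the device-side kernel.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using Stub = void (*)();

// Hypothetical host-side stubs. Their bodies are identical, so a linker
// running identical-code folding may give both names the same address.
void stub_kernel_a() {}
void stub_kernel_b() {}

// Hypothetical registry: host stub address -> device kernel name.
std::unordered_map<Stub, std::string>& stub_registry() {
  static std::unordered_map<Stub, std::string> registry;
  return registry;
}

void register_stub(Stub stub, const std::string& kernel_name) {
  stub_registry()[stub] = kernel_name;
}

const std::string& lookup_kernel(Stub stub) {
  return stub_registry().at(stub);
}

// These checks hold only while the two stubs keep distinct addresses.
// If the stubs were folded, the second registration would overwrite the
// first and the lookup for stub_kernel_a would return "kernel_b".
void check_stub_identity() {
  register_stub(&stub_kernel_a, "kernel_a");
  register_stub(&stub_kernel_b, "kernel_b");
  assert(lookup_kernel(&stub_kernel_a) == "kernel_a");
  assert(lookup_kernel(&stub_kernel_b) == "kernel_b");
}
```

With folding disabled, `check_stub_identity()` passes; linking the same code with /OPT:ICF is the configuration in which the two stubs may collapse into one address.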

Maybe you could make this work by embedding an ICF-breaking device into all the stubs. Something like a volatile asm blob that takes the current function as an argument and puts it in a register.
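
A rough sketch of what such a device could look like, assuming GNU-style extended inline asm (supported by GCC and Clang, not by MSVC's cl.exe). The `ICF_BREAK` macro and the stub names are hypothetical, and whether this actually prevents a given linker from folding the stubs is untested here: a linker may still treat two self-references as equivalent.

```cpp
#include <cassert>

// Hypothetical helper: an empty volatile asm statement that takes the
// address of `fn` as a register input. The compiler must materialize the
// address, so each stub body refers to a different symbol.
// Assumes GNU-style extended asm (GCC or Clang).
#define ICF_BREAK(fn) __asm__ __volatile__("" : : "r"(&(fn)))

// Hypothetical host-side stubs, each passing its own address to the asm.
void stub_kernel_a() { ICF_BREAK(stub_kernel_a); }
void stub_kernel_b() { ICF_BREAK(stub_kernel_b); }

// Returns true when the two stubs still have distinct addresses.
bool stubs_have_distinct_addresses() {
  stub_kernel_a();
  stub_kernel_b();
  return &stub_kernel_a != &stub_kernel_b;
}
```

The asm body is empty on purpose: the goal is only to make the stub bodies differ in the symbol they reference, not to add any runtime work.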


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112492/new/

https://reviews.llvm.org/D112492


