[PATCH] D146448: [CUDA] Update cached kernel handle when the function instance changes.
Daniele Castagna via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Mar 20 13:10:41 PDT 2023
dcastagna added a comment.
Thank you Artem!
================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:1201
+ return Loc->second;
+ // non-HIP compilation may end up with a different F and need to have
+ // handles and stubs updated.
----------------
F differs because we sometimes first use an F with incomplete types as the key, and later a different F (even though it comes from the same function) with the instantiated types. This usually happens after we encounter a statement where the size of the template parameter is needed, such as a function call.
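To make the stale-entry problem concrete, here is a minimal toy model of it. This is not the code in CGCUDANV.cpp: ToyFunction, ToyHandle and ToyHandleCache are hypothetical names made up for illustration, and keying the cache by name is an assumption of the sketch. It only shows the general idea of refreshing a cached handle when a second instance of the same function shows up.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a function instance; this is NOT llvm::Function.
struct ToyFunction {
  std::string Name;
  bool HasCompleteTypes;
};

// Hypothetical stand-in for a kernel handle that remembers which
// function instance it was created for.
struct ToyHandle {
  const ToyFunction *Stub;
};

class ToyHandleCache {
  // Assumption of this sketch: entries are keyed by function name, so two
  // instances of the same function land on the same entry.
  std::unordered_map<std::string, ToyHandle> Handles;

public:
  // Returns the handle for F. If an entry exists but was created for a
  // different instance with the same name, the entry is updated in place.
  const ToyHandle &getOrUpdate(const ToyFunction &F) {
    auto Loc = Handles.find(F.Name);
    if (Loc != Handles.end()) {
      if (Loc->second.Stub == &F)
        return Loc->second;    // same instance: plain cache hit
      Loc->second.Stub = &F;   // different instance: refresh the entry
      return Loc->second;
    }
    return Handles.emplace(F.Name, ToyHandle{&F}).first->second;
  }
};
```

In this model, looking up an early instance (incomplete types) and then a later instance with the same name leaves the cache pointing at the later one, instead of returning a handle tied to the stale instance.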
Another option could be to always instantiate the complete type for F when we call GetOrCreateLLVMFunction (IIRC) if the function has a __global__ attribute.
If you prefer this solution, the patch LGTM.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D146448/new/
https://reviews.llvm.org/D146448