[PATCH] D146448: [CUDA] Update cached kernel handle when the function instance changes.

Daniele Castagna via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Mar 20 13:10:41 PDT 2023


dcastagna added a comment.

Thank you Artem!



================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:1201
+      return Loc->second;
+    // non-HIP compilation may end up with a different F and need to have
+    // handles and stubs updated.
----------------
F can differ because we sometimes first use an F with incomplete types as the key, and later use a different F (still for the same function) with the instantiated types. This usually happens after we encounter a statement that needs the size of the template parameter, such as a function call.
Another option would be to always instantiate the complete type for F when we call GetOrCreateLLVMFunction (IIRC), if the function has the __global__ attribute.

If you prefer the approach taken in this patch, it LGTM.
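
To illustrate the stale-entry problem described above, here is a minimal standalone sketch. It does not use clang's real types: `Function`, `Handle`, `KernelHandleCache` and `getHandle` are hypothetical stand-ins, and the cache is assumed to be keyed by the function's name. It only shows the general idea of refreshing the cached handle when the same name arrives with a different function instance, not the actual code in CGCUDANV.cpp.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a function instance. Two instances can share
// a name while differing in whether their types are complete.
struct Function {
  std::string Name;
  bool TypesComplete;
};

// Hypothetical stand-in for a kernel handle.
struct Handle {
  int Id;
};

class KernelHandleCache {
  struct Entry {
    const Function *F; // instance the handle was created for
    Handle H;
  };
  std::unordered_map<std::string, Entry> Entries;
  int NextId = 0;

public:
  Handle getHandle(const Function &F) {
    auto Loc = Entries.find(F.Name);
    // Same name and same instance: the cached handle is still valid.
    if (Loc != Entries.end() && Loc->second.F == &F)
      return Loc->second.H;
    // First request, or same name with a different instance:
    // create a fresh handle and overwrite any stale entry.
    Entry E{&F, Handle{NextId++}};
    Entries[F.Name] = E;
    return E.H;
  }
};
```

In this sketch, asking for the handle of an instance with incomplete types and then of a second instance with the same name yields two different handles, and the second one replaces the first in the cache.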




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146448/new/

https://reviews.llvm.org/D146448


