[clang] [clang-repl][CUDA] Move CUDA module registration to beginning of global_ctors (PR #66658)

Mon Sep 18 13:22:28 PDT 2023

================
@@ -794,7 +794,7 @@ void CodeGenModule::Release() {
       AddGlobalCtor(ObjCInitFunction);
   if (Context.getLangOpts().CUDA && CUDARuntime) {
     if (llvm::Function *CudaCtorFunction = CUDARuntime->finalizeModule())
-      AddGlobalCtor(CudaCtorFunction);
+      AddGlobalCtor(CudaCtorFunction, /*Priority=*/0);
----------------
Artem-B wrote:

This is a very contrived example. While I agree that it currently does not work with CUDA, I am still not convinced that it is a problem that needs to be solved in clang.

Let's assume you've set the priority at X. Launching kernels from dynamic initializers with higher priority will still be broken, so the patch does not solve the problem conceptually.

If you set the priority of CUDA kernel initializers at the highest level (is that the ntent of priority=0?), can you guarantee that kernel registration never depends on anything else that was expected to get initialized before it? We also no longer have *any* wiggle room to run anything before kernel registration when we need to.

@MaskRay Fangrui, WDYT about bumping dynamic initializer priority in principle? Is there anything else we need to worry about?


https://github.com/llvm/llvm-project/pull/66658