chsigg wrote:

> Would it be possible to lazy load on first use?

Yes, I reverted the eager loading in the runtime. Instead, one can set the `CUDA_MODULE_LOADING=EAGER` environment variable to force eager loading.

https://github.com/llvm/llvm-project/pull/135478
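
For anyone who wants eager loading back, here is a minimal sketch of launching a process with `CUDA_MODULE_LOADING=EAGER` set. The helper name `run_with_eager_loading` and the stand-in child command are hypothetical and not part of the PR; the assumption is that the variable has to be in the environment before the CUDA driver is initialized in that process, so it is set on the child's environment at launch.

```python
import os
import subprocess
import sys


def run_with_eager_loading(cmd):
    """Run cmd with CUDA_MODULE_LOADING=EAGER in the child's environment.

    Hypothetical helper for illustration; not part of the PR.
    """
    env = dict(os.environ)  # copy, so the parent environment is left untouched
    env["CUDA_MODULE_LOADING"] = "EAGER"
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


# Stand-in child process (assumption): it only echoes the variable back.
# Replace it with the actual binary that links against the CUDA runtime wrappers.
child = [sys.executable, "-c", "import os; print(os.environ['CUDA_MODULE_LOADING'])"]
result = run_with_eager_loading(child)
print(result.stdout.strip())
```

Without the variable, the runtime falls back to whatever module loading mode the installed CUDA driver uses by default.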