[llvm] [Offload] Allow CUDA Kernels to use arbitrarily large shared memory (PR #145963)
Joseph Huber via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 1 09:31:02 PDT 2025
================
@@ -1302,6 +1305,16 @@ Error CUDAKernelTy::launchImpl(GenericDeviceTy &GenericDevice,
if (GenericDevice.getRPCServer())
GenericDevice.Plugin.getRPCServer().Thread->notify();
+ // In case we require more memory than the current limit.
+ if (MaxDynCGroupMem >= MaxDynCGroupMemLimit) {
----------------
jhuber6 wrote:
This is part of why I think we should remove more OpenMP logic from the core library. It would be better to make this guarantee at the `libomptarget` layer, since we likely want to maintain opt-in behavior. Worst case, we use yet another environment variable.
https://github.com/llvm/llvm-project/pull/145963