[llvm] [offload] Add properties parameter to olLaunchKernel (PR #184343)

Kevin Sala Penades via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 4 15:06:55 PST 2026


================
@@ -1495,9 +1502,31 @@ Error CUDAKernelTy::launchImpl(GenericDeviceTy &GenericDevice,
     MaxDynCGroupMemLimit = MaxDynCGroupMem;
   }
 
-  CUresult Res = cuLaunchKernel(Func, NumBlocks[0], NumBlocks[1], NumBlocks[2],
-                                NumThreads[0], NumThreads[1], NumThreads[2],
-                                MaxDynCGroupMem, Stream, nullptr, Config);
+  // Validate cooperative launch requirements if needed
+  if (KernelArgs.Flags.Cooperative) {
+    CUDADeviceTy &CUDADevice = static_cast<CUDADeviceTy &>(GenericDevice);
+
+    uint32_t SupportsCooperative = 0;
+    if (auto Err = CUDADevice.getDeviceAttr(
+            CU_DEVICE_ATTRIBUTE_COOPERATIVE_LAUNCH, SupportsCooperative))
+      return Err;
+
+    if (!SupportsCooperative)
+      return Plugin::error(ErrorCode::UNSUPPORTED,
+                           "Device does not support cooperative launch");
+  }
+
+  CUlaunchAttribute CoopAttr = {
+      CU_LAUNCH_ATTRIBUTE_COOPERATIVE,
+      {.cooperative = KernelArgs.Flags.Cooperative ? 1 : 0}};
+
+  CUlaunchConfig LaunchConfig = {NumBlocks[0],    NumBlocks[1],
+                                 NumBlocks[2],    NumThreads[0],
+                                 NumThreads[1],   NumThreads[2],
----------------
kevinsala wrote:

I believe you reversed the order of blocks and threads. Please see: https://docs.nvidia.com/cuda/cuda-driver-api/structCUlaunchConfig.html#structCUlaunchConfig
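
One way to sidestep the ordering question entirely is to assign each field by name instead of relying on positional aggregate initialization. The following is only a sketch: `LaunchConfigSketch` and `makeLaunchConfig` are hypothetical names introduced here so the example builds without `cuda.h`. The field names follow the `CUlaunchConfig` documentation linked above, and the stand-in's declaration order is illustrative, not a statement about the real header.

```cpp
#include <cstdint>

// Hypothetical stand-in for CUlaunchConfig so the sketch builds without cuda.h.
// Field names follow the CUlaunchConfig documentation; the declaration order
// here is illustrative only and should be checked against the real header.
struct LaunchConfigSketch {
  unsigned int gridDimX = 0, gridDimY = 0, gridDimZ = 0;    // grid size, in blocks
  unsigned int blockDimX = 0, blockDimY = 0, blockDimZ = 0; // block size, in threads
  unsigned int sharedMemBytes = 0; // dynamic shared memory per block
};

// Hypothetical helper: every field is assigned by name, so the result does not
// depend on the order in which the struct declares its members.
inline LaunchConfigSketch makeLaunchConfig(const uint32_t (&NumBlocks)[3],
                                           const uint32_t (&NumThreads)[3],
                                           uint32_t DynSharedMemBytes) {
  LaunchConfigSketch Config;
  Config.gridDimX = NumBlocks[0];
  Config.gridDimY = NumBlocks[1];
  Config.gridDimZ = NumBlocks[2];
  Config.blockDimX = NumThreads[0];
  Config.blockDimY = NumThreads[1];
  Config.blockDimZ = NumThreads[2];
  Config.sharedMemBytes = DynSharedMemBytes;
  return Config;
}
```

With the real `CUlaunchConfig`, the same by-name assignments would apply, and the remaining members (stream and launch attributes) would be set the same way.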

https://github.com/llvm/llvm-project/pull/184343

