[llvm] [offload] Add properties parameter to olLaunchKernel (PR #184343)

Łukasz Plewa via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 4 09:50:26 PST 2026


================
@@ -1512,6 +1566,71 @@ Error CUDAKernelTy::launchImpl(GenericDeviceTy &GenericDevice,
   return Plugin::check(Res, "error in cuLaunchKernel for '%s': %s", getName());
 }
 
+Expected<uint32_t> CUDAKernelTy::getMaxCooperativeGroupCount(
+    GenericDeviceTy &GenericDevice, uint32_t WorkDim,
+    const size_t *LocalWorkSize, size_t DynamicSharedMemorySize) const {
+  CUDADeviceTy &CUDADevice = static_cast<CUDADeviceTy &>(GenericDevice);
+
+  uint32_t SupportsCooperative = 0;
+  if (auto Err = CUDADevice.getDeviceAttr(
+          CU_DEVICE_ATTRIBUTE_COOPERATIVE_LAUNCH, SupportsCooperative))
+    return Err;
+
+  if (!SupportsCooperative) {
+    return Plugin::error(ErrorCode::UNSUPPORTED,
+                         "Device does not support cooperative launch");
+  }
+
+  // Calculate total local work size
+  size_t LocalWorkSizeTotal = LocalWorkSize[0];
+  LocalWorkSizeTotal *= (WorkDim >= 2 ? LocalWorkSize[1] : 1);
+  LocalWorkSizeTotal *= (WorkDim == 3 ? LocalWorkSize[2] : 1);
+
+  // Query max active blocks per multiprocessor
+  int MaxNumActiveGroupsPerCU = 0;
+  CUresult Res = cuOccupancyMaxActiveBlocksPerMultiprocessor(
+      &MaxNumActiveGroupsPerCU, Func, LocalWorkSizeTotal,
+      DynamicSharedMemorySize);
+  if (auto Err = Plugin::check(
+          Res, "error in cuOccupancyMaxActiveBlocksPerMultiprocessor: %s"))
+    return Err;
+
+  assert(MaxNumActiveGroupsPerCU >= 0);
----------------
lplewa wrote:

The CUDA API declares this value as `int`. According to the documentation it should never be negative, so we can treat it as unsigned. I added the assert only to document that assumption. If I really expected a negative value, there would be an `if (MaxNumActiveGroupsPerCU < 0) return error;` check instead.
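
To make the distinction concrete, here is a minimal sketch of the two options. The helper names `toGroupCountAsserted` and `toGroupCountChecked` are hypothetical and not part of this patch, and `std::optional` stands in for the plugin's real error type only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helpers, not part of the patch. Both convert the signed count
// reported by an occupancy query into an unsigned value.

// Option 1: document the "never negative" assumption with an assert.
// A violation is treated as a programming error, not a runtime condition.
inline uint32_t toGroupCountAsserted(int MaxActiveGroups) {
  assert(MaxActiveGroups >= 0 && "occupancy query returned a negative count");
  return static_cast<uint32_t>(MaxActiveGroups);
}

// Option 2: treat a negative value as an expected, recoverable failure.
// std::nullopt stands in for returning an error to the caller.
inline std::optional<uint32_t> toGroupCountChecked(int MaxActiveGroups) {
  if (MaxActiveGroups < 0)
    return std::nullopt;
  return static_cast<uint32_t>(MaxActiveGroups);
}
```

The patch takes the first option because a negative value is not a case the caller is expected to handle.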

https://github.com/llvm/llvm-project/pull/184343

