[llvm-branch-commits] [openmp] 9bc22aa - [OpenMP][CUDA] Refine the logic to determine grid size

Tom Stellard via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Wed Feb 16 10:52:21 PST 2022


Author: Shilei Tian
Date: 2022-02-16T10:51:45-08:00
New Revision: 9bc22aa5078020bd76d5e4150d39e929c63cc355

URL: https://github.com/llvm/llvm-project/commit/9bc22aa5078020bd76d5e4150d39e929c63cc355
DIFF: https://github.com/llvm/llvm-project/commit/9bc22aa5078020bd76d5e4150d39e929c63cc355.diff

LOG: [OpenMP][CUDA] Refine the logic to determine grid size

This patch refines the logic that determines the grid size, because the
previous method could skip the check of whether `CudaBlocksPerGrid` exceeds
the actual hardware limit.
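
The effect of moving the cap can be sketched in isolation. The following is a minimal illustration, not the plugin's code: `GridLimits`, `gridSizeBefore`, and `gridSizeAfter` are hypothetical names, and the `TeamNum <= 0` condition for the "no explicit request" branch is an assumption, since the diff context does not show that condition. Only the ordering of the cap relative to the other branches is taken from the diff.

```cpp
#include <cassert>

// Hypothetical stand-ins for the per-device values read in rtl.cpp.
struct GridLimits {
  int DefaultTeams; // plays the role of DeviceData[DeviceId].NumTeams
  int MaxBlocks;    // plays the role of DeviceData[DeviceId].BlocksPerGrid
};

// Old shape: the cap is an `else if`, so it only ever sees a requested
// TeamNum. A value chosen in the first branch is never compared to the limit.
int gridSizeBefore(int TeamNum, const GridLimits &Limits) {
  int Blocks;
  if (TeamNum <= 0) { // assumption: no explicit team count was requested
    Blocks = Limits.DefaultTeams;
  } else if (TeamNum > Limits.MaxBlocks) {
    Blocks = Limits.MaxBlocks;
  } else {
    Blocks = TeamNum;
  }
  return Blocks;
}

// New shape: choose a value first, then apply the cap unconditionally.
int gridSizeAfter(int TeamNum, const GridLimits &Limits) {
  int Blocks = (TeamNum <= 0) ? Limits.DefaultTeams : TeamNum;
  if (Blocks > Limits.MaxBlocks)
    Blocks = Limits.MaxBlocks;
  assert(Blocks <= Limits.MaxBlocks && "cap must hold on every path");
  return Blocks;
}
```

With a hypothetical default of 128 teams and a limit of 64 blocks, `gridSizeBefore(0, ...)` returns 128 while `gridSizeAfter(0, ...)` returns 64. Both versions agree whenever a positive team count is requested.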

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D119311

(cherry picked from commit f6685f774697c85d6a352dcea013f46a99f9fe31)

Added: 
    

Modified: 
    openmp/libomptarget/plugins/cuda/src/rtl.cpp

Removed: 
    


################################################################################
diff  --git a/openmp/libomptarget/plugins/cuda/src/rtl.cpp b/openmp/libomptarget/plugins/cuda/src/rtl.cpp
index e17593878b7c7..0ca05f0ec3a0f 100644
--- a/openmp/libomptarget/plugins/cuda/src/rtl.cpp
+++ b/openmp/libomptarget/plugins/cuda/src/rtl.cpp
@@ -1170,15 +1170,17 @@ class DeviceRTLTy {
         DP("Using default number of teams %d\n", DeviceData[DeviceId].NumTeams);
         CudaBlocksPerGrid = DeviceData[DeviceId].NumTeams;
       }
-    } else if (TeamNum > DeviceData[DeviceId].BlocksPerGrid) {
-      DP("Capping number of teams to team limit %d\n",
-         DeviceData[DeviceId].BlocksPerGrid);
-      CudaBlocksPerGrid = DeviceData[DeviceId].BlocksPerGrid;
     } else {
       DP("Using requested number of teams %d\n", TeamNum);
       CudaBlocksPerGrid = TeamNum;
     }
 
+    if (CudaBlocksPerGrid > DeviceData[DeviceId].BlocksPerGrid) {
+      DP("Capping number of teams to team limit %d\n",
+         DeviceData[DeviceId].BlocksPerGrid);
+      CudaBlocksPerGrid = DeviceData[DeviceId].BlocksPerGrid;
+    }
+
     INFO(OMP_INFOTYPE_PLUGIN_KERNEL, DeviceId,
          "Launching kernel %s with %d blocks and %d threads in %s mode\n",
          (getOffloadEntry(DeviceId, TgtEntryPtr))


        

