[llvm-branch-commits] [openmp] 9bc22aa - [OpenMP][CUDA] Refine the logic to determine grid size
Tom Stellard via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Wed Feb 16 10:52:21 PST 2022
Author: Shilei Tian
Date: 2022-02-16T10:51:45-08:00
New Revision: 9bc22aa5078020bd76d5e4150d39e929c63cc355
URL: https://github.com/llvm/llvm-project/commit/9bc22aa5078020bd76d5e4150d39e929c63cc355
DIFF: https://github.com/llvm/llvm-project/commit/9bc22aa5078020bd76d5e4150d39e929c63cc355.diff
LOG: [OpenMP][CUDA] Refine the logic to determine grid size
This patch refines the logic to determine the grid size, as the previous
method could skip the check of whether `CudaBlocksPerGrid` exceeds the actual
hardware limit.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D119311
(cherry picked from commit f6685f774697c85d6a352dcea013f46a99f9fe31)
Added:
Modified:
openmp/libomptarget/plugins/cuda/src/rtl.cpp
Removed:
################################################################################
diff --git a/openmp/libomptarget/plugins/cuda/src/rtl.cpp b/openmp/libomptarget/plugins/cuda/src/rtl.cpp
index e17593878b7c7..0ca05f0ec3a0f 100644
--- a/openmp/libomptarget/plugins/cuda/src/rtl.cpp
+++ b/openmp/libomptarget/plugins/cuda/src/rtl.cpp
@@ -1170,15 +1170,17 @@ class DeviceRTLTy {
         DP("Using default number of teams %d\n", DeviceData[DeviceId].NumTeams);
         CudaBlocksPerGrid = DeviceData[DeviceId].NumTeams;
       }
-    } else if (TeamNum > DeviceData[DeviceId].BlocksPerGrid) {
-      DP("Capping number of teams to team limit %d\n",
-         DeviceData[DeviceId].BlocksPerGrid);
-      CudaBlocksPerGrid = DeviceData[DeviceId].BlocksPerGrid;
     } else {
       DP("Using requested number of teams %d\n", TeamNum);
       CudaBlocksPerGrid = TeamNum;
     }
 
+    if (CudaBlocksPerGrid > DeviceData[DeviceId].BlocksPerGrid) {
+      DP("Capping number of teams to team limit %d\n",
+         DeviceData[DeviceId].BlocksPerGrid);
+      CudaBlocksPerGrid = DeviceData[DeviceId].BlocksPerGrid;
+    }
+
     INFO(OMP_INFOTYPE_PLUGIN_KERNEL, DeviceId,
          "Launching kernel %s with %d blocks and %d threads in %s mode\n",
          (getOffloadEntry(DeviceId, TgtEntryPtr))
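
For illustration only, the control-flow change in this hunk can be reduced to
a standalone sketch. This is not the plugin code: `DeviceLimits`,
`oldGridSize` and `newGridSize` are hypothetical names, and the default path
is simplified to the single `NumTeams` assignment visible in the context
lines above. It shows why moving the cap after the branch chain means no
branch can leave a value above `BlocksPerGrid`.

```cpp
// Hypothetical stand-in for the per-device data; not the real plugin type.
struct DeviceLimits {
  int NumTeams;      // default number of teams
  int BlocksPerGrid; // hardware limit on the grid size
};

// Old shape: the cap is one branch of the chain, so it is only
// evaluated when an explicit team count was requested.
int oldGridSize(int TeamNum, const DeviceLimits &D) {
  if (TeamNum <= 0)
    return D.NumTeams; // never compared against D.BlocksPerGrid
  if (TeamNum > D.BlocksPerGrid)
    return D.BlocksPerGrid;
  return TeamNum;
}

// New shape: choose a value first, then clamp it unconditionally.
int newGridSize(int TeamNum, const DeviceLimits &D) {
  int Blocks = (TeamNum <= 0) ? D.NumTeams : TeamNum;
  if (Blocks > D.BlocksPerGrid)
    Blocks = D.BlocksPerGrid;
  return Blocks;
}
```

With an assumed limit of 1024 and an assumed default of 4096, the old shape
returns 4096 when no team count is requested, while the new shape returns
1024 on every path.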