[Openmp-commits] [PATCH] D119313: [OpenMP][CUDA] Set the hard team limit to 2^31-1
Shilei Tian via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Wed Feb 9 15:08:06 PST 2022
tianshilei1992 added a comment.
In D119313#3309288 <https://reviews.llvm.org/D119313#3309288>, @jdoerfert wrote:
> We ask the device how many blocks/teams it supports, do we not?
Yes, we do. But after that query, we also cap the result at the hard team limit:
  // Query attributes to determine number of threads/block and blocks/grid.
  int MaxGridDimX;
  Err = cuDeviceGetAttribute(&MaxGridDimX, CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_X,
                             Device);
  if (Err != CUDA_SUCCESS) {
    DP("Error getting max grid dimension, use default value %d\n",
       DeviceRTLTy::DefaultNumTeams);
    DeviceData[DeviceId].BlocksPerGrid = DeviceRTLTy::DefaultNumTeams;
  } else if (MaxGridDimX <= DeviceRTLTy::HardTeamLimit) {
    DP("Using %d CUDA blocks per grid\n", MaxGridDimX);
    DeviceData[DeviceId].BlocksPerGrid = MaxGridDimX;
  } else {
    DP("Max CUDA blocks per grid %d exceeds the hard team limit %d, capping "
       "at the hard limit\n",
       MaxGridDimX, DeviceRTLTy::HardTeamLimit);
    DeviceData[DeviceId].BlocksPerGrid = DeviceRTLTy::HardTeamLimit;
  }
So actually, I don't think we even need this "hard" limit. The spec has no such limit, and I don't know why it was there in the first place.
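To make the effect of the cap easier to see in isolation, here is a minimal standalone sketch of the same three-way selection. The function name `selectBlocksPerGrid`, its parameters, and the values 128 and 1 << 16 are hypothetical and chosen only for illustration; they are not taken from the plugin. The sketch also assumes a 32-bit `int`, so that `INT_MAX` is 2^31-1.

```cpp
#include <climits>

// Hypothetical stand-in for the selection logic quoted above.
// QueryOk models cuDeviceGetAttribute returning CUDA_SUCCESS.
constexpr int selectBlocksPerGrid(bool QueryOk, int MaxGridDimX,
                                  int DefaultNumTeams, int HardTeamLimit) {
  if (!QueryOk)
    return DefaultNumTeams; // Query failed: fall back to the default.
  if (MaxGridDimX <= HardTeamLimit)
    return MaxGridDimX; // Device limit is within the hard limit: use it.
  return HardTeamLimit; // Otherwise cap at the hard limit.
}

// Illustrative values only (assumptions, not the plugin's constants).
constexpr int IllustrativeDefault = 128;
constexpr int IllustrativeSmallLimit = 1 << 16;
constexpr int DeviceMax = INT_MAX; // 2^31-1 with a 32-bit int.

// With a small hard limit, a device reporting 2^31-1 is capped.
static_assert(selectBlocksPerGrid(true, DeviceMax, IllustrativeDefault,
                                  IllustrativeSmallLimit) ==
                  IllustrativeSmallLimit,
              "small hard limit caps the device value");
// With the hard limit at 2^31-1, the cap can never trigger for an int.
static_assert(selectBlocksPerGrid(true, DeviceMax, IllustrativeDefault,
                                  INT_MAX) == DeviceMax,
              "hard limit of INT_MAX leaves the device value unchanged");
```

Since `MaxGridDimX` is an `int`, a hard limit of 2^31-1 means the final branch is unreachable in this sketch, which is another way of saying the limit no longer restricts anything.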
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D119313/new/
https://reviews.llvm.org/D119313