[Openmp-commits] [PATCH] D51624: [libomptarget][CUDA] Use cuDeviceGetAttribute, NFCI.
Kelvin Li via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Tue Sep 4 06:16:57 PDT 2018
kkwli0 added inline comments.
================
Comment at: libomptarget/plugins/cuda/src/rtl.cpp:305
-  // Get threads per block, exploit threads only along x axis
-  if (Properties.maxThreadsDim[0] <= RTLDeviceInfoTy::HardThreadLimit) {
-    DeviceInfo.ThreadsPerBlock[device_id] = Properties.maxThreadsDim[0];
-    DP("Using %d CUDA threads per block\n", Properties.maxThreadsDim[0]);
-    if (Properties.maxThreadsDim[0] < Properties.maxThreadsPerBlock) {
-      DP("(fewer than max per block along all xyz dims %d)\n",
-         Properties.maxThreadsPerBlock);
-    }
-  } else {
-    DeviceInfo.ThreadsPerBlock[device_id] = RTLDeviceInfoTy::HardThreadLimit;
-    DP("Max CUDA threads per block %d exceeds the hard thread limit %d, "
-       "capping at the hard limit\n", Properties.maxThreadsDim[0],
-       RTLDeviceInfoTy::HardThreadLimit);
-  }
+  // We are only exploting threads along the x axis.
+  int maxBlockDimX;
----------------
Typo in the new comment: "exploting" should be "exploiting".
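For reference, the selection logic in the removed lines reduces to taking the smaller of the device's maximum block size along x and the hard thread limit. The following is only a minimal sketch of that reduction, not the code in the patch. The helper name `threadsPerBlock` and the constant `kHardThreadLimit` (with its value of 1024) are hypothetical stand-ins, since the actual value of RTLDeviceInfoTy::HardThreadLimit is not shown in this thread. Given the patch title, the input would presumably come from cuDeviceGetAttribute with CU_DEVICE_ATTRIBUTE_MAX_BLOCK_DIM_X.

```cpp
#include <algorithm>

// Hypothetical stand-in for RTLDeviceInfoTy::HardThreadLimit; the real value
// is not shown in this thread.
constexpr int kHardThreadLimit = 1024;

// Choose the threads per block from the device's maximum block size along
// the x axis, capped at the hard limit. This mirrors the if/else in the
// removed lines: use the device maximum when it fits, otherwise the limit.
//
// Assumption: in the plugin, maxBlockDimX would be filled in by something like
//   cuDeviceGetAttribute(&maxBlockDimX,
//                        CU_DEVICE_ATTRIBUTE_MAX_BLOCK_DIM_X, device);
int threadsPerBlock(int maxBlockDimX, int hardLimit = kHardThreadLimit) {
  return std::min(maxBlockDimX, hardLimit);
}
```

Under these assumptions, a device reporting 512 keeps 512 threads per block, while a device reporting 2048 is capped at the hard limit.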
Repository:
rOMP OpenMP
https://reviews.llvm.org/D51624