[Openmp-commits] [PATCH] D51624: [libomptarget][CUDA] Use cuDeviceGetAttribute, NFCI.

Kelvin Li via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Tue Sep 4 06:16:57 PDT 2018


kkwli0 added inline comments.


================
Comment at: libomptarget/plugins/cuda/src/rtl.cpp:305
 
-    // Get threads per block, exploit threads only along x axis
-    if (Properties.maxThreadsDim[0] <= RTLDeviceInfoTy::HardThreadLimit) {
-      DeviceInfo.ThreadsPerBlock[device_id] = Properties.maxThreadsDim[0];
-      DP("Using %d CUDA threads per block\n", Properties.maxThreadsDim[0]);
-      if (Properties.maxThreadsDim[0] < Properties.maxThreadsPerBlock) {
-        DP("(fewer than max per block along all xyz dims %d)\n",
-            Properties.maxThreadsPerBlock);
-      }
-    } else {
-      DeviceInfo.ThreadsPerBlock[device_id] = RTLDeviceInfoTy::HardThreadLimit;
-      DP("Max CUDA threads per block %d exceeds the hard thread limit %d, "
-          "capping at the hard limit\n", Properties.maxThreadsDim[0],
-          RTLDeviceInfoTy::HardThreadLimit);
-    }
+  // We are only exploting threads along the x axis.
+  int maxBlockDimX;
----------------
Typo: "exploting" should be "exploiting".
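
For context, the removed lines cap the per-block thread count at a hard limit. The same capping can be sketched in isolation as below. This is only an illustration, not the patch itself: `kHardThreadLimit` and `threadsPerBlock` are hypothetical names, the value 1024 is an assumption, and in the plugin the x-dimension maximum would presumably come from `cuDeviceGetAttribute` with `CU_DEVICE_ATTRIBUTE_MAX_BLOCK_DIM_X` rather than being passed in.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for RTLDeviceInfoTy::HardThreadLimit. The value
// 1024 is an assumption made for this sketch.
constexpr int kHardThreadLimit = 1024;

// Returns the threads-per-block value to use, given the device's maximum
// block dimension along the x axis. Only the x axis is considered, as in
// the quoted code. The result never exceeds hardLimit.
int threadsPerBlock(int maxBlockDimX, int hardLimit = kHardThreadLimit) {
  assert(maxBlockDimX > 0 && hardLimit > 0);
  return std::min(maxBlockDimX, hardLimit);
}
```

With these assumptions, a device reporting 512 keeps 512 threads per block, while a device reporting 2048 is capped at the hard limit.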


Repository:
  rOMP OpenMP

https://reviews.llvm.org/D51624




