[Openmp-commits] [PATCH] D51624: [libomptarget][CUDA] Use cuDeviceGetAttribute, NFCI.
Kelvin Li via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Tue Sep  4 06:16:57 PDT 2018

kkwli0 added inline comments.
================
Comment at: libomptarget/plugins/cuda/src/rtl.cpp:305
 
-    // Get threads per block, exploit threads only along x axis
-    if (Properties.maxThreadsDim[0] <= RTLDeviceInfoTy::HardThreadLimit) {
-      DeviceInfo.ThreadsPerBlock[device_id] = Properties.maxThreadsDim[0];
-      DP("Using %d CUDA threads per block\n", Properties.maxThreadsDim[0]);
-      if (Properties.maxThreadsDim[0] < Properties.maxThreadsPerBlock) {
-        DP("(fewer than max per block along all xyz dims %d)\n",
-            Properties.maxThreadsPerBlock);
-      }
-    } else {
-      DeviceInfo.ThreadsPerBlock[device_id] = RTLDeviceInfoTy::HardThreadLimit;
-      DP("Max CUDA threads per block %d exceeds the hard thread limit %d, "
-          "capping at the hard limit\n", Properties.maxThreadsDim[0],
-          RTLDeviceInfoTy::HardThreadLimit);
-    }
+  // We are only exploting threads along the x axis.
+  int maxBlockDimX;
----------------
Typo: "exploting" should be "exploiting".
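
For context, the removed lines in the quoted hunk cap the per-block thread count along the x axis at RTLDeviceInfoTy::HardThreadLimit. A minimal sketch of that capping step follows. It is not the code from the patch: the helper name clampThreadsPerBlock and the numeric values are hypothetical, and it assumes maxBlockDimX is obtained through the driver API call cuDeviceGetAttribute with CU_DEVICE_ATTRIBUTE_MAX_BLOCK_DIM_X, which the hunk shown here does not include.

```cpp
// Sketch only: clampThreadsPerBlock is a hypothetical helper, not part of rtl.cpp.
// Assumption: in the plugin, maxBlockDimX would be filled in by something like
//   cuDeviceGetAttribute(&maxBlockDimX, CU_DEVICE_ATTRIBUTE_MAX_BLOCK_DIM_X, device)
// and hardThreadLimit would be RTLDeviceInfoTy::HardThreadLimit.
constexpr int clampThreadsPerBlock(int maxBlockDimX, int hardThreadLimit) {
  // Use the device's x-axis maximum unless it exceeds the hard limit.
  return maxBlockDimX <= hardThreadLimit ? maxBlockDimX : hardThreadLimit;
}

// Compile-time checks with made-up values (1024 is only an example limit).
static_assert(clampThreadsPerBlock(512, 1024) == 512,
              "a device maximum below the limit is used as-is");
static_assert(clampThreadsPerBlock(2048, 1024) == 1024,
              "a device maximum above the limit is capped");
```

Keeping the cap in a pure function like this separates the device query from the limit logic, so the limit logic can be checked without a GPU.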
Repository:
  rOMP OpenMP
https://reviews.llvm.org/D51624
    
    