[PATCH] D106033: Folding threadLimit and numThreads when single value in kernels

Jon Chesterfield via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 15 09:32:08 PDT 2021


JonChesterfield added a comment.

Renaming the functions and adding them to OMPKinds.def et al. looks separate from folding them, so it should land first. That makes the functional diff easier to read and reduces the amount of churn if the functional change needs to be backed out.



================
Comment at: openmp/libomptarget/deviceRTLs/common/src/omptarget.cu:98
+        1 + (__kmpc_get_hardware_num_threads_in_block() > 1 ? OMP_ACTIVE_PARALLEL_LEVEL : 0);
   }
 
----------------
`git-clang-format HEAD^` may be useful for formatting lines like this one.
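
As a rough sketch of the wrapping LLVM style would likely apply to the over-long line above (the exact clang-format output may differ). This is host-side C++ for illustration only: the stub body, the value of `OMP_ACTIVE_PARALLEL_LEVEL`, and the wrapper name `parallelLevelAfterInit` are assumptions, not taken from the patch.

```cpp
// Assumed value, for illustration only; the real definition lives in the deviceRTL.
constexpr int OMP_ACTIVE_PARALLEL_LEVEL = 128;

// Hypothetical host-side stub standing in for the device runtime query.
static int __kmpc_get_hardware_num_threads_in_block() { return 64; }

// Hypothetical wrapper so the expression from the diff can stand alone.
int parallelLevelAfterInit() {
  // Same expression as in the diff, wrapped to stay within 80 columns.
  return 1 + (__kmpc_get_hardware_num_threads_in_block() > 1
                  ? OMP_ACTIVE_PARALLEL_LEVEL
                  : 0);
}
```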


================
Comment at: openmp/libomptarget/deviceRTLs/common/support.h:53
 // get OpenMP number of threads and team
-int GetNumberOfOmpThreads(bool isSPMDExecutionMode); // omp_num_threads
-int GetNumberOfOmpTeams();                           // omp_num_teams
+NOINLINE int GetNumberOfOmpThreads(bool isSPMDExecutionMode); // omp_num_threads
+NOINLINE int GetNumberOfOmpTeams();                           // omp_num_teams
----------------
Noinline seems bad. Why is it needed? Also, the functions tagged noinline here look different from the ones that are renamed.


================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu:97
 EXTERN int GetBlockIdInKernel() { return __nvvm_read_ptx_sreg_ctaid_x(); }
-EXTERN int GetNumberOfBlocksInKernel() {
+EXTERN int __kmpc_get_hardware_num_blocks() {
   return __nvvm_read_ptx_sreg_nctaid_x();
----------------
Please rename the functions in amdgcn/src/target_impl.hip too.
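
For illustration, the amdgcn side of the rename might look roughly like the sketch below. It is host-side C++ rather than the real device code: `hypotheticalReadNumWorkgroups` is a made-up stand-in for whatever target_impl.hip actually uses to query the grid, and `EXTERN` is simplified to plain `extern "C"`. Only the names `GetNumberOfBlocksInKernel` and `__kmpc_get_hardware_num_blocks` come from the nvptx diff above.

```cpp
// Simplified assumption: the real EXTERN macro may carry extra attributes.
#define EXTERN extern "C"

// Made-up stand-in for the target-specific grid query; returns a fixed
// value so the sketch can run on the host.
static int hypotheticalReadNumWorkgroups() { return 4; }

// Previously GetNumberOfBlocksInKernel; renamed to match the nvptx change
// so both targets expose the same runtime entry point.
EXTERN int __kmpc_get_hardware_num_blocks() {
  return hypotheticalReadNumWorkgroups();
}
```

Keeping the name identical across targets means code that refers to the function by name has a single symbol to look for.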


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106033/new/

https://reviews.llvm.org/D106033


