[PATCH] D106033: Folding threadLimit and numThreads when single value in kernels
Jon Chesterfield via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 15 09:32:08 PDT 2021
JonChesterfield added a comment.
Renaming the functions and adding them to OMPKinds.def et al. looks separate from folding them, so that part should land first. That makes the functional diff easier to read and reduces churn if the functional change needs to be backed out.
================
Comment at: openmp/libomptarget/deviceRTLs/common/src/omptarget.cu:98
+ 1 + (__kmpc_get_hardware_num_threads_in_block() > 1 ? OMP_ACTIVE_PARALLEL_LEVEL : 0);
}
----------------
`git-clang-format HEAD^` may be useful for things like this
================
Comment at: openmp/libomptarget/deviceRTLs/common/support.h:53
// get OpenMP number of threads and team
-int GetNumberOfOmpThreads(bool isSPMDExecutionMode); // omp_num_threads
-int GetNumberOfOmpTeams(); // omp_num_teams
+NOINLINE int GetNumberOfOmpThreads(bool isSPMDExecutionMode); // omp_num_threads
+NOINLINE int GetNumberOfOmpTeams(); // omp_num_teams
----------------
Noinline seems bad. Why is it needed? Also, the functions tagged noinline here look different from the ones that are renamed.
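One possible reading of the concern, as a minimal host-side sketch: a noinline attribute tells the inliner to keep each use as a real call, so a caller cannot simplify through the function body by inlining (other interprocedural passes may still act on it). All names below are hypothetical stand-ins, not the deviceRTL functions, and the `NOINLINE` macro is assumed to expand to the GCC/Clang attribute.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not the deviceRTL code.
// Assumption: NOINLINE expands to the GCC/Clang noinline attribute.
#define NOINLINE __attribute__((noinline))

// Pretend the hardware query is known to yield a single value.
static int queryNumThreads() { return 128; }

// Marked noinline: the inliner leaves every use as a call.
NOINLINE int numThreadsNoInline() { return queryNumThreads(); }

// No attribute: the optimizer is free to inline this and fold the result.
static inline int numThreadsInlinable() { return queryNumThreads(); }

// Both variants return the same value; only the optimizer's freedom differs.
void checkBothAgree() {
  assert(numThreadsNoInline() == numThreadsInlinable());
}
```

Comparing the optimized IR for callers of the two variants (for example with `clang++ -O2 -S -emit-llvm`) is one way to see whether the attribute changes anything for a given function.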
================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu:97
EXTERN int GetBlockIdInKernel() { return __nvvm_read_ptx_sreg_ctaid_x(); }
-EXTERN int GetNumberOfBlocksInKernel() {
+EXTERN int __kmpc_get_hardware_num_blocks() {
return __nvvm_read_ptx_sreg_nctaid_x();
----------------
Please rename the functions in amdgcn/src/target_impl.hip too.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D106033/new/
https://reviews.llvm.org/D106033