[Openmp-commits] [PATCH] D119313: [OpenMP][CUDA] Set the hard team limit to 2^31-1
Shilei Tian via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Thu Feb 10 10:55:41 PST 2022
tianshilei1992 added a comment.
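Actually, the hard limit of 65536 can help performance in some cases. For example, in the BabelStream benchmark, if we don't cap the number of teams, the grid could have 262144 blocks. After capping to 65536, performance improved a lot: in the profiles below, the average time of the dot kernel drops from 6.83 ms to 3.62 ms, while the other kernels change by only a few percent.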
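To make the effect of the cap concrete, here is a minimal sketch of clamping a requested team count to a hard limit and of the work each team then picks up. The helpers clampNumTeams and chunksPerTeam are hypothetical names for illustration only, not the plugin's actual code, and the even distribution of loop chunks over teams is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the plugin's real code): clamp the requested
// number of teams (CUDA blocks) to a hard limit.
inline std::int64_t clampNumTeams(std::int64_t requested,
                                  std::int64_t hardLimit) {
  assert(requested > 0 && hardLimit > 0);
  return std::min(requested, hardLimit);
}

// Hypothetical helper: number of loop chunks each team has to process when
// totalChunks chunks are spread evenly over numTeams teams (rounded up).
inline std::int64_t chunksPerTeam(std::int64_t totalChunks,
                                  std::int64_t numTeams) {
  assert(totalChunks > 0 && numTeams > 0);
  return (totalChunks + numTeams - 1) / numTeams;
}
```

With the numbers from this comment, clampNumTeams(262144, 65536) gives 65536, and chunksPerTeam(262144, 65536) gives 4: under these assumptions each team iterates over four chunks instead of one, so fewer blocks are launched and each does more work.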
Capping to 65536:
            Type  Time(%)      Time  Calls       Avg       Min       Max  Name
                   21.96%  361.91ms    100  3.6191ms  3.4761ms  4.2035ms  __omp_offloading_fd02_c612a6__ZN9OMPStreamIdE3dotEv_l229
                   12.50%  205.93ms    100  2.0593ms  2.0200ms  2.0720ms  __omp_offloading_fd02_c612a6__ZN9OMPStreamIdE5triadEv_l180
                   12.40%  204.35ms    100  2.0435ms  2.0084ms  2.0561ms  __omp_offloading_fd02_c612a6__ZN9OMPStreamIdE3addEv_l155
                    8.57%  141.31ms    100  1.4131ms  1.3905ms  1.4901ms  __omp_offloading_fd02_c612a6__ZN9OMPStreamIdE3mulEv_l132
                    8.53%  140.61ms    100  1.4061ms  1.3885ms  1.4647ms  __omp_offloading_fd02_c612a6__ZN9OMPStreamIdE4copyEv_l108
                    0.20%  3.2532ms      1  3.2532ms  3.2532ms  3.2532ms  __omp_offloading_fd02_c612a6__ZN9OMPStreamIdE11init_arraysEddd_l62
Not capping, grid size 262144:
            Type  Time(%)      Time  Calls       Avg       Min       Max  Name
 GPU activities:   34.48%  682.98ms    100  6.8298ms  6.6153ms  7.8655ms  __omp_offloading_fd02_c612a6__ZN9OMPStreamIdE3dotEv_l229
                   10.30%  204.15ms    100  2.0415ms  2.0165ms  2.0479ms  __omp_offloading_fd02_c612a6__ZN9OMPStreamIdE5triadEv_l180
                   10.26%  203.31ms    100  2.0331ms  2.0084ms  2.0385ms  __omp_offloading_fd02_c612a6__ZN9OMPStreamIdE3addEv_l155
                    7.51%  148.83ms    100  1.4883ms  1.4327ms  1.7717ms  __omp_offloading_fd02_c612a6__ZN9OMPStreamIdE3mulEv_l132
                    7.46%  147.83ms    100  1.4783ms  1.4251ms  1.7499ms  __omp_offloading_fd02_c612a6__ZN9OMPStreamIdE4copyEv_l108
                    0.15%  3.0440ms      1  3.0440ms  3.0440ms  3.0440ms  __omp_offloading_fd02_c612a6__ZN9OMPStreamIdE11init_arraysEddd_l62
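For a quick per-kernel comparison of the two profiles, the Avg columns can be turned into speedup ratios. The snippet below is only an illustrative sketch: the KernelAvg struct, the kKernels array, and the speedup function are made-up names, the short kernel labels are abbreviations of the mangled names, and the numbers are copied from the Avg columns above.

```cpp
#include <cassert>

// Average kernel time in milliseconds for one BabelStream kernel,
// copied from the Avg columns of the two profiles above.
struct KernelAvg {
  const char *name;   // short label for the mangled kernel name
  double cappedMs;    // Avg with the team count capped to 65536
  double uncappedMs;  // Avg with grid size 262144
};

constexpr KernelAvg kKernels[] = {
    {"dot", 3.6191, 6.8298},
    {"triad", 2.0593, 2.0415},
    {"add", 2.0435, 2.0331},
    {"mul", 1.4131, 1.4883},
    {"copy", 1.4061, 1.4783},
};

// Ratio > 1 means the capped run is faster than the uncapped run.
constexpr double speedup(const KernelAvg &k) {
  assert(k.cappedMs > 0.0);
  return k.uncappedMs / k.cappedMs;
}
```

By this measure the dot kernel is roughly 1.9x faster with the cap, mul and copy are about 5% faster, and triad and add are within about 1% of the uncapped run.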
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D119313/new/
https://reviews.llvm.org/D119313