[Openmp-commits] [PATCH] D158382: [OpenMP] Use default grid value for static grid size

Johannes Doerfert via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Mon Aug 28 12:54:21 PDT 2023


jdoerfert added a comment.

In D158382#4621885 <https://reviews.llvm.org/D158382#4621885>, @dhruvachak wrote:

> This patch produces the following difference in IR out of CodeGen.
>
> Without this patch:
>
>   %nvptx_num_threads.i = tail call i32 @__kmpc_get_hardware_num_threads_in_block() #2
>   call void @__kmpc_distribute_static_init_4(ptr addrspacecast (ptr addrspace(1) @2 to ptr), i32 %1, i32 91, ptr nonnull %.omp.is_last.ascast.i, ptr nonnull %.omp.comb.lb.ascast.i, ptr nonnull %.omp.comb.ub.ascast.i, ptr nonnull %.omp.stride.ascast.i, i32 1, i32 %nvptx_num_threads.i) #2
>
> With this patch:
>
>   call void @__kmpc_distribute_static_init_4(ptr addrspacecast (ptr addrspace(1) @2 to ptr), i32 %1, i32 91, ptr nonnull %.omp.is_last.ascast.i, ptr nonnull %.omp.comb.lb.ascast.i, ptr nonnull %.omp.comb.ub.ascast.i, ptr nonnull %.omp.stride.ascast.i, i32 1, i32 256) #2
>
> Setting the blocksize to a constant too early would be a problem if the runtime changes the blocksize, e.g. because of an environment variable or because of a low trip count (D152014 <https://reviews.llvm.org/D152014>). Comments? @jdoerfert

From OpenMP-Opt:

  case OMPRTL___kmpc_get_hardware_num_threads_in_block:
    Changed = Changed | foldKernelFnAttribute(A, "omp_target_thread_limit");
    break;

This is wrong. We should fold the thread limit, not num_threads_in_block.
The latter can only be folded if we know it will not be lowered afterwards, which we currently cannot guarantee.
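
To illustrate the concern, here is a minimal sketch. It is a simplified host-side model, not the device runtime or the code clang emits. The function name coveredIterations and the strided loop shape are assumptions made for illustration only. Each launched thread walks the iteration space with a given stride; if the stride is a folded constant larger than the number of threads actually launched, some iterations are never executed.

```cpp
#include <vector>

// Hypothetical model (not the real runtime): each of `LaunchedThreads`
// threads executes iterations Tid, Tid + Stride, Tid + 2 * Stride, ...
// Returns how many of the `TripCount` iterations were executed at all.
int coveredIterations(int TripCount, int LaunchedThreads, int Stride) {
  std::vector<bool> Done(TripCount, false);
  for (int Tid = 0; Tid < LaunchedThreads; ++Tid)
    for (int I = Tid; I < TripCount; I += Stride)
      Done[I] = true;

  int Count = 0;
  for (bool Executed : Done)
    Count += Executed ? 1 : 0;
  return Count;
}
```

In this model, with a trip count of 512 and a block lowered to 128 threads at launch, coveredIterations(512, 128, 128) covers all 512 iterations, while coveredIterations(512, 128, 256), i.e. the stride folded to a thread limit of 256, covers only 256. The thread limit remains a valid upper bound in both cases, which is why folding it is safe while folding the actual block size is not.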


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158382/new/

https://reviews.llvm.org/D158382


