[clang] [OpenMP][Clang] Force use of `num_teams` and `thread_limit` for bare kernel (PR #68373)
Shilei Tian via cfe-commits
cfe-commits at lists.llvm.org
Thu Oct 5 20:39:42 PDT 2023
shiltian wrote:
> Does `thread_limit` directly imply the number of threads? I thought that it merely set an upper bound such that it cannot be increased beyond that via environment variables.
Based on the spec, yes. However, since `ompx_bare` is an extension, we can redefine the semantics here. For example, we already redefine the `target teams` region such that globalization is disabled. We can say that, in this mode, `thread_limit` sets the block size. I'll have a follow-up patch to change the runtime behavior so that, if the user's grid size cannot be met in this kernel mode, the program crashes directly, similar to CUDA/HIP.
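To make the difference concrete, here is a rough sketch of the two readings of `thread_limit`. It is only an illustrative model, not code from the OpenMP runtime or from this PR; the function names `specThreads` and `bareBlockSize`, the `deviceMaxThreads` parameter, and the `-1` failure value are all made up for the example.

```cpp
#include <algorithm>

// Hypothetical model only; none of these names exist in the LLVM OpenMP runtime.
//
// The construct under discussion looks roughly like:
//   #pragma omp target teams ompx_bare num_teams(N) thread_limit(M)

// Spec-style reading: thread_limit is an upper bound, so the runtime may
// pick any thread count up to it.
int specThreads(int requested, int threadLimit) {
  return std::min(requested, threadLimit);
}

// Proposed bare-kernel reading: thread_limit is the block size itself.
// Returns -1 when the device cannot honor it; this stands in for the
// "crash directly" behavior planned for the follow-up patch.
int bareBlockSize(int threadLimit, int deviceMaxThreads) {
  if (threadLimit <= 0 || threadLimit > deviceMaxThreads)
    return -1; // request cannot be met: no silent clamping
  return threadLimit;
}
```

In this model, a request for 2048 threads on a device that supports at most 1024 is clamped under the spec-style reading but rejected under the bare-kernel one, which is the CUDA/HIP-like behavior described above.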
https://github.com/llvm/llvm-project/pull/68373