[Openmp-commits] [PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive
Sandeep via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Fri Jun 30 22:13:06 PDT 2023
sandeepkosuri added inline comments.
================
Comment at: clang/lib/CodeGen/CGOpenMPRuntime.cpp:9818
+ D.hasClausesOfKind<OMPInReductionClause>() ||
+ (CGM.getLangOpts().OpenMP >= 51 && D.getDirectiveKind() == OMPD_target &&
+ D.hasClausesOfKind<OMPThreadLimitClause>());
----------------
ABataev wrote:
> What if D is combined target directive, i.e. D.getDirectiveKind() is something like OMPD_target_teams, etc.?
I will fix that, thanks for noticing.
================
Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:5143-5148
+ S.getSingleClause<OMPThreadLimitClause>()) {
+ // Emit __kmpc_set_thread_limit() to set the thread_limit for the task
+ // enclosing this target region. This will indirectly set the thread_limit
+ // for every applicable construct within target region.
+ CGF.CGM.getOpenMPRuntime().emitThreadLimitClause(
+ CGF, S.getSingleClause<OMPThreadLimitClause>()->getThreadLimit(),
----------------
ABataev wrote:
> Avoid double call of S.getSingleClause<OMPThreadLimitClause>(), store in local variable call result.
sure.
================
Comment at: clang/test/OpenMP/target_codegen.cpp:849
// OMP51: [[CE:%.*]] = load {{.*}} [[CEA]]
-// OMP51: call i32 @__tgt_target_kernel({{.*}}, i64 -1, i32 -1, i32 [[CE]],
+// OMP51: call ptr @__kmpc_omp_task_alloc({{.*@.omp_task_entry.*}})
+// OMP51: call i32 [[OMP_TASK_ENTRY]]
----------------
ABataev wrote:
> It requires extra resource consumption, can you try to avoid creating outer task, if possible?
I tried different ideas for making `thread_limit` work on `target`.
I tried to reuse the existing implementation by replacing the directive to `target teams(1) thread_limit(x)` at parsing , sema and IR stages. I couldn't successfully implement any of them. So, I tried adding `num_threads` for all the parallel directives within `target`, and there were corner cases like parallel directives in a function which is called in target region, which were becoming tedious to handle.
This method seem to encompass the idea of thread limit on `target` pretty well and also works... So I proceeded with this idea.
================
Comment at: openmp/runtime/src/kmp_ftn_entry.h:809
+ return thread_limit;
+ else
+ return thread->th.th_current_task->td_icvs.thread_limit;
----------------
ABataev wrote:
> No need for else here
oops, I will fix that
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D152054/new/
https://reviews.llvm.org/D152054
More information about the Openmp-commits
mailing list