[PATCH] D53636: Do not always request an implicit taskgroup region inside the kmpc_taskloop function

Sergi Mateo via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Oct 24 03:38:13 PDT 2018


smateo created this revision.
smateo added a reviewer: ABataev.
Herald added a subscriber: cfe-commits.

For the following code:

  int i;
  #pragma omp taskloop
  for (i = 0; i < 100; ++i)
  {}
  
  #pragma omp taskloop nogroup
  for (i = 0; i < 100; ++i)
  {}

Clang emits the following LLVM IR:

  ...
  call void @__kmpc_taskgroup(%struct.ident_t* @0, i32 %0)
  %2 = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @0, i32 %0, i32 1, i64 80, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates*)* @.omp_task_entry. to i32 (i32, i8*)*))
  ...
  call void @__kmpc_taskloop(%struct.ident_t* @0, i32 %0, i8* %2, i32 1, i64* %8, i64* %9, i64 %13, i32 0, i32 0, i64 0, i8* null)
  call void @__kmpc_end_taskgroup(%struct.ident_t* @0, i32 %0)
  
  ...
  %15 = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @0, i32 %0, i32 1, i64 80, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.1*)* @.omp_task_entry..2 to i32 (i32, i8*)*))
  ...
  call void @__kmpc_taskloop(%struct.ident_t* @0, i32 %0, i8* %15, i32 1, i64* %21, i64* %22, i64 %26, i32 0, i32 0, i64 0, i8* null)

The first set of instructions corresponds to the first taskloop construct. Note that the implicit taskgroup region associated with the taskloop construct has been materialized in the IR: the `__kmpc_taskloop` call occurs inside a taskgroup region. This taskgroup region does not exist for the second taskloop because it uses the `nogroup` clause.
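For reference, the source-level equivalence behind this lowering can be sketched as follows. This is only an illustration: the function names `fill_with_implicit_group` and `fill_with_explicit_group` are hypothetical and not part of the patch. Per the OpenMP specification, a taskloop without `nogroup` executes as if it were enclosed in a taskgroup construct, so both functions are expected to return only after all generated tasks have completed.

```c
#include <stddef.h>

/* Hypothetical helper: default taskloop, which carries an implicit
   taskgroup, so all generated tasks finish before the function returns. */
static void fill_with_implicit_group(int *out, int n) {
  #pragma omp taskloop
  for (int i = 0; i < n; ++i)
    out[i] = i;
}

/* Hypothetical helper: the same work with the taskgroup written out
   explicitly and the implicit one disabled with nogroup. */
static void fill_with_explicit_group(int *out, int n) {
  #pragma omp taskgroup
  {
    #pragma omp taskloop nogroup
    for (int i = 0; i < n; ++i)
      out[i] = i;
  }
}
```

Without OpenMP enabled, the pragmas are ignored and both functions reduce to the same serial loop.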

The issue is that the 4th argument from the end of the `__kmpc_taskloop` call is always zero. In the LLVM OpenMP runtime implementation, this argument corresponds to the `nogroup` parameter:

  void __kmpc_taskloop(ident_t *loc, int gtid, kmp_task_t *task, int if_val,
                       kmp_uint64 *lb, kmp_uint64 *ub, kmp_int64 st, int nogroup,
                       int sched, kmp_uint64 grainsize, void *task_dup);

So we always tell the runtime to create another taskgroup region. For the first taskloop, this means that two taskgroup regions are created. For the second taskloop, it means that a taskgroup region is created despite the `nogroup` clause, so we unnecessarily wait until all descendant tasks have finished executing.
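The effect can be summarized with a toy model. This is a sketch of the behavior described above, not the actual runtime code; the function `taskgroups_opened` and its parameters are hypothetical. It assumes that the compiler emits one `__kmpc_taskgroup` / `__kmpc_end_taskgroup` pair unless the `nogroup` clause is present, and that the runtime opens its own taskgroup whenever the `nogroup` argument of `__kmpc_taskloop` is 0.

```c
#include <stdbool.h>

/* Toy model (NOT the real runtime): number of taskgroup regions opened
   for a single taskloop construct. */
static int taskgroups_opened(bool has_nogroup_clause, int nogroup_arg) {
  int count = 0;
  if (!has_nogroup_clause)
    count++; /* taskgroup emitted by the compiler around the call */
  if (nogroup_arg == 0)
    count++; /* taskgroup opened inside the runtime call (assumed) */
  return count;
}
```

With the argument hard-coded to 0, the model gives 2 regions for the first taskloop and 1 for the second. In this model, passing a nonzero `nogroup` argument would give the expected 1 and 0.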


Repository:
  rC Clang

https://reviews.llvm.org/D53636

Files:
  lib/CodeGen/CGOpenMPRuntime.cpp
  test/OpenMP/taskloop_codegen.cpp
  test/OpenMP/taskloop_firstprivate_codegen.cpp
  test/OpenMP/taskloop_lastprivate_codegen.cpp
  test/OpenMP/taskloop_private_codegen.cpp
  test/OpenMP/taskloop_reduction_codegen.cpp
  test/OpenMP/taskloop_simd_codegen.cpp
  test/OpenMP/taskloop_simd_firstprivate_codegen.cpp
  test/OpenMP/taskloop_simd_lastprivate_codegen.cpp
  test/OpenMP/taskloop_simd_private_codegen.cpp
  test/OpenMP/taskloop_simd_reduction_codegen.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D53636.170836.patch
Type: text/x-patch
Size: 29473 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20181024/785039a3/attachment-0001.bin>

