[PATCH] D114940: [OMPIRBuilder] Support ordered clause specified without parameter

Fri Jan 28 13:29:12 PST 2022

Meinersbur added inline comments.

================
Comment at: llvm/include/llvm/Frontend/OpenMP/OMPConstants.h:125-130
+  OrderedStaticChunked = 65,
+  OrderedStatic = 66, // ordered static unspecialized
+  OrderedDynamicChunked = 67,
+  OrderedGuidedChunked = 68,
+  OrderedRuntime = 69,
+  OrderedAuto = 70, // ordered auto
----------------
peixin wrote:
> Meinersbur wrote:
> > peixin wrote:
> > > Meinersbur wrote:
> > > > Did you consider making `ordered` a mask flag like Monotonic/Nonmonotonic?
> > > These values are copied from "kmp.h". I prefer to keep the originl definition rather than using a mask flag. In addition, the ordered values don't have the consistent mask such as kmp_sch_upper and kmp_ord_upper in "kmp.h".
> > There are used as flags though: https://github.com/llvm/llvm-project/blob/c94eb0f9ef5597bd1b3b0efda4df53f606a1fe13/openmp/runtime/src/kmp_dispatch.cpp#L185
> > 
> > However, there is an argument to keep the different definitions similar to each other. Eventually I hope those constant can come from a single source.
> It's true that "ordered" is used as flag in runtime library, but the schedule is updated by subtracting the offset instead of using one mask: https://github.com/llvm/llvm-project/blob/c94eb0f9ef5597bd1b3b0efda4df53f606a1fe13/openmp/runtime/src/kmp_dispatch.cpp#L187-L188.
> 
> I tried to use the offset, and the code would be like the following:
> 
> ```
> diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h b/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h
> index d2f9bac16e5a..adc5891b91f2 100644
> --- a/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h
> +++ b/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h
> @@ -113,8 +113,6 @@ inline std::string getAllAssumeClauseOptions() {
>  enum class OMPScheduleType {
>    StaticChunked = 33,
>    Static = 34, // static unspecialized
> -  DistributeChunked = 91,
> -  Distribute = 92,
>    DynamicChunked = 35,
>    GuidedChunked = 36, // guided unspecialized
>    Runtime = 37,
> @@ -124,6 +122,19 @@ enum class OMPScheduleType {
>    GuidedSimd = 46,            // guided with chunk adjustment
>    RuntimeSimd = 47,           // runtime with chunk adjustment
>  
> +  OrderedOffset = 32,     // offset for StaticChunked, Static, DynamicChunked,
> +                          // GuidedChunked, Runtime and Auto
> +  OrderedSimdOffset = 22, // offset for GuidedSimd and RuntimeSimd
> +  OrderedStaticChunked = 65,
> +  OrderedStatic = 66, // ordered static unspecialized
> +  OrderedDynamicChunked = 67,
> +  OrderedGuidedChunked = 68,
> +  OrderedRuntime = 69,
> +  OrderedAuto = 70, // ordered auto
> +
> +  DistributeChunked = 91, // distribute static chunked
> +  Distribute = 92,        // distribute static unspecialized
> +
>    ModifierMonotonic =
>        (1 << 29), // Set if the monotonic schedule modifier was present
>    ModifierNonmonotonic =
> diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
> index 38f93f7faf92..bdde5edf79b1 100644
> --- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
> +++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
> @@ -687,12 +687,27 @@ convertOmpWsLoop(Operation &opInst, llvm::IRBuilderBase &builder,
>  
>    bool isSimd = loop.simd_modifier();
>  
> -  if (schedule == omp::ClauseScheduleKind::Static) {
> +  // TODO: Emit kmpc_doacross_init call before kmpc_for_static_init call or
> +  // kmpc_dispatch_init call with the argument of orderedVal when orderedVal is
> +  // greater than 0.
> +  std::int64_t orderedVal =
> +      loop.ordered_val().hasValue() ? loop.ordered_val().getValue() : -1;
> +  if (schedule == omp::ClauseScheduleKind::Static && orderedVal != 0) {
>      ompBuilder->applyStaticWorkshareLoop(ompLoc.DL, loopInfo, allocaIP,
>                                           !loop.nowait(), chunk);
>    } else {
>      llvm::omp::OMPScheduleType schedType;
>      switch (schedule) {
> +    case omp::ClauseScheduleKind::Static:
> +      if (isSimd) {
> +        schedType = llvm::omp::OMPScheduleType::StaticBalancedChunked;
> +      } else {
> +        if (loop.schedule_chunk_var())
> +          schedType = llvm::omp::OMPScheduleType::StaticChunked;
> +        else
> +          schedType = llvm::omp::OMPScheduleType::Static;
> +      }
> +      break;
>      case omp::ClauseScheduleKind::Dynamic:
>        schedType = llvm::omp::OMPScheduleType::DynamicChunked;
>        break;
> @@ -716,6 +731,11 @@ convertOmpWsLoop(Operation &opInst, llvm::IRBuilderBase &builder,
>        break;
>      }
>  
> +    if (orderedVal == 0 && !isSimd)
> +      schedType = (llvm::omp::OMPScheduleType)(static_cast<int>(schedType) + static_cast<int>(llvm::omp::OMPScheduleType::OrderedOffset));
> +    else if (orderedVal == 0 && isSimd)
> +      schedType = (llvm::omp::OMPScheduleType)(static_cast<int>(schedType) + static_cast<int>(llvm::omp::OMPScheduleType::OrderedSimdOffset));
> +
>      if (loop.schedule_modifier().hasValue()) {
>        omp::ScheduleModifier modifier =
>            *omp::symbolizeScheduleModifier(loop.schedule_modifier().getValue());
> @@ -730,9 +750,21 @@ convertOmpWsLoop(Operation &opInst, llvm::IRBuilderBase &builder,
>          // Nothing to do here.
>          break;
>        }
> +    } else {
> +      // OpenMP 5.1, 2.11.4 Worksharing-Loop Construct, Desription.
> +      // If the static schedule kind is specified or if the ordered clause is
> +      // specified, and if the nonmonotonic modifier is not specified, the
> +      // effect is as if the monotonic modifier is specified. Otherwise, unless
> +      // the monotonic modifier is specified, the effect is as if the
> +      // nonmonotonic modifier is specified.
> +      if (!(static_cast<int>(schedType) == static_cast<int>(llvm::omp::OMPScheduleType::Static) + static_cast<int>(llvm::omp::OMPScheduleType::OrderedOffset) ||
> +            static_cast<int>(schedType) == static_cast<int>(llvm::omp::OMPScheduleType::StaticChunked) + static_cast<int>(llvm::omp::OMPScheduleType::OrderedOffset) ||
> +            static_cast<int>(schedType) == static_cast<int>(llvm::omp::OMPScheduleType::StaticBalancedChunked) + static_cast<int>(llvm::omp::OMPScheduleType::OrderedSimdOffset)))
> +        schedType |= llvm::omp::OMPScheduleType::ModifierNonmonotonic;
>      }
>      afterIP = ompBuilder->applyDynamicWorkshareLoop(
> -        ompLoc.DL, loopInfo, allocaIP, schedType, !loop.nowait(), chunk);
> +        ompLoc.DL, loopInfo, allocaIP, schedType, !loop.nowait(), chunk,
> +        /*ordered*/ orderedVal == 0);
>    }
>  
>    // Continue building IR after the loop. Note that the LoopInfo returned by
> ```
> This code looks kind of ugly. What do you think?
> It's true that "ordered" is used as flag in runtime library, but the schedule is updated by subtracting the offset instead of using one mask
```
schedule =
        (enum sched_type)(((int)schedule) - (kmp_ord_lower - kmp_sch_lower));
```
Can be seen as a hidden bit operation. With same equalities knowing the 64 bit is set:
if (64 & schedule) {
  schedule = ((int)schedule & ~64) + 32;
}
That is, it resets the `kmp_ord` flag but sets back that `kmp_sch` "flag". There is no schedule kind below 32 and there is also no schedule kind above 96 (both "flags" set), so the `kmp_sch` "flag" would normally not be needed. 

`kmp_nm` and `kmp_nm_ord` conforms this pattern as well. What I was suggesting is something like

```
enum class OMPScheduleType {
  StaticChunked = 1,
  Static = 2, // static unspecialized
  DynamicChunked = 3,
  GuidedChunked = 4, // guided unspecialized
  Runtime = 5,
  Auto = 6, // auto

  StaticBalancedChunked = 13, // static with chunk adjustment (e.g., simd)
  GuidedSimd = 14,            // guided with chunk adjustment
  RuntimeSimd = 15,           // runtime with chunk adjustment

  DistributeChunked = 27,
  Distribute = 28,

  ModifierUnordered =  (1 << 5),
  ModifierOrdered =  (1 << 6),
  ModifierNomerge =  (1 << 7),

  ModifierMonotonic =
      (1 << 29), // Set if the monotonic schedule modifier was present
  ModifierNonmonotonic =
      (1 << 30), // Set if the nonmonotonic schedule modifier was present
  ModifierMask = ModifierMonotonic | ModifierNonmonotonic | ModifierUnordered | ModifierOrdered | ModifierNm,
  LLVM_MARK_AS_BITMASK_ENUM(/* LargestValue */ ModifierMask)
```

However, as you mentioned it deviates from the exiting enum in kmp.h which might be confusing. So maybe lets ignore this possibility for now, although it would simplify the giant switch case a lot.

================
Comment at: mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp:765
+  // greater than 0.
+  std::int64_t orderedVal =
+      loop.ordered_val().hasValue() ? loop.ordered_val().getValue() : -1;
----------------

================
Comment at: mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp:768
+  if (schedule == omp::ClauseScheduleKind::Static && orderedVal != 0) {
     ompBuilder->applyStaticWorkshareLoop(ompLoc.DL, loopInfo, allocaIP,
                                          !loop.nowait(), chunk);
----------------
Is the ordered incompatible with static schedules?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114940/new/

https://reviews.llvm.org/D114940