[Mlir-commits] [flang] [llvm] [mlir] Revert "[MLIR][OpenMP] Add MLIR Lowering Support for dist_schedule (#… (PR #169821)

llvmlistbot at llvm.org
Thu Nov 27 07:42:48 PST 2025


llvmbot wrote:



@llvm/pr-subscribers-mlir-llvm

Author: Jack Styles (Stylie777)

<details>
<summary>Changes</summary>

Reverts llvm/llvm-project#152736 due to build bot failures.

---

Patch is 43.22 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/169821.diff


10 Files Affected:

- (modified) flang/docs/OpenMPSupport.md (+11-11) 
- (modified) llvm/include/llvm/Frontend/OpenMP/OMP.td (+1-3) 
- (modified) llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h (+7-27) 
- (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+49-140) 
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (-1) 
- (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+12-24) 
- (modified) mlir/test/Conversion/OpenMPToLLVM/convert-to-llvmir.mlir (-19) 
- (removed) mlir/test/Target/LLVMIR/openmp-dist_schedule.mlir (-34) 
- (removed) mlir/test/Target/LLVMIR/openmp-dist_schedule_with_wsloop.mlir (-205) 
- (modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (+13) 


``````````diff
diff --git a/flang/docs/OpenMPSupport.md b/flang/docs/OpenMPSupport.md
index 8eea39c6ba91b..81f5f9f6dee5b 100644
--- a/flang/docs/OpenMPSupport.md
+++ b/flang/docs/OpenMPSupport.md
@@ -42,10 +42,10 @@ Note : No distinction is made between the support in Parser/Semantics, MLIR, Low
 | target update construct                                    | P      | device clause not supported |
 | declare target directive                                   | P      | |
 | teams construct                                            | Y      | |
-| distribute construct                                       | P      | |
-| distribute simd construct                                  | P      | linear clauses are not supported |
-| distribute parallel loop construct                         | P      | |
-| distribute parallel loop simd construct                    | P      | linear clauses are not supported |
+| distribute construct                                       | P      | dist_schedule clause not supported |
+| distribute simd construct                                  | P      | dist_schedule and linear clauses are not supported |
+| distribute parallel loop construct                         | P      | dist_schedule clause not supported |
+| distribute parallel loop simd construct                    | P      | dist_schedule and linear clauses are not supported |
 | depend clause                                              | Y      | |
 | declare reduction construct                                | N      | |
 | atomic construct extensions                                | Y      | |
@@ -53,13 +53,13 @@ Note : No distinction is made between the support in Parser/Semantics, MLIR, Low
 | cancellation point construct                               | Y      | |
 | parallel do simd construct                                 | P      | linear clause not supported |
 | target teams construct                                     | P      | device clause not supported |
-| teams distribute construct                                 | P      | |
-| teams distribute simd construct                            | P      | linear clause is not supported |
-| target teams distribute construct                          | P      | device clause is not supported |
-| teams distribute parallel loop construct                   | P      | |
-| target teams distribute parallel loop construct            | P      | device clause is not supported |
-| teams distribute parallel loop simd construct              | P      | linear clause is not supported |
-| target teams distribute parallel loop simd construct       | P      | device and linear clauses are not supported |
+| teams distribute construct                                 | P      | dist_schedule clause not supported |
+| teams distribute simd construct                            | P      | dist_schedule and linear clauses are not supported |
+| target teams distribute construct                          | P      | device and dist_schedule clauses are not supported |
+| teams distribute parallel loop construct                   | P      | dist_schedule clause not supported |
+| target teams distribute parallel loop construct            | P      | device and dist_schedule clauses are not supported |
+| teams distribute parallel loop simd construct              | P      | dist_schedule and linear clauses are not supported |
+| target teams distribute parallel loop simd construct       | P      | device, dist_schedule and linear clauses are not supported |
 
 ## Extensions
 ### ATOMIC construct
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index da70048d28c12..ade00e7ca27d5 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -490,8 +490,7 @@ def OMP_SCHEDULE_Dynamic : EnumVal<"dynamic", 3, 1> {}
 def OMP_SCHEDULE_Guided : EnumVal<"guided", 4, 1> {}
 def OMP_SCHEDULE_Auto : EnumVal<"auto", 5, 1> {}
 def OMP_SCHEDULE_Runtime : EnumVal<"runtime", 6, 1> {}
-def OMP_SCHEDULE_Distribute : EnumVal<"distribute", 7, 1> {}
-def OMP_SCHEDULE_Default : EnumVal<"default", 8, 0> { let isDefault = 1; }
+def OMP_SCHEDULE_Default : EnumVal<"default", 7, 0> { let isDefault = 1; }
 def OMPC_Schedule : Clause<[Spelling<"schedule">]> {
   let clangClass = "OMPScheduleClause";
   let flangClass = "OmpScheduleClause";
@@ -502,7 +501,6 @@ def OMPC_Schedule : Clause<[Spelling<"schedule">]> {
     OMP_SCHEDULE_Guided,
     OMP_SCHEDULE_Auto,
     OMP_SCHEDULE_Runtime,
-    OMP_SCHEDULE_Distribute,
     OMP_SCHEDULE_Default
   ];
 }
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
index 3efbdc4fe17d6..984bfeaaa3fad 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
@@ -1133,17 +1133,11 @@ class OpenMPIRBuilder {
   /// \param NeedsBarrier Indicates whether a barrier must be inserted after
   ///                     the loop.
   /// \param LoopType Type of workshare loop.
-  /// \param HasDistSchedule Defines if the clause being lowered is
-  /// dist_schedule as this is handled slightly differently
-  /// \param DistScheduleSchedType Defines the Schedule Type for the Distribute
-  /// loop. Defaults to None if no Distribute loop is present.
   ///
   /// \returns Point where to insert code after the workshare construct.
   InsertPointOrErrorTy applyStaticWorkshareLoop(
       DebugLoc DL, CanonicalLoopInfo *CLI, InsertPointTy AllocaIP,
-      omp::WorksharingLoopType LoopType, bool NeedsBarrier,
-      bool HasDistSchedule = false,
-      omp::OMPScheduleType DistScheduleSchedType = omp::OMPScheduleType::None);
+      omp::WorksharingLoopType LoopType, bool NeedsBarrier);
 
   /// Modifies the canonical loop a statically-scheduled workshare loop with a
   /// user-specified chunk size.
@@ -1156,22 +1150,13 @@ class OpenMPIRBuilder {
   /// \param NeedsBarrier Indicates whether a barrier must be inserted after the
   ///                     loop.
   /// \param ChunkSize    The user-specified chunk size.
-  /// \param SchedType    Optional type of scheduling to be passed to the init
-  /// function.
-  /// \param DistScheduleChunkSize    The size of dist_shcedule chunk considered
-  /// as a unit when
-  ///                 scheduling. If \p nullptr, defaults to 1.
-  /// \param DistScheduleSchedType Defines the Schedule Type for the Distribute
-  /// loop. Defaults to None if no Distribute loop is present.
   ///
   /// \returns Point where to insert code after the workshare construct.
-  InsertPointOrErrorTy applyStaticChunkedWorkshareLoop(
-      DebugLoc DL, CanonicalLoopInfo *CLI, InsertPointTy AllocaIP,
-      bool NeedsBarrier, Value *ChunkSize,
-      omp::OMPScheduleType SchedType =
-          omp::OMPScheduleType::UnorderedStaticChunked,
-      Value *DistScheduleChunkSize = nullptr,
-      omp::OMPScheduleType DistScheduleSchedType = omp::OMPScheduleType::None);
+  InsertPointOrErrorTy applyStaticChunkedWorkshareLoop(DebugLoc DL,
+                                                       CanonicalLoopInfo *CLI,
+                                                       InsertPointTy AllocaIP,
+                                                       bool NeedsBarrier,
+                                                       Value *ChunkSize);
 
   /// Modifies the canonical loop to be a dynamically-scheduled workshare loop.
   ///
@@ -1250,10 +1235,6 @@ class OpenMPIRBuilder {
   /// \param LoopType Information about type of loop worksharing.
   ///                 It corresponds to type of loop workshare OpenMP pragma.
   /// \param NoLoop If true, no-loop code is generated.
-  /// \param HasDistSchedule Defines if the clause being lowered is
-  /// dist_schedule as this is handled slightly differently
-  ///
-  /// \param DistScheduleChunkSize The chunk size for dist_schedule loop
   ///
   /// \returns Point where to insert code after the workshare construct.
   LLVM_ABI InsertPointOrErrorTy applyWorkshareLoop(
@@ -1265,8 +1246,7 @@ class OpenMPIRBuilder {
       bool HasOrderedClause = false,
       omp::WorksharingLoopType LoopType =
           omp::WorksharingLoopType::ForStaticLoop,
-      bool NoLoop = false, bool HasDistSchedule = false,
-      Value *DistScheduleChunkSize = nullptr);
+      bool NoLoop = false);
 
   /// Tile a loop nest.
   ///
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index be3053c34bc4e..26fdbadafe6a9 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -14,7 +14,6 @@
 
 #include "llvm/Frontend/OpenMP/OMPIRBuilder.h"
 #include "llvm/ADT/SmallBitVector.h"
-#include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Analysis/AssumptionCache.h"
@@ -137,8 +136,6 @@ static bool isValidWorkshareLoopScheduleType(OMPScheduleType SchedType) {
   case OMPScheduleType::NomergeOrderedRuntime:
   case OMPScheduleType::NomergeOrderedAuto:
   case OMPScheduleType::NomergeOrderedTrapezoidal:
-  case OMPScheduleType::OrderedDistributeChunked:
-  case OMPScheduleType::OrderedDistribute:
     break;
   default:
     return false;
@@ -185,7 +182,7 @@ static const omp::GV &getGridValue(const Triple &T, Function *Kernel) {
 /// arguments.
 static OMPScheduleType
 getOpenMPBaseScheduleType(llvm::omp::ScheduleKind ClauseKind, bool HasChunks,
-                          bool HasSimdModifier, bool HasDistScheduleChunks) {
+                          bool HasSimdModifier) {
   // Currently, the default schedule it static.
   switch (ClauseKind) {
   case OMP_SCHEDULE_Default:
@@ -202,9 +199,6 @@ getOpenMPBaseScheduleType(llvm::omp::ScheduleKind ClauseKind, bool HasChunks,
   case OMP_SCHEDULE_Runtime:
     return HasSimdModifier ? OMPScheduleType::BaseRuntimeSimd
                            : OMPScheduleType::BaseRuntime;
-  case OMP_SCHEDULE_Distribute:
-    return HasDistScheduleChunks ? OMPScheduleType::BaseDistributeChunked
-                                 : OMPScheduleType::BaseDistribute;
   }
   llvm_unreachable("unhandled schedule clause argument");
 }
@@ -273,10 +267,9 @@ getOpenMPMonotonicityScheduleType(OMPScheduleType ScheduleType,
 static OMPScheduleType
 computeOpenMPScheduleType(ScheduleKind ClauseKind, bool HasChunks,
                           bool HasSimdModifier, bool HasMonotonicModifier,
-                          bool HasNonmonotonicModifier, bool HasOrderedClause,
-                          bool HasDistScheduleChunks) {
-  OMPScheduleType BaseSchedule = getOpenMPBaseScheduleType(
-      ClauseKind, HasChunks, HasSimdModifier, HasDistScheduleChunks);
+                          bool HasNonmonotonicModifier, bool HasOrderedClause) {
+  OMPScheduleType BaseSchedule =
+      getOpenMPBaseScheduleType(ClauseKind, HasChunks, HasSimdModifier);
   OMPScheduleType OrderedSchedule =
       getOpenMPOrderingScheduleType(BaseSchedule, HasOrderedClause);
   OMPScheduleType Result = getOpenMPMonotonicityScheduleType(
@@ -4810,8 +4803,7 @@ static FunctionCallee getKmpcForStaticInitForType(Type *Ty, Module &M,
 
 OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::applyStaticWorkshareLoop(
     DebugLoc DL, CanonicalLoopInfo *CLI, InsertPointTy AllocaIP,
-    WorksharingLoopType LoopType, bool NeedsBarrier, bool HasDistSchedule,
-    OMPScheduleType DistScheduleSchedType) {
+    WorksharingLoopType LoopType, bool NeedsBarrier) {
   assert(CLI->isValid() && "Requires a valid canonical loop");
   assert(!isConflictIP(AllocaIP, CLI->getPreheaderIP()) &&
          "Require dedicated allocate IP");
@@ -4867,29 +4859,15 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::applyStaticWorkshareLoop(
 
   // Call the "init" function and update the trip count of the loop with the
   // value it produced.
-  auto BuildInitCall = [LoopType, SrcLoc, ThreadNum, PLastIter, PLowerBound,
-                        PUpperBound, IVTy, PStride, One, Zero, StaticInit,
-                        this](Value *SchedulingType, auto &Builder) {
-    SmallVector<Value *, 10> Args({SrcLoc, ThreadNum, SchedulingType, PLastIter,
-                                   PLowerBound, PUpperBound});
-    if (LoopType == WorksharingLoopType::DistributeForStaticLoop) {
-      Value *PDistUpperBound =
-          Builder.CreateAlloca(IVTy, nullptr, "p.distupperbound");
-      Args.push_back(PDistUpperBound);
-    }
-    Args.append({PStride, One, Zero});
-    createRuntimeFunctionCall(StaticInit, Args);
-  };
-  BuildInitCall(SchedulingType, Builder);
-  if (HasDistSchedule &&
-      LoopType != WorksharingLoopType::DistributeStaticLoop) {
-    Constant *DistScheduleSchedType = ConstantInt::get(
-        I32Type, static_cast<int>(omp::OMPScheduleType::OrderedDistribute));
-    // We want to emit a second init function call for the dist_schedule clause
-    // to the Distribute construct. This should only be done however if a
-    // Workshare Loop is nested within a Distribute Construct
-    BuildInitCall(DistScheduleSchedType, Builder);
+  SmallVector<Value *, 10> Args(
+      {SrcLoc, ThreadNum, SchedulingType, PLastIter, PLowerBound, PUpperBound});
+  if (LoopType == WorksharingLoopType::DistributeForStaticLoop) {
+    Value *PDistUpperBound =
+        Builder.CreateAlloca(IVTy, nullptr, "p.distupperbound");
+    Args.push_back(PDistUpperBound);
   }
+  Args.append({PStride, One, Zero});
+  createRuntimeFunctionCall(StaticInit, Args);
   Value *LowerBound = Builder.CreateLoad(IVTy, PLowerBound);
   Value *InclusiveUpperBound = Builder.CreateLoad(IVTy, PUpperBound);
   Value *TripCountMinusOne = Builder.CreateSub(InclusiveUpperBound, LowerBound);
@@ -4928,44 +4906,14 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::applyStaticWorkshareLoop(
   return AfterIP;
 }
 
-static void addAccessGroupMetadata(BasicBlock *Block, MDNode *AccessGroup,
-                                   LoopInfo &LI);
-static void addLoopMetadata(CanonicalLoopInfo *Loop,
-                            ArrayRef<Metadata *> Properties);
-
-static void applyParallelAccessesMetadata(CanonicalLoopInfo *CLI,
-                                          LLVMContext &Ctx, Loop *Loop,
-                                          LoopInfo &LoopInfo,
-                                          SmallVector<Metadata *> &LoopMDList) {
-  SmallSet<BasicBlock *, 8> Reachable;
-
-  // Get the basic blocks from the loop in which memref instructions
-  // can be found.
-  // TODO: Generalize getting all blocks inside a CanonicalizeLoopInfo,
-  // preferably without running any passes.
-  for (BasicBlock *Block : Loop->getBlocks()) {
-    if (Block == CLI->getCond() || Block == CLI->getHeader())
-      continue;
-    Reachable.insert(Block);
-  }
-
-  // Add access group metadata to memory-access instructions.
-  MDNode *AccessGroup = MDNode::getDistinct(Ctx, {});
-  for (BasicBlock *BB : Reachable)
-    addAccessGroupMetadata(BB, AccessGroup, LoopInfo);
-  // TODO:  If the loop has existing parallel access metadata, have
-  // to combine two lists.
-  LoopMDList.push_back(MDNode::get(
-      Ctx, {MDString::get(Ctx, "llvm.loop.parallel_accesses"), AccessGroup}));
-}
-
 OpenMPIRBuilder::InsertPointOrErrorTy
-OpenMPIRBuilder::applyStaticChunkedWorkshareLoop(
-    DebugLoc DL, CanonicalLoopInfo *CLI, InsertPointTy AllocaIP,
-    bool NeedsBarrier, Value *ChunkSize, OMPScheduleType SchedType,
-    Value *DistScheduleChunkSize, OMPScheduleType DistScheduleSchedType) {
+OpenMPIRBuilder::applyStaticChunkedWorkshareLoop(DebugLoc DL,
+                                                 CanonicalLoopInfo *CLI,
+                                                 InsertPointTy AllocaIP,
+                                                 bool NeedsBarrier,
+                                                 Value *ChunkSize) {
   assert(CLI->isValid() && "Requires a valid canonical loop");
-  assert(ChunkSize || DistScheduleChunkSize && "Chunk size is required");
+  assert(ChunkSize && "Chunk size is required");
 
   LLVMContext &Ctx = CLI->getFunction()->getContext();
   Value *IV = CLI->getIndVar();
@@ -4979,18 +4927,6 @@ OpenMPIRBuilder::applyStaticChunkedWorkshareLoop(
   Constant *Zero = ConstantInt::get(InternalIVTy, 0);
   Constant *One = ConstantInt::get(InternalIVTy, 1);
 
-  Function *F = CLI->getFunction();
-  FunctionAnalysisManager FAM;
-  FAM.registerPass([]() { return DominatorTreeAnalysis(); });
-  FAM.registerPass([]() { return PassInstrumentationAnalysis(); });
-  LoopAnalysis LIA;
-  LoopInfo &&LI = LIA.run(*F, FAM);
-  Loop *L = LI.getLoopFor(CLI->getHeader());
-  SmallVector<Metadata *> LoopMDList;
-  if (ChunkSize || DistScheduleChunkSize)
-    applyParallelAccessesMetadata(CLI, Ctx, L, LI, LoopMDList);
-  addLoopMetadata(CLI, LoopMDList);
-
   // Declare useful OpenMP runtime functions.
   FunctionCallee StaticInit =
       getKmpcForStaticInitForType(InternalIVTy, M, *this);
@@ -5013,18 +4949,13 @@ OpenMPIRBuilder::applyStaticChunkedWorkshareLoop(
   Builder.SetCurrentDebugLocation(DL);
 
   // TODO: Detect overflow in ubsan or max-out with current tripcount.
-  Value *CastedChunkSize = Builder.CreateZExtOrTrunc(
-      ChunkSize ? ChunkSize : Zero, InternalIVTy, "chunksize");
-  Value *CastedDistScheduleChunkSize = Builder.CreateZExtOrTrunc(
-      DistScheduleChunkSize ? DistScheduleChunkSize : Zero, InternalIVTy,
-      "distschedulechunksize");
+  Value *CastedChunkSize =
+      Builder.CreateZExtOrTrunc(ChunkSize, InternalIVTy, "chunksize");
   Value *CastedTripCount =
       Builder.CreateZExt(OrigTripCount, InternalIVTy, "tripcount");
 
-  Constant *SchedulingType =
-      ConstantInt::get(I32Type, static_cast<int>(SchedType));
-  Constant *DistSchedulingType =
-      ConstantInt::get(I32Type, static_cast<int>(DistScheduleSchedType));
+  Constant *SchedulingType = ConstantInt::get(
+      I32Type, static_cast<int>(OMPScheduleType::UnorderedStaticChunked));
   Builder.CreateStore(Zero, PLowerBound);
   Value *OrigUpperBound = Builder.CreateSub(CastedTripCount, One);
   Builder.CreateStore(OrigUpperBound, PUpperBound);
@@ -5036,26 +4967,12 @@ OpenMPIRBuilder::applyStaticChunkedWorkshareLoop(
   Constant *SrcLocStr = getOrCreateSrcLocStr(DL, SrcLocStrSize);
   Value *SrcLoc = getOrCreateIdent(SrcLocStr, SrcLocStrSize);
   Value *ThreadNum = getOrCreateThreadID(SrcLoc);
-  auto BuildInitCall = [StaticInit, SrcLoc, ThreadNum, PLastIter, PLowerBound,
-                        PUpperBound, PStride, One,
-                        this](Value *SchedulingType, Value *ChunkSize,
-                              auto &Builder) {
-    createRuntimeFunctionCall(
-        StaticInit, {/*loc=*/SrcLoc, /*global_tid=*/ThreadNum,
-                     /*schedtype=*/SchedulingType, /*plastiter=*/PLastIter,
-                     /*plower=*/PLowerBound, /*pupper=*/PUpperBound,
-                     /*pstride=*/PStride, /*incr=*/One,
-                     /*chunk=*/ChunkSize});
-  };
-  BuildInitCall(SchedulingType, CastedChunkSize, Builder);
-  if (DistScheduleSchedType != OMPScheduleType::None &&
-      SchedType != OMPScheduleType::OrderedDistributeChunked &&
-      SchedType != OMPScheduleType::OrderedDistribute) {
-    // We want to emit a second init function call for the dist_schedule clause
-    // to the Distribute construct. This should only be done however if a
-    // Workshare Loop is nested within a Distribute Construct
-    BuildInitCall(DistSchedulingType, CastedDistScheduleChunkSize, Builder);
-  }
+  createRuntimeFunctionCall(
+      StaticInit, {/*loc=*/SrcLoc, /*global_tid=*/ThreadNum,
+                   /*schedtype=*/SchedulingType, /*plastiter=*/PLastIter,
+                   /*plower=*/PLowerBound, /*pupper=*/PUpperBound,
+                   /*pstride=*/PStride, /*incr=*/One,
+                   /*chunk=*/CastedChunkSize});
 
   // Load values written by the "init" function.
   Value *FirstChunkStart =
@@ -5382,47 +5299,31 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::applyWorkshareLoop(
     bool NeedsBarrier, omp::ScheduleKind SchedKind, Value *ChunkSize,
     bool HasSimdModifier, bool HasMonotonicModifier,
     bool HasNonmonotonicModifier, bool HasOrderedClause,
-    WorksharingLoopType LoopType, bool NoLoop, bool HasDistSchedule,
-    Value *DistScheduleChunkSize) {
+    WorksharingLoopType LoopType, bool NoLoop) {
   if (Config.isTargetDevice())
     return...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/169821

