[Openmp-commits] [openmp] [OpenMP] Introduce support for OMPX extensions and taskgraph frontend (PR #66919)

via Openmp-commits openmp-commits at lists.llvm.org
Wed Sep 20 08:36:48 PDT 2023


https://github.com/Munesanz created https://github.com/llvm/llvm-project/pull/66919

This patch introduces initial support for OpenMP experimental directives and clauses through the "ompx" sentinel. The "taskgraph" ompx directive frontend is implemented as a use case to interface with the existing OpenMP host runtime support for record and replay.

The taskgraph directive generates an outlined region where instantiated tasks are recorded and replayed in subsequent executions, using the record & replay mechanism. The "__kmpc_taskgraph" function is implemented in the OpenMP runtime to wrap the existing "__kmpc_start_record_task" and "__kmpc_end_record_task" functions within the associated statement.

The macro "OMPX_TASKGRAPH," previously necessary for building the record and replay mechanism, has been removed, as the record and replay mechanism has been tested in previous patches. The tests have been updated to make use of the new taskgraph directive instead of relying on raw runtime calls.

>From 5c28c1a7a1777351e3b69411b30220983c0ebeba Mon Sep 17 00:00:00 2001
From: Adrian Munera <adrian.munera at bsc.es>
Date: Thu, 14 Sep 2023 10:57:22 +0000
Subject: [PATCH] [OpenMP] Introduce support for OMPX extensions and taskgraph
 frontend

This patch introduces initial support for OpenMP experimental directives and clauses through the "ompx" sentinel. The "taskgraph" ompx directive frontend is implemented as a use case to interface with the existing OpenMP host runtime support for record and replay.
The taskgraph directive generates an outlined region where instantiated tasks are recorded and replayed in subsequent executions, using the record & replay mechanism. The "__kmpc_taskgraph" function is implemented in the OpenMP runtime to wrap the existing "__kmpc_start_record_task" and "__kmpc_end_record_task" functions within the associated statement.
The macro "OMPX_TASKGRAPH," previously necessary for building the record and replay mechanism, has been removed, as the record and replay mechanism has been tested in previous patches. The tests have been updated to make use of the new taskgraph directive instead of relying on raw runtime calls.
---
 clang/include/clang-c/Index.h                 |   6 +-
 clang/include/clang/AST/RecursiveASTVisitor.h |   3 +
 clang/include/clang/AST/StmtOpenMP.h          |  49 ++++++
 .../clang/Basic/DiagnosticParseKinds.td       |   8 +
 clang/include/clang/Basic/StmtNodes.td        |   1 +
 clang/include/clang/Basic/TokenKinds.def      |   3 +
 clang/include/clang/Parse/Parser.h            |  12 +-
 clang/include/clang/Sema/Sema.h               |   4 +
 .../include/clang/Serialization/ASTBitCodes.h |   1 +
 clang/lib/AST/StmtOpenMP.cpp                  |  15 ++
 clang/lib/AST/StmtPrinter.cpp                 |   5 +
 clang/lib/AST/StmtProfile.cpp                 |   4 +
 clang/lib/Basic/OpenMPKinds.cpp               |   3 +
 clang/lib/CodeGen/CGOpenMPRuntime.cpp         |  74 +++++++++
 clang/lib/CodeGen/CGOpenMPRuntime.h           |   8 +
 clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp      |   2 +
 clang/lib/CodeGen/CGStmt.cpp                  |   3 +
 clang/lib/CodeGen/CGStmtOpenMP.cpp            |   6 +
 clang/lib/CodeGen/CodeGenFunction.h           |   1 +
 .../lib/Frontend/PrintPreprocessedOutput.cpp  |   7 +
 clang/lib/Parse/ParseCXXInlineMethods.cpp     |   7 +
 clang/lib/Parse/ParseDecl.cpp                 |   9 ++
 clang/lib/Parse/ParseDeclCXX.cpp              |  24 ++-
 clang/lib/Parse/ParseOpenMP.cpp               |  76 +++++++++-
 clang/lib/Parse/ParsePragma.cpp               |  75 ++++++++-
 clang/lib/Parse/ParseStmt.cpp                 |   1 +
 clang/lib/Parse/Parser.cpp                    |   6 +
 clang/lib/Sema/SemaExceptionSpec.cpp          |   1 +
 clang/lib/Sema/SemaOpenMP.cpp                 |  26 ++++
 clang/lib/Sema/TreeTransform.h                |  11 ++
 clang/lib/Serialization/ASTReaderStmt.cpp     |  10 ++
 clang/lib/Serialization/ASTWriter.cpp         |   1 +
 clang/lib/Serialization/ASTWriterStmt.cpp     |   6 +
 clang/lib/StaticAnalyzer/Core/ExprEngine.cpp  |   1 +
 .../test/OpenMP/ompx_extensions_messages.cpp  |  12 ++
 .../test/OpenMP/ompx_taskgraph_ast_print.cpp  |  34 +++++
 clang/test/OpenMP/ompx_taskgraph_codegen.cpp  |  29 ++++
 clang/tools/libclang/CIndex.cpp               |   2 +
 clang/tools/libclang/CXCursor.cpp             |   3 +
 .../llvm/Frontend/Directive/DirectiveBase.td  |   6 +
 llvm/include/llvm/Frontend/OpenMP/OMP.td      |   5 +
 .../include/llvm/Frontend/OpenMP/OMPKinds.def |   1 +
 llvm/include/llvm/TableGen/DirectiveEmitter.h |   2 +
 llvm/test/TableGen/directive1.td              |  22 +++
 llvm/test/TableGen/directive2.td              |  22 +++
 llvm/utils/TableGen/DirectiveEmitter.cpp      |  72 +++++++++
 openmp/runtime/CMakeLists.txt                 |   5 -
 openmp/runtime/src/kmp.h                      |  15 +-
 openmp/runtime/src/kmp_config.h.cmake         |   2 -
 openmp/runtime/src/kmp_global.cpp             |   4 +-
 openmp/runtime/src/kmp_settings.cpp           |   4 -
 openmp/runtime/src/kmp_taskdeps.cpp           |  16 +-
 openmp/runtime/src/kmp_taskdeps.h             |   4 -
 openmp/runtime/src/kmp_tasking.cpp            | 143 +++++++-----------
 openmp/runtime/test/CMakeLists.txt            |   1 -
 openmp/runtime/test/lit.cfg                   |   3 -
 openmp/runtime/test/lit.site.cfg.in           |   1 -
 .../test/tasking/omp_record_replay.cpp        |   7 +-
 .../test/tasking/omp_record_replay_deps.cpp   |   7 +-
 .../tasking/omp_record_replay_multiTDGs.cpp   |  12 +-
 .../tasking/omp_record_replay_print_dot.cpp   |  12 +-
 .../tasking/omp_record_replay_taskloop.cpp    |   7 +-
 62 files changed, 737 insertions(+), 185 deletions(-)
 create mode 100644 clang/test/OpenMP/ompx_extensions_messages.cpp
 create mode 100644 clang/test/OpenMP/ompx_taskgraph_ast_print.cpp
 create mode 100644 clang/test/OpenMP/ompx_taskgraph_codegen.cpp

diff --git a/clang/include/clang-c/Index.h b/clang/include/clang-c/Index.h
index 1b91feabd584c50..ce5688b82920255 100644
--- a/clang/include/clang-c/Index.h
+++ b/clang/include/clang-c/Index.h
@@ -2140,7 +2140,11 @@ enum CXCursorKind {
    */
   CXCursor_OMPScopeDirective = 306,
 
-  CXCursor_LastStmt = CXCursor_OMPScopeDirective,
+  /** OpenMP taskgraph directive.
+   */
+  CXCursor_OMPTaskgraphDirective = 307,
+
+  CXCursor_LastStmt = CXCursor_OMPTaskgraphDirective,
 
   /**
    * Cursor that represents the translation unit itself.
diff --git a/clang/include/clang/AST/RecursiveASTVisitor.h b/clang/include/clang/AST/RecursiveASTVisitor.h
index d4146d52893ffb1..49fe2ea4070bddf 100644
--- a/clang/include/clang/AST/RecursiveASTVisitor.h
+++ b/clang/include/clang/AST/RecursiveASTVisitor.h
@@ -3027,6 +3027,9 @@ DEF_TRAVERSE_STMT(OMPBarrierDirective,
 DEF_TRAVERSE_STMT(OMPTaskwaitDirective,
                   { TRY_TO(TraverseOMPExecutableDirective(S)); })
 
+DEF_TRAVERSE_STMT(OMPTaskgraphDirective,
+                  { TRY_TO(TraverseOMPExecutableDirective(S)); })
+
 DEF_TRAVERSE_STMT(OMPTaskgroupDirective,
                   { TRY_TO(TraverseOMPExecutableDirective(S)); })
 
diff --git a/clang/include/clang/AST/StmtOpenMP.h b/clang/include/clang/AST/StmtOpenMP.h
index 2725747e051e728..4eba370ad7d362d 100644
--- a/clang/include/clang/AST/StmtOpenMP.h
+++ b/clang/include/clang/AST/StmtOpenMP.h
@@ -2729,6 +2729,55 @@ class OMPTaskwaitDirective : public OMPExecutableDirective {
   }
 };
 
+/// This represents '#pragma ompx taskgraph' directive.
+/// Available with OMPX extensions.
+///
+/// \code
+/// #pragma ompx taskgraph
+/// \endcode
+///
+class OMPTaskgraphDirective : public OMPExecutableDirective {
+  friend class ASTStmtReader;
+  friend class OMPExecutableDirective;
+  /// Build directive with the given start and end location.
+  ///
+  /// \param StartLoc Starting location of the directive kind.
+  /// \param EndLoc Ending location of the directive.
+  ///
+  OMPTaskgraphDirective(SourceLocation StartLoc, SourceLocation EndLoc)
+      : OMPExecutableDirective(OMPTaskgraphDirectiveClass,
+                               llvm::omp::OMPD_taskgraph, StartLoc, EndLoc) {}
+
+  /// Build an empty directive.
+  ///
+  explicit OMPTaskgraphDirective()
+      : OMPExecutableDirective(OMPTaskgraphDirectiveClass,
+                               llvm::omp::OMPD_taskgraph, SourceLocation(),
+                               SourceLocation()) {}
+
+public:
+  /// Creates directive.
+  ///
+  /// \param C AST context.
+  /// \param StartLoc Starting location of the directive kind.
+  /// \param EndLoc Ending Location of the directive.
+  ///
+  static OMPTaskgraphDirective *
+  Create(const ASTContext &C, SourceLocation StartLoc, SourceLocation EndLoc,
+         ArrayRef<OMPClause *> Clauses, Stmt *AssociatedStmt);
+
+  /// Creates an empty directive.
+  ///
+  /// \param C AST context.
+  ///
+  static OMPTaskgraphDirective *CreateEmpty(const ASTContext &C,
+                                            unsigned NumClauses, EmptyShell);
+
+  static bool classof(const Stmt *T) {
+    return T->getStmtClass() == OMPTaskgraphDirectiveClass;
+  }
+};
+
 /// This represents '#pragma omp taskgroup' directive.
 ///
 /// \code
diff --git a/clang/include/clang/Basic/DiagnosticParseKinds.td b/clang/include/clang/Basic/DiagnosticParseKinds.td
index 178761bdcf4d5e3..f96715abeb58105 100644
--- a/clang/include/clang/Basic/DiagnosticParseKinds.td
+++ b/clang/include/clang/Basic/DiagnosticParseKinds.td
@@ -1176,6 +1176,8 @@ def warn_pragma_ms_fenv_access : Warning<
 def warn_pragma_extra_tokens_at_eol : Warning<
   "extra tokens at end of '#pragma %0' - ignored">,
   InGroup<IgnoredPragmas>;
+def err_omp_extension_without_ompx : Error<
+  "Using extension directive '%0' in #pragma omp instead of #pragma ompx">;
 def warn_pragma_expected_comma : Warning<
   "expected ',' in '#pragma %0'">, InGroup<IgnoredPragmas>;
 def warn_pragma_expected_punc : Warning<
@@ -1400,6 +1402,12 @@ def warn_omp_unknown_assumption_clause_missing_id
 def warn_omp_unknown_assumption_clause_without_args
     : Warning<"%0 clause should not be followed by arguments; tokens will be ignored">,
       InGroup<OpenMPClauses>;
+def warn_omp_extension_directive_not_enabled
+    : Warning<"OpenMP Extensions not enabled. Ignoring OpenMP Extension Directive '#pragma ompx %0'">,
+      InGroup<IgnoredPragmas>;
+def warn_omp_extension_clause_not_enabled
+    : Warning<"OpenMP Extensions not enabled. Ignoring OpenMP Extension Clause '%0'">,
+      InGroup<IgnoredPragmas>;
 def note_omp_assumption_clause_continue_here
     : Note<"the ignored tokens spans until here">;
 def err_omp_declare_target_unexpected_clause: Error<
diff --git a/clang/include/clang/Basic/StmtNodes.td b/clang/include/clang/Basic/StmtNodes.td
index cec301dfca2817b..10f3ed16a06259d 100644
--- a/clang/include/clang/Basic/StmtNodes.td
+++ b/clang/include/clang/Basic/StmtNodes.td
@@ -243,6 +243,7 @@ def OMPTaskDirective : StmtNode<OMPExecutableDirective>;
 def OMPTaskyieldDirective : StmtNode<OMPExecutableDirective>;
 def OMPBarrierDirective : StmtNode<OMPExecutableDirective>;
 def OMPTaskwaitDirective : StmtNode<OMPExecutableDirective>;
+def OMPTaskgraphDirective : StmtNode<OMPExecutableDirective>;
 def OMPTaskgroupDirective : StmtNode<OMPExecutableDirective>;
 def OMPFlushDirective : StmtNode<OMPExecutableDirective>;
 def OMPDepobjDirective : StmtNode<OMPExecutableDirective>;
diff --git a/clang/include/clang/Basic/TokenKinds.def b/clang/include/clang/Basic/TokenKinds.def
index 72e8df8c793a7b6..d4e30368a985780 100644
--- a/clang/include/clang/Basic/TokenKinds.def
+++ b/clang/include/clang/Basic/TokenKinds.def
@@ -937,10 +937,13 @@ PRAGMA_ANNOTATION(pragma_opencl_extension)
 // distinguish between a real pragma and a converted pragma. It is not marked
 // as a PRAGMA_ANNOTATION because it doesn't get generated from a #pragma.
 ANNOTATION(attr_openmp)
+ANNOTATION(attr_openmp_extension)
 // The lexer produces these so that they only take effect when the parser
 // handles #pragma omp ... directives.
 PRAGMA_ANNOTATION(pragma_openmp)
 PRAGMA_ANNOTATION(pragma_openmp_end)
+// For support of OpenMP extensions. These tokens handle #pragma ompx ... directives
+PRAGMA_ANNOTATION(pragma_openmp_extension)
 
 // Annotations for loop pragma directives #pragma clang loop ...
 // The lexer produces these so that they only take effect when the parser
diff --git a/clang/include/clang/Parse/Parser.h b/clang/include/clang/Parse/Parser.h
index f599b8b98d031fb..e8c014dfd0bf7a0 100644
--- a/clang/include/clang/Parse/Parser.h
+++ b/clang/include/clang/Parse/Parser.h
@@ -175,6 +175,7 @@ class Parser : public CodeCompletionHandler {
   std::unique_ptr<PragmaHandler> FPContractHandler;
   std::unique_ptr<PragmaHandler> OpenCLExtensionHandler;
   std::unique_ptr<PragmaHandler> OpenMPHandler;
+  std::unique_ptr<PragmaHandler> OpenMPXHandler;
   std::unique_ptr<PragmaHandler> PCSectionHandler;
   std::unique_ptr<PragmaHandler> MSCommentHandler;
   std::unique_ptr<PragmaHandler> MSDetectMismatchHandler;
@@ -2894,7 +2895,8 @@ class Parser : public CodeCompletionHandler {
   }
 
   void ParseOpenMPAttributeArgs(IdentifierInfo *AttrName,
-                                CachedTokens &OpenMPTokens);
+                                CachedTokens &OpenMPTokens,
+                                bool isOpenMPExtension);
 
   void ParseCXX11AttributeSpecifierInternal(ParsedAttributes &Attrs,
                                             CachedTokens &OpenMPTokens,
@@ -3400,6 +3402,14 @@ class Parser : public CodeCompletionHandler {
       const llvm::function_ref<void(CXXScopeSpec &, DeclarationNameInfo)> &
           Callback,
       bool AllowScopeSpecifier);
+
+  /// Check if clause is extension and extensions are enabled.
+  ///
+  /// \param Kind Kind of the clause
+  /// \param Loc Location of the clause
+  ///
+  bool CheckOpenMPClauseExtension(OpenMPClauseKind Kind, SourceLocation Loc);
+
   /// Parses declarative or executable directive.
   ///
   /// \param StmtCtx The context in which we're parsing the directive.
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 47379e00a7445e3..9ef6d2a7c9ca442 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -11651,6 +11651,10 @@ class Sema final {
   StmtResult ActOnOpenMPTaskwaitDirective(ArrayRef<OMPClause *> Clauses,
                                           SourceLocation StartLoc,
                                           SourceLocation EndLoc);
+  /// Called on well-formed '\#pragma ompx taskgraph'.
+  StmtResult ActOnOpenMPTaskgraphDirective(ArrayRef<OMPClause *> Clauses,
+                                           Stmt *AStmt, SourceLocation StartLoc,
+                                           SourceLocation EndLoc);
   /// Called on well-formed '\#pragma omp taskgroup'.
   StmtResult ActOnOpenMPTaskgroupDirective(ArrayRef<OMPClause *> Clauses,
                                            Stmt *AStmt, SourceLocation StartLoc,
diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h
index 9e115f2a5cce3f9..b6cab03730d284e 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -1954,6 +1954,7 @@ enum StmtCode {
   STMT_OMP_ERROR_DIRECTIVE,
   STMT_OMP_BARRIER_DIRECTIVE,
   STMT_OMP_TASKWAIT_DIRECTIVE,
+  STMT_OMP_TASKGRAPH_DIRECTIVE,
   STMT_OMP_FLUSH_DIRECTIVE,
   STMT_OMP_DEPOBJ_DIRECTIVE,
   STMT_OMP_SCAN_DIRECTIVE,
diff --git a/clang/lib/AST/StmtOpenMP.cpp b/clang/lib/AST/StmtOpenMP.cpp
index 426b35848cb5c89..1978017a021b725 100644
--- a/clang/lib/AST/StmtOpenMP.cpp
+++ b/clang/lib/AST/StmtOpenMP.cpp
@@ -804,6 +804,21 @@ OMPTaskwaitDirective *OMPTaskwaitDirective::CreateEmpty(const ASTContext &C,
   return createEmptyDirective<OMPTaskwaitDirective>(C, NumClauses);
 }
 
+OMPTaskgraphDirective *OMPTaskgraphDirective::Create(
+    const ASTContext &C, SourceLocation StartLoc, SourceLocation EndLoc,
+    ArrayRef<OMPClause *> Clauses, Stmt *AssociatedStmt) {
+  auto *Dir = createDirective<OMPTaskgraphDirective>(
+      C, Clauses, AssociatedStmt, /*NumChildren=*/1, StartLoc, EndLoc);
+  return Dir;
+}
+
+OMPTaskgraphDirective *OMPTaskgraphDirective::CreateEmpty(const ASTContext &C,
+                                                          unsigned NumClauses,
+                                                          EmptyShell) {
+  return createEmptyDirective<OMPTaskgraphDirective>(
+      C, NumClauses, /*HasAssociatedStmt=*/true, /*NumChildren=*/1);
+}
+
 OMPTaskgroupDirective *OMPTaskgroupDirective::Create(
     const ASTContext &C, SourceLocation StartLoc, SourceLocation EndLoc,
     ArrayRef<OMPClause *> Clauses, Stmt *AssociatedStmt, Expr *ReductionRef) {
diff --git a/clang/lib/AST/StmtPrinter.cpp b/clang/lib/AST/StmtPrinter.cpp
index a31aa0cfeeed8de..cd1793edf685ab9 100644
--- a/clang/lib/AST/StmtPrinter.cpp
+++ b/clang/lib/AST/StmtPrinter.cpp
@@ -854,6 +854,11 @@ void StmtPrinter::VisitOMPTaskwaitDirective(OMPTaskwaitDirective *Node) {
   PrintOMPExecutableDirective(Node);
 }
 
+void StmtPrinter::VisitOMPTaskgraphDirective(OMPTaskgraphDirective *Node) {
+  Indent() << "#pragma ompx taskgraph";
+  PrintOMPExecutableDirective(Node);
+}
+
 void StmtPrinter::VisitOMPErrorDirective(OMPErrorDirective *Node) {
   Indent() << "#pragma omp error";
   PrintOMPExecutableDirective(Node);
diff --git a/clang/lib/AST/StmtProfile.cpp b/clang/lib/AST/StmtProfile.cpp
index 27f71edd6f99b32..1fd98118c8eaab0 100644
--- a/clang/lib/AST/StmtProfile.cpp
+++ b/clang/lib/AST/StmtProfile.cpp
@@ -1054,6 +1054,10 @@ void StmtProfiler::VisitOMPTaskwaitDirective(const OMPTaskwaitDirective *S) {
   VisitOMPExecutableDirective(S);
 }
 
+void StmtProfiler::VisitOMPTaskgraphDirective(const OMPTaskgraphDirective *S) {
+  VisitOMPExecutableDirective(S);
+}
+
 void StmtProfiler::VisitOMPErrorDirective(const OMPErrorDirective *S) {
   VisitOMPExecutableDirective(S);
 }
diff --git a/clang/lib/Basic/OpenMPKinds.cpp b/clang/lib/Basic/OpenMPKinds.cpp
index 86de067da134a0a..05221ac547ee722 100644
--- a/clang/lib/Basic/OpenMPKinds.cpp
+++ b/clang/lib/Basic/OpenMPKinds.cpp
@@ -834,6 +834,9 @@ void clang::getOpenMPCaptureRegions(
     CaptureRegions.push_back(OMPD_teams);
     CaptureRegions.push_back(OMPD_parallel);
     break;
+  case OMPD_taskgraph:
+    CaptureRegions.push_back(OMPD_taskgraph);
+    break;
   case OMPD_nothing:
     CaptureRegions.push_back(OMPD_nothing);
     break;
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index 92b7c8d4aa546f0..fad9c70a935f672 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -60,6 +60,8 @@ class CGOpenMPRegionInfo : public CodeGenFunction::CGCapturedStmtInfo {
     ParallelOutlinedRegion,
     /// Region with outlined function for standalone 'task' directive.
     TaskOutlinedRegion,
+    /// Region with outlined function for standalone 'taskgraph' directive.
+    TaskgraphOutlinedRegion,
     /// Region for constructs that do not require function outlining,
     /// like 'for', 'sections', 'atomic' etc. directives.
     InlinedRegion,
@@ -234,6 +236,26 @@ class CGOpenMPTaskOutlinedRegionInfo final : public CGOpenMPRegionInfo {
   const UntiedTaskActionTy &Action;
 };
 
+/// API for captured statement code generation in OpenMP taskgraphs.
+class CGOpenMPTaskgraphRegionInfo final : public CGOpenMPRegionInfo {
+public:
+  CGOpenMPTaskgraphRegionInfo(const CapturedStmt &CS,
+                              const RegionCodeGenTy &CodeGen)
+      : CGOpenMPRegionInfo(CS, TaskgraphOutlinedRegion, CodeGen,
+                           llvm::omp::OMPD_taskgraph, false) {}
+
+  const VarDecl *getThreadIDVariable() const override { return 0; }
+
+  /// Get the name of the capture helper.
+  StringRef getHelperName() const override { return "taskgraph.omp_outlined."; }
+
+  static bool classof(const CGCapturedStmtInfo *Info) {
+    return CGOpenMPRegionInfo::classof(Info) &&
+           cast<CGOpenMPRegionInfo>(Info)->getRegionKind() ==
+               TaskgraphOutlinedRegion;
+  }
+};
+
 /// API for inlined captured statement code generation in OpenMP
 /// constructs.
 class CGOpenMPInlinedRegionInfo : public CGOpenMPRegionInfo {
@@ -5779,6 +5801,48 @@ void CGOpenMPRuntime::emitTaskwaitCall(CodeGenFunction &CGF, SourceLocation Loc,
     Region->emitUntiedSwitch(CGF);
 }
 
+void CGOpenMPRuntime::emitTaskgraphCall(CodeGenFunction &CGF,
+                                        SourceLocation Loc,
+                                        const OMPExecutableDirective &D) {
+  if (!CGF.HaveInsertPoint())
+    return;
+
+  // Building kmp_taskgraph_flags_t flags for kmpc_taskgraph. C.f., kmp.h
+  enum {
+    NowaitFlag = 0x1, // Not used yet.
+    ReRecordFlag = 0x2,
+  };
+
+  unsigned Flags = 0;
+
+  CodeGenFunction OutlinedCGF(CGM, true);
+
+  const CapturedStmt *CS = cast<CapturedStmt>(D.getAssociatedStmt());
+
+  auto &&BodyGen = [CS](CodeGenFunction &CGF, PrePostActionTy &) {
+    CGF.EmitStmt(CS->getCapturedStmt());
+  };
+
+  LValue CapStruct = CGF.InitCapturedStruct(*CS);
+  CGOpenMPTaskgraphRegionInfo TaskgraphRegion(*CS, BodyGen);
+  CodeGenFunction::CGCapturedStmtRAII CapInfoRAII(OutlinedCGF,
+                                                  &TaskgraphRegion);
+  llvm::Function *FnT = OutlinedCGF.GenerateCapturedStmtFunction(*CS);
+
+  std::vector<llvm::Value *> Args{
+      emitUpdateLocation(CGF, Loc),
+      getThreadID(CGF, Loc),
+      CGF.Builder.getInt32(Flags),
+      CGF.Builder.getInt32(D.getBeginLoc().getHashValue()),
+      CGF.Builder.CreatePointerBitCastOrAddrSpaceCast(FnT, CGM.VoidPtrTy),
+      CGF.Builder.CreatePointerBitCastOrAddrSpaceCast(
+          CapStruct.getPointer(OutlinedCGF), CGM.VoidPtrTy)};
+
+  CGF.EmitRuntimeCall(OMPBuilder.getOrCreateRuntimeFunction(
+                          CGM.getModule(), OMPRTL___kmpc_taskgraph),
+                      Args);
+}
+
 void CGOpenMPRuntime::emitInlinedDirective(CodeGenFunction &CGF,
                                            OpenMPDirectiveKind InnerKind,
                                            const RegionCodeGenTy &CodeGen,
@@ -6192,6 +6256,7 @@ const Expr *CGOpenMPRuntime::getNumTeamsExprForTargetDirective(
   case OMPD_taskyield:
   case OMPD_barrier:
   case OMPD_taskwait:
+  case OMPD_taskgraph:
   case OMPD_taskgroup:
   case OMPD_atomic:
   case OMPD_flush:
@@ -8955,6 +9020,7 @@ getNestedDistributeDirective(ASTContext &Ctx, const OMPExecutableDirective &D) {
     case OMPD_taskyield:
     case OMPD_barrier:
     case OMPD_taskwait:
+    case OMPD_taskgraph:
     case OMPD_taskgroup:
     case OMPD_atomic:
     case OMPD_flush:
@@ -9814,6 +9880,7 @@ void CGOpenMPRuntime::scanForTargetRegionsFunctions(const Stmt *S,
     case OMPD_taskyield:
     case OMPD_barrier:
     case OMPD_taskwait:
+    case OMPD_taskgraph:
     case OMPD_taskgroup:
     case OMPD_atomic:
     case OMPD_flush:
@@ -10419,6 +10486,7 @@ void CGOpenMPRuntime::emitTargetDataStandAloneCall(
     case OMPD_taskyield:
     case OMPD_barrier:
     case OMPD_taskwait:
+    case OMPD_taskgraph:
     case OMPD_taskgroup:
     case OMPD_atomic:
     case OMPD_flush:
@@ -12156,6 +12224,12 @@ void CGOpenMPSIMDRuntime::emitTaskwaitCall(CodeGenFunction &CGF,
   llvm_unreachable("Not supported in SIMD-only mode");
 }
 
+void CGOpenMPSIMDRuntime::emitTaskgraphCall(CodeGenFunction &CGF,
+                                            SourceLocation Loc,
+                                            const OMPExecutableDirective &D) {
+  llvm_unreachable("Not supported in SIMD-only mode");
+}
+
 void CGOpenMPSIMDRuntime::emitCancellationPointCall(
     CodeGenFunction &CGF, SourceLocation Loc,
     OpenMPDirectiveKind CancelRegion) {
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h b/clang/lib/CodeGen/CGOpenMPRuntime.h
index 74b528d6cd7f8cc..3a81852c5c3596a 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.h
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.h
@@ -1332,6 +1332,10 @@ class CGOpenMPRuntime {
   virtual void emitTaskwaitCall(CodeGenFunction &CGF, SourceLocation Loc,
                                 const OMPTaskDataTy &Data);
 
+  /// Emit code for 'taskgraph' directive.
+  virtual void emitTaskgraphCall(CodeGenFunction &CGF, SourceLocation Loc,
+                                 const OMPExecutableDirective &D);
+
   /// Emit code for 'cancellation point' construct.
   /// \param CancelRegion Region kind for which the cancellation point must be
   /// emitted.
@@ -2137,6 +2141,10 @@ class CGOpenMPSIMDRuntime final : public CGOpenMPRuntime {
   void emitTaskwaitCall(CodeGenFunction &CGF, SourceLocation Loc,
                         const OMPTaskDataTy &Data) override;
 
+  /// Emit code for 'taskgraph' directive.
+  void emitTaskgraphCall(CodeGenFunction &CGF, SourceLocation Loc,
+                         const OMPExecutableDirective &D) override;
+
   /// Emit code for 'cancellation point' construct.
   /// \param CancelRegion Region kind for which the cancellation point must be
   /// emitted.
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index 93819ab815add08..6c14dd918333266 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -618,6 +618,7 @@ static bool hasNestedSPMDDirective(ASTContext &Ctx,
     case OMPD_taskyield:
     case OMPD_barrier:
     case OMPD_taskwait:
+    case OMPD_taskgraph:
     case OMPD_taskgroup:
     case OMPD_atomic:
     case OMPD_flush:
@@ -701,6 +702,7 @@ static bool supportsSPMDExecutionMode(ASTContext &Ctx,
   case OMPD_taskyield:
   case OMPD_barrier:
   case OMPD_taskwait:
+  case OMPD_taskgraph:
   case OMPD_taskgroup:
   case OMPD_atomic:
   case OMPD_flush:
diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp
index 6674aa2409a5947..0ca1e159ce38e09 100644
--- a/clang/lib/CodeGen/CGStmt.cpp
+++ b/clang/lib/CodeGen/CGStmt.cpp
@@ -266,6 +266,9 @@ void CodeGenFunction::EmitStmt(const Stmt *S, ArrayRef<const Attr *> Attrs) {
   case Stmt::OMPTaskwaitDirectiveClass:
     EmitOMPTaskwaitDirective(cast<OMPTaskwaitDirective>(*S));
     break;
+  case Stmt::OMPTaskgraphDirectiveClass:
+    EmitOMPTaskgraphDirective(cast<OMPTaskgraphDirective>(*S));
+    break;
   case Stmt::OMPTaskgroupDirectiveClass:
     EmitOMPTaskgroupDirective(cast<OMPTaskgroupDirective>(*S));
     break;
diff --git a/clang/lib/CodeGen/CGStmtOpenMP.cpp b/clang/lib/CodeGen/CGStmtOpenMP.cpp
index a4e80a4a9e1fd75..78d52190aaa0ceb 100644
--- a/clang/lib/CodeGen/CGStmtOpenMP.cpp
+++ b/clang/lib/CodeGen/CGStmtOpenMP.cpp
@@ -1350,6 +1350,7 @@ void CodeGenFunction::EmitOMPReductionClauseInit(
     case OMPD_error:
     case OMPD_barrier:
     case OMPD_taskwait:
+    case OMPD_taskgraph:
     case OMPD_taskgroup:
     case OMPD_flush:
     case OMPD_depobj:
@@ -5311,6 +5312,11 @@ void CodeGenFunction::EmitOMPTaskwaitDirective(const OMPTaskwaitDirective &S) {
   CGM.getOpenMPRuntime().emitTaskwaitCall(*this, S.getBeginLoc(), Data);
 }
 
+void CodeGenFunction::EmitOMPTaskgraphDirective(
+    const OMPTaskgraphDirective &S) {
+  CGM.getOpenMPRuntime().emitTaskgraphCall(*this, S.getBeginLoc(), S);
+}
+
 bool isSupportedByOpenMPIRBuilder(const OMPTaskgroupDirective &T) {
   return T.clauses().empty();
 }
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index 60f2f21de53ab98..cbc31a612a657f4 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -3540,6 +3540,7 @@ class CodeGenFunction : public CodeGenTypeCache {
   void EmitOMPErrorDirective(const OMPErrorDirective &S);
   void EmitOMPBarrierDirective(const OMPBarrierDirective &S);
   void EmitOMPTaskwaitDirective(const OMPTaskwaitDirective &S);
+  void EmitOMPTaskgraphDirective(const OMPTaskgraphDirective &S);
   void EmitOMPTaskgroupDirective(const OMPTaskgroupDirective &S);
   void EmitOMPFlushDirective(const OMPFlushDirective &S);
   void EmitOMPDepobjDirective(const OMPDepobjDirective &S);
diff --git a/clang/lib/Frontend/PrintPreprocessedOutput.cpp b/clang/lib/Frontend/PrintPreprocessedOutput.cpp
index 1b262d9e6f7cb3b..ff75ecc63cfbe53 100644
--- a/clang/lib/Frontend/PrintPreprocessedOutput.cpp
+++ b/clang/lib/Frontend/PrintPreprocessedOutput.cpp
@@ -1000,7 +1000,13 @@ void clang::DoPrintPreprocessedInput(Preprocessor &PP, raw_ostream *OS,
   std::unique_ptr<UnknownPragmaHandler> OpenMPHandler(
       new UnknownPragmaHandler("#pragma omp", Callbacks,
                                /*RequireTokenExpansion=*/true));
+
+  std::unique_ptr<UnknownPragmaHandler> OpenMPXHandler(
+      new UnknownPragmaHandler("#pragma ompx", Callbacks,
+                               /*RequireTokenExpansion=*/true));
+
   PP.AddPragmaHandler("omp", OpenMPHandler.get());
+  PP.AddPragmaHandler("ompx", OpenMPXHandler.get());
 
   PP.addPPCallbacks(std::unique_ptr<PPCallbacks>(Callbacks));
 
@@ -1037,4 +1043,5 @@ void clang::DoPrintPreprocessedInput(Preprocessor &PP, raw_ostream *OS,
   PP.RemovePragmaHandler("GCC", GCCHandler.get());
   PP.RemovePragmaHandler("clang", ClangHandler.get());
   PP.RemovePragmaHandler("omp", OpenMPHandler.get());
+  PP.RemovePragmaHandler("ompx", OpenMPXHandler.get());
 }
diff --git a/clang/lib/Parse/ParseCXXInlineMethods.cpp b/clang/lib/Parse/ParseCXXInlineMethods.cpp
index 573c90a36eeab36..63efe69fa91280d 100644
--- a/clang/lib/Parse/ParseCXXInlineMethods.cpp
+++ b/clang/lib/Parse/ParseCXXInlineMethods.cpp
@@ -807,6 +807,13 @@ void Parser::ParseLexedPragma(LateParsedPragma &LP) {
     (void)ParseOpenMPDeclarativeDirectiveWithExtDecl(AS, Attrs);
     break;
   }
+  case tok::annot_attr_openmp_extension:
+  case tok::annot_pragma_openmp_extension: {
+    AccessSpecifier AS = LP.getAccessSpecifier();
+    ParsedAttributes Attrs(AttrFactory);
+    (void)ParseOpenMPDeclarativeDirectiveWithExtDecl(AS, Attrs);
+    break;
+  }
   default:
     llvm_unreachable("Unexpected token.");
   }
diff --git a/clang/lib/Parse/ParseDecl.cpp b/clang/lib/Parse/ParseDecl.cpp
index 748b9d53c9f5b33..769267030dbd70e 100644
--- a/clang/lib/Parse/ParseDecl.cpp
+++ b/clang/lib/Parse/ParseDecl.cpp
@@ -4737,6 +4737,15 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,
       continue;
     }
 
+    if (Tok.isOneOf(tok::annot_pragma_openmp_extension,
+                    tok::annot_attr_openmp_extension)) {
+      // Result can be ignored, because it must be always empty.
+      AccessSpecifier AS = AS_none;
+      ParsedAttributes Attrs(AttrFactory);
+      (void)ParseOpenMPDeclarativeDirectiveWithExtDecl(AS, Attrs);
+      continue;
+    }
+
     if (tok::isPragmaAnnotation(Tok.getKind())) {
       Diag(Tok.getLocation(), diag::err_pragma_misplaced_in_decl)
           << DeclSpec::getSpecifierName(
diff --git a/clang/lib/Parse/ParseDeclCXX.cpp b/clang/lib/Parse/ParseDeclCXX.cpp
index 5fe9abb1fdcab30..40587a51a16f242 100644
--- a/clang/lib/Parse/ParseDeclCXX.cpp
+++ b/clang/lib/Parse/ParseDeclCXX.cpp
@@ -2786,7 +2786,7 @@ Parser::ParseCXXClassMemberDeclaration(AccessSpecifier AS,
   // The next token may be an OpenMP pragma annotation token. That would
   // normally be handled from ParseCXXClassMemberDeclarationWithPragmas, but in
   // this case, it came from an *attribute* rather than a pragma. Handle it now.
-  if (Tok.is(tok::annot_attr_openmp))
+  if (Tok.isOneOf(tok::annot_attr_openmp, tok::annot_attr_openmp_extension))
     return ParseOpenMPDeclarativeDirectiveWithExtDecl(AS, DeclAttrs);
 
   if (Tok.is(tok::kw_using)) {
@@ -3419,6 +3419,8 @@ Parser::DeclGroupPtrTy Parser::ParseCXXClassMemberDeclarationWithPragmas(
 
   case tok::annot_attr_openmp:
   case tok::annot_pragma_openmp:
+  case tok::annot_attr_openmp_extension:
+  case tok::annot_pragma_openmp_extension:
     return ParseOpenMPDeclarativeDirectiveWithExtDecl(
         AS, AccessAttrs, /*Delayed=*/true, TagType, TagDecl);
 
@@ -4295,7 +4297,8 @@ Parser::TryParseCXX11AttributeIdentifier(SourceLocation &Loc,
 }
 
 void Parser::ParseOpenMPAttributeArgs(IdentifierInfo *AttrName,
-                                      CachedTokens &OpenMPTokens) {
+                                      CachedTokens &OpenMPTokens,
+                                      bool isOpenMPExtension) {
   // Both 'sequence' and 'directive' attributes require arguments, so parse the
   // open paren for the argument list.
   BalancedDelimiterTracker T(*this, tok::l_paren);
@@ -4310,7 +4313,10 @@ void Parser::ParseOpenMPAttributeArgs(IdentifierInfo *AttrName,
     // pragma directive.
     Token OMPBeginTok;
     OMPBeginTok.startToken();
-    OMPBeginTok.setKind(tok::annot_attr_openmp);
+    if (isOpenMPExtension)
+      OMPBeginTok.setKind(tok::annot_attr_openmp_extension);
+    else
+      OMPBeginTok.setKind(tok::annot_attr_openmp);
     OMPBeginTok.setLocation(Tok.getLocation());
     OpenMPTokens.push_back(OMPBeginTok);
 
@@ -4336,8 +4342,12 @@ void Parser::ParseOpenMPAttributeArgs(IdentifierInfo *AttrName,
 
       // If there is an identifier and it is 'omp', a double colon is required
       // followed by the actual identifier we're after.
-      if (Ident && Ident->isStr("omp") && !ExpectAndConsume(tok::coloncolon))
+      if (Ident && (Ident->isStr("omp") || Ident->isStr("ompx")) &&
+          !ExpectAndConsume(tok::coloncolon)) {
+        if (Ident->isStr("ompx"))
+          isOpenMPExtension = true;
         Ident = TryParseCXX11AttributeIdentifier(IdentLoc);
+      }
 
       // If we failed to find an identifier (scoped or otherwise), or we found
       // an unexpected identifier, diagnose.
@@ -4348,7 +4358,7 @@ void Parser::ParseOpenMPAttributeArgs(IdentifierInfo *AttrName,
       }
       // We read an identifier. If the identifier is one of the ones we
       // expected, we can recurse to parse the args.
-      ParseOpenMPAttributeArgs(Ident, OpenMPTokens);
+      ParseOpenMPAttributeArgs(Ident, OpenMPTokens, isOpenMPExtension);
 
       // There may be a comma to signal that we expect another directive in the
       // sequence.
@@ -4432,12 +4442,12 @@ bool Parser::ParseCXX11AttributeArgs(
     return true;
   }
 
-  if (ScopeName && ScopeName->isStr("omp")) {
+  if (ScopeName && (ScopeName->isStr("omp") || ScopeName->isStr("ompx"))) {
     Diag(AttrNameLoc, getLangOpts().OpenMP >= 51
                           ? diag::warn_omp51_compat_attributes
                           : diag::ext_omp_attributes);
 
-    ParseOpenMPAttributeArgs(AttrName, OpenMPTokens);
+    ParseOpenMPAttributeArgs(AttrName, OpenMPTokens, ScopeName->isStr("ompx"));
 
     // We claim that an attribute was parsed and added so that one is not
     // created for us by the caller.
diff --git a/clang/lib/Parse/ParseOpenMP.cpp b/clang/lib/Parse/ParseOpenMP.cpp
index 605b97617432ed3..ba41b0f0dbaca16 100644
--- a/clang/lib/Parse/ParseOpenMP.cpp
+++ b/clang/lib/Parse/ParseOpenMP.cpp
@@ -1648,7 +1648,7 @@ void Parser::ParseOpenMPClauses(OpenMPDirectiveKind DKind,
     SkipUntil(tok::comma, tok::identifier, tok::annot_pragma_openmp_end,
               StopBeforeMatch);
     FirstClauses[unsigned(CKind)].setInt(true);
-    if (Clause != nullptr)
+    if (Clause != nullptr && CheckOpenMPClauseExtension(CKind, Loc))
       Clauses.push_back(Clause);
     if (Tok.is(tok::annot_pragma_openmp_end)) {
       Actions.EndOpenMPClause();
@@ -2042,8 +2042,14 @@ void Parser::ParseOMPEndDeclareTargetDirective(OpenMPDirectiveKind BeginDKind,
 Parser::DeclGroupPtrTy Parser::ParseOpenMPDeclarativeDirectiveWithExtDecl(
     AccessSpecifier &AS, ParsedAttributes &Attrs, bool Delayed,
     DeclSpec::TST TagType, Decl *Tag) {
-  assert(Tok.isOneOf(tok::annot_pragma_openmp, tok::annot_attr_openmp) &&
+  assert(Tok.isOneOf(tok::annot_pragma_openmp, tok::annot_attr_openmp,
+                     tok::annot_pragma_openmp_extension,
+                     tok::annot_attr_openmp_extension) &&
          "Not an OpenMP directive!");
+
+  bool isOmpx = Tok.isOneOf(tok::annot_pragma_openmp_extension,
+                            tok::annot_attr_openmp_extension);
+
   ParsingOpenMPDirectiveRAII DirScope(*this);
   ParenBraceBracketBalancer BalancerRAIIObj(*this);
 
@@ -2061,7 +2067,9 @@ Parser::DeclGroupPtrTy Parser::ParseOpenMPDeclarativeDirectiveWithExtDecl(
       Toks.push_back(Tok);
       while (Cnt && Tok.isNot(tok::eof)) {
         (void)ConsumeAnyToken();
-        if (Tok.isOneOf(tok::annot_pragma_openmp, tok::annot_attr_openmp))
+        if (Tok.isOneOf(tok::annot_pragma_openmp, tok::annot_attr_openmp,
+                        tok::annot_pragma_openmp_extension,
+                        tok::annot_attr_openmp_extension))
           ++Cnt;
         else if (Tok.is(tok::annot_pragma_openmp_end))
           --Cnt;
@@ -2081,6 +2089,24 @@ Parser::DeclGroupPtrTy Parser::ParseOpenMPDeclarativeDirectiveWithExtDecl(
     DKind = parseOpenMPDirectiveKind(*this);
   }
 
+  // Check if it is extension directive.
+  // Extension directives must have extension directives
+  // enabled and must use the ompx sentinel
+  if (isExtensionDirective(DKind)) {
+    if (!isOmpx)
+      Diag(Loc, diag::err_omp_extension_without_ompx)
+          << getOpenMPDirectiveName(DKind);
+    else if (!getLangOpts().OpenMPExtensions) {
+      Diag(Loc, diag::warn_omp_extension_directive_not_enabled)
+          << getOpenMPDirectiveName(DKind);
+      ConsumeToken();
+      skipUntilPragmaOpenMPEnd(DKind);
+      if (Tok.is(tok::annot_pragma_openmp_end))
+        ConsumeAnnotationToken();
+      return nullptr;
+    }
+  }
+
   switch (DKind) {
   case OMPD_threadprivate: {
     ConsumeToken();
@@ -2299,7 +2325,8 @@ Parser::DeclGroupPtrTy Parser::ParseOpenMPDeclarativeDirectiveWithExtDecl(
     ConsumeAnyToken();
 
     DeclGroupPtrTy Ptr;
-    if (Tok.isOneOf(tok::annot_pragma_openmp, tok::annot_attr_openmp)) {
+    if (Tok.isOneOf(tok::annot_pragma_openmp, tok::annot_attr_openmp,
+                    tok::annot_pragma_openmp_extension)) {
       Ptr = ParseOpenMPDeclarativeDirectiveWithExtDecl(AS, Attrs, Delayed,
                                                        TagType, Tag);
     } else if (Tok.isNot(tok::r_brace) && !isEofOrEom()) {
@@ -2374,6 +2401,7 @@ Parser::DeclGroupPtrTy Parser::ParseOpenMPDeclarativeDirectiveWithExtDecl(
   case OMPD_taskyield:
   case OMPD_barrier:
   case OMPD_taskwait:
+  case OMPD_taskgraph:
   case OMPD_taskgroup:
   case OMPD_flush:
   case OMPD_depobj:
@@ -2491,8 +2519,13 @@ Parser::DeclGroupPtrTy Parser::ParseOpenMPDeclarativeDirectiveWithExtDecl(
 StmtResult Parser::ParseOpenMPDeclarativeOrExecutableDirective(
     ParsedStmtContext StmtCtx, bool ReadDirectiveWithinMetadirective) {
   if (!ReadDirectiveWithinMetadirective)
-    assert(Tok.isOneOf(tok::annot_pragma_openmp, tok::annot_attr_openmp) &&
+    assert(Tok.isOneOf(tok::annot_pragma_openmp, tok::annot_attr_openmp,
+                       tok::annot_pragma_openmp_extension,
+                       tok::annot_attr_openmp_extension) &&
            "Not an OpenMP directive!");
+
+  bool isOmpx = Tok.isOneOf(tok::annot_pragma_openmp_extension,
+                            tok::annot_attr_openmp_extension);
   ParsingOpenMPDirectiveRAII DirScope(*this);
   ParenBraceBracketBalancer BalancerRAIIObj(*this);
   SmallVector<OMPClause *, 5> Clauses;
@@ -2516,6 +2549,24 @@ StmtResult Parser::ParseOpenMPDeclarativeOrExecutableDirective(
   StmtResult Directive = StmtError();
   bool HasAssociatedStatement = true;
 
+  // Check if it is extension directive.
+  // Extension directives must have extension directives
+  // enabled and must use the ompx sentinel
+  if (isExtensionDirective(DKind)) {
+    if (!isOmpx)
+      Diag(Loc, diag::err_omp_extension_without_ompx)
+          << getOpenMPDirectiveName(DKind);
+    else if (!getLangOpts().OpenMPExtensions) {
+      Diag(Loc, diag::warn_omp_extension_directive_not_enabled)
+          << getOpenMPDirectiveName(DKind);
+      ConsumeToken();
+      skipUntilPragmaOpenMPEnd(DKind);
+      if (Tok.is(tok::annot_pragma_openmp_end))
+        ConsumeAnnotationToken();
+      return Directive;
+    }
+  }
+
   switch (DKind) {
   case OMPD_nothing:
     if ((StmtCtx & ParsedStmtContext::AllowStandaloneOpenMPDirectives) ==
@@ -2798,6 +2849,7 @@ StmtResult Parser::ParseOpenMPDeclarativeOrExecutableDirective(
   case OMPD_parallel_master:
   case OMPD_parallel_masked:
   case OMPD_task:
+  case OMPD_taskgraph:
   case OMPD_ordered:
   case OMPD_atomic:
   case OMPD_target:
@@ -3002,6 +3054,16 @@ StmtResult Parser::ParseOpenMPDeclarativeOrExecutableDirective(
   return Directive;
 }
 
+bool Parser::CheckOpenMPClauseExtension(OpenMPClauseKind CKind,
+                                        SourceLocation Loc) {
+  if (!getLangOpts().OpenMPExtensions && isExtensionClause(CKind)) {
+    Diag(Loc, diag::warn_omp_extension_clause_not_enabled)
+        << getOpenMPClauseName(CKind);
+    return false;
+  }
+  return true;
+}
+
 // Parses simple list:
 //   simple-variable-list:
 //         '(' id-expression {, id-expression} ')'
@@ -3184,6 +3246,10 @@ OMPClause *Parser::ParseOpenMPClause(OpenMPDirectiveKind DKind,
     WrongDirective = true;
   }
 
+  // Check if it is an extension
+  if (!CheckOpenMPClauseExtension(CKind, Tok.getLocation()))
+    ErrorFound = true;
+
   switch (CKind) {
   case OMPC_final:
   case OMPC_num_threads:
diff --git a/clang/lib/Parse/ParsePragma.cpp b/clang/lib/Parse/ParsePragma.cpp
index b3178aef64d72d7..515d7b27169b06d 100644
--- a/clang/lib/Parse/ParsePragma.cpp
+++ b/clang/lib/Parse/ParsePragma.cpp
@@ -178,6 +178,18 @@ struct PragmaOpenMPHandler : public PragmaHandler {
                     Token &FirstToken) override;
 };
 
+struct PragmaNoOpenMPXHandler : public PragmaHandler {
+  PragmaNoOpenMPXHandler() : PragmaHandler("ompx") {}
+  void HandlePragma(Preprocessor &PP, PragmaIntroducer Introducer,
+                    Token &FirstToken) override;
+};
+
+struct PragmaOpenMPXHandler : public PragmaHandler {
+  PragmaOpenMPXHandler() : PragmaHandler("ompx") {}
+  void HandlePragma(Preprocessor &PP, PragmaIntroducer Introducer,
+                    Token &FirstToken) override;
+};
+
 /// PragmaCommentHandler - "\#pragma comment ...".
 struct PragmaCommentHandler : public PragmaHandler {
   PragmaCommentHandler(Sema &Actions)
@@ -417,11 +429,15 @@ void Parser::initializePragmaHandlers() {
 
     PP.AddPragmaHandler("OPENCL", FPContractHandler.get());
   }
-  if (getLangOpts().OpenMP)
+  if (getLangOpts().OpenMP) {
     OpenMPHandler = std::make_unique<PragmaOpenMPHandler>();
-  else
+    OpenMPXHandler = std::make_unique<PragmaOpenMPXHandler>();
+  } else {
     OpenMPHandler = std::make_unique<PragmaNoOpenMPHandler>();
+    OpenMPXHandler = std::make_unique<PragmaNoOpenMPXHandler>();
+  }
   PP.AddPragmaHandler(OpenMPHandler.get());
+  PP.AddPragmaHandler(OpenMPXHandler.get());
 
   if (getLangOpts().MicrosoftExt ||
       getTargetInfo().getTriple().isOSBinFormatELF()) {
@@ -540,7 +556,9 @@ void Parser::resetPragmaHandlers() {
     PP.RemovePragmaHandler("OPENCL", FPContractHandler.get());
   }
   PP.RemovePragmaHandler(OpenMPHandler.get());
+  PP.RemovePragmaHandler(OpenMPXHandler.get());
   OpenMPHandler.reset();
+  OpenMPXHandler.reset();
 
   if (getLangOpts().MicrosoftExt ||
       getTargetInfo().getTriple().isOSBinFormatELF()) {
@@ -2663,6 +2681,59 @@ void PragmaOpenMPHandler::HandlePragma(Preprocessor &PP,
                       /*DisableMacroExpansion=*/false, /*IsReinject=*/false);
 }
 
+/// Handle '#pragma ompx ...' when OpenMP is disabled.
+///
+void PragmaNoOpenMPXHandler::HandlePragma(Preprocessor &PP,
+                                          PragmaIntroducer Introducer,
+                                          Token &FirstTok) {
+  if (!PP.getDiagnostics().isIgnored(diag::warn_pragma_omp_ignored,
+                                     FirstTok.getLocation())) {
+    PP.Diag(FirstTok, diag::warn_pragma_omp_ignored);
+    PP.getDiagnostics().setSeverity(diag::warn_pragma_omp_ignored,
+                                    diag::Severity::Ignored, SourceLocation());
+  }
+  PP.DiscardUntilEndOfDirective();
+}
+
+/// Handle '#pragma ompx ...' when OpenMP is enabled.
+///
+void PragmaOpenMPXHandler::HandlePragma(Preprocessor &PP,
+                                        PragmaIntroducer Introducer,
+                                        Token &FirstTok) {
+  SmallVector<Token, 16> Pragma;
+  Token Tok;
+  Tok.startToken();
+  Tok.setKind(tok::annot_pragma_openmp_extension);
+  Tok.setLocation(Introducer.Loc);
+
+  while (Tok.isNot(tok::eod) && Tok.isNot(tok::eof)) {
+    Pragma.push_back(Tok);
+    PP.Lex(Tok);
+    if (Tok.is(tok::annot_pragma_openmp_extension)) {
+      PP.Diag(Tok, diag::err_omp_unexpected_directive) << 0;
+      unsigned InnerPragmaCnt = 1;
+      while (InnerPragmaCnt != 0) {
+        PP.Lex(Tok);
+        if (Tok.is(tok::annot_pragma_openmp_extension))
+          ++InnerPragmaCnt;
+        else if (Tok.is(tok::annot_pragma_openmp_end))
+          --InnerPragmaCnt;
+      }
+      PP.Lex(Tok);
+    }
+  }
+  SourceLocation EodLoc = Tok.getLocation();
+  Tok.startToken();
+  Tok.setKind(tok::annot_pragma_openmp_end);
+  Tok.setLocation(EodLoc);
+  Pragma.push_back(Tok);
+
+  auto Toks = std::make_unique<Token[]>(Pragma.size());
+  std::copy(Pragma.begin(), Pragma.end(), Toks.get());
+  PP.EnterTokenStream(std::move(Toks), Pragma.size(),
+                      /*DisableMacroExpansion=*/false, /*IsReinject=*/false);
+}
+
 /// Handle '#pragma pointers_to_members'
 // The grammar for this pragma is as follows:
 //
diff --git a/clang/lib/Parse/ParseStmt.cpp b/clang/lib/Parse/ParseStmt.cpp
index 2531147c23196ae..0e1220c1594ec14 100644
--- a/clang/lib/Parse/ParseStmt.cpp
+++ b/clang/lib/Parse/ParseStmt.cpp
@@ -466,6 +466,7 @@ StmtResult Parser::ParseStatementOrDeclarationAfterAttributes(
     return HandlePragmaCaptured();
 
   case tok::annot_pragma_openmp:
+  case tok::annot_pragma_openmp_extension:
     // Prohibit attributes that are not OpenMP attributes, but only before
     // processing a #pragma omp clause.
     ProhibitAttributes(CXX11Attrs);
diff --git a/clang/lib/Parse/Parser.cpp b/clang/lib/Parse/Parser.cpp
index 09215b8303ecf9c..cc1741a06654d3f 100644
--- a/clang/lib/Parse/Parser.cpp
+++ b/clang/lib/Parse/Parser.cpp
@@ -309,6 +309,8 @@ bool Parser::SkipUntil(ArrayRef<tok::TokenKind> Toks, SkipUntilFlags Flags) {
       // Ran out of tokens.
       return false;
 
+    case tok::annot_pragma_openmp_extension:
+    case tok::annot_attr_openmp_extension:
     case tok::annot_pragma_openmp:
     case tok::annot_attr_openmp:
     case tok::annot_pragma_openmp_end:
@@ -845,6 +847,10 @@ Parser::ParseExternalDeclaration(ParsedAttributes &Attrs,
     AccessSpecifier AS = AS_none;
     return ParseOpenMPDeclarativeDirectiveWithExtDecl(AS, Attrs);
   }
+  case tok::annot_pragma_openmp_extension: {
+    AccessSpecifier AS = AS_none;
+    return ParseOpenMPDeclarativeDirectiveWithExtDecl(AS, Attrs);
+  }
   case tok::annot_pragma_ms_pointers_to_members:
     HandlePragmaMSPointersToMembers();
     return nullptr;
diff --git a/clang/lib/Sema/SemaExceptionSpec.cpp b/clang/lib/Sema/SemaExceptionSpec.cpp
index 72304e51ea8ef3a..2476a6d7d99e5a4 100644
--- a/clang/lib/Sema/SemaExceptionSpec.cpp
+++ b/clang/lib/Sema/SemaExceptionSpec.cpp
@@ -1499,6 +1499,7 @@ CanThrowResult Sema::canThrow(const Stmt *S) {
   case Stmt::OMPScopeDirectiveClass:
   case Stmt::OMPTaskDirectiveClass:
   case Stmt::OMPTaskgroupDirectiveClass:
+  case Stmt::OMPTaskgraphDirectiveClass:
   case Stmt::OMPTaskLoopDirectiveClass:
   case Stmt::OMPTaskLoopSimdDirectiveClass:
   case Stmt::OMPTaskwaitDirectiveClass:
diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp
index 46eae3596d2a8fe..ef2b292f490799c 100644
--- a/clang/lib/Sema/SemaOpenMP.cpp
+++ b/clang/lib/Sema/SemaOpenMP.cpp
@@ -4316,6 +4316,7 @@ void Sema::ActOnOpenMPRegionStart(OpenMPDirectiveKind DKind, Scope *CurScope) {
   case OMPD_for_simd:
   case OMPD_sections:
   case OMPD_single:
+  case OMPD_taskgraph:
   case OMPD_taskgroup:
   case OMPD_distribute:
   case OMPD_distribute_simd:
@@ -6502,6 +6503,12 @@ StmtResult Sema::ActOnOpenMPExecutableDirective(
            "No associated statement allowed for 'omp taskwait' directive");
     Res = ActOnOpenMPTaskwaitDirective(ClausesWithImplicit, StartLoc, EndLoc);
     break;
+  case OMPD_taskgraph:
+    assert(AStmt != nullptr &&
+           "Associated statement required for 'ompx taskgraph' directive");
+    Res = ActOnOpenMPTaskgraphDirective(ClausesWithImplicit, AStmt, StartLoc,
+                                        EndLoc);
+    break;
   case OMPD_taskgroup:
     Res = ActOnOpenMPTaskgroupDirective(ClausesWithImplicit, AStmt, StartLoc,
                                         EndLoc);
@@ -11351,6 +11358,19 @@ StmtResult Sema::ActOnOpenMPTaskwaitDirective(ArrayRef<OMPClause *> Clauses,
   return OMPTaskwaitDirective::Create(Context, StartLoc, EndLoc, Clauses);
 }
 
+StmtResult Sema::ActOnOpenMPTaskgraphDirective(ArrayRef<OMPClause *> Clauses,
+                                               Stmt *AStmt,
+                                               SourceLocation StartLoc,
+                                               SourceLocation EndLoc) {
+  if (!AStmt)
+    return StmtError();
+
+  assert(isa<CapturedStmt>(AStmt) && "Captured statement expected");
+
+  return OMPTaskgraphDirective::Create(Context, StartLoc, EndLoc, Clauses,
+                                       AStmt);
+}
+
 StmtResult Sema::ActOnOpenMPTaskgroupDirective(ArrayRef<OMPClause *> Clauses,
                                                Stmt *AStmt,
                                                SourceLocation StartLoc,
@@ -15660,6 +15680,7 @@ static OpenMPDirectiveKind getOpenMPCaptureRegionForClause(
     case OMPD_target_teams_distribute:
     case OMPD_distribute_parallel_for:
     case OMPD_task:
+    case OMPD_taskgraph:
     case OMPD_taskloop:
     case OMPD_master_taskloop:
     case OMPD_masked_taskloop:
@@ -15752,6 +15773,7 @@ static OpenMPDirectiveKind getOpenMPCaptureRegionForClause(
     case OMPD_target_teams_distribute_simd:
     case OMPD_cancel:
     case OMPD_task:
+    case OMPD_taskgraph:
     case OMPD_taskloop:
     case OMPD_taskloop_simd:
     case OMPD_master_taskloop:
@@ -15827,6 +15849,7 @@ static OpenMPDirectiveKind getOpenMPCaptureRegionForClause(
     case OMPD_distribute_parallel_for:
     case OMPD_distribute_parallel_for_simd:
     case OMPD_task:
+    case OMPD_taskgraph:
     case OMPD_taskloop:
     case OMPD_taskloop_simd:
     case OMPD_master_taskloop:
@@ -15925,6 +15948,7 @@ static OpenMPDirectiveKind getOpenMPCaptureRegionForClause(
     case OMPD_distribute_parallel_for:
     case OMPD_distribute_parallel_for_simd:
     case OMPD_task:
+    case OMPD_taskgraph:
     case OMPD_taskloop:
     case OMPD_taskloop_simd:
     case OMPD_master_taskloop:
@@ -16009,6 +16033,7 @@ static OpenMPDirectiveKind getOpenMPCaptureRegionForClause(
       // Do not capture schedule-clause expressions.
       break;
     case OMPD_task:
+    case OMPD_taskgraph:
     case OMPD_taskloop:
     case OMPD_taskloop_simd:
     case OMPD_master_taskloop:
@@ -16105,6 +16130,7 @@ static OpenMPDirectiveKind getOpenMPCaptureRegionForClause(
     case OMPD_target_parallel_for_simd:
     case OMPD_target_parallel_for:
     case OMPD_task:
+    case OMPD_taskgraph:
     case OMPD_taskloop:
     case OMPD_taskloop_simd:
     case OMPD_master_taskloop:
diff --git a/clang/lib/Sema/TreeTransform.h b/clang/lib/Sema/TreeTransform.h
index 603a23275889f21..a28c67cdc2983d6 100644
--- a/clang/lib/Sema/TreeTransform.h
+++ b/clang/lib/Sema/TreeTransform.h
@@ -9072,6 +9072,17 @@ TreeTransform<Derived>::TransformOMPTaskwaitDirective(OMPTaskwaitDirective *D) {
   return Res;
 }
 
+template <typename Derived>
+StmtResult TreeTransform<Derived>::TransformOMPTaskgraphDirective(
+    OMPTaskgraphDirective *D) {
+  DeclarationNameInfo DirName;
+  getDerived().getSema().StartOpenMPDSABlock(OMPD_taskgraph, DirName, nullptr,
+                                             D->getBeginLoc());
+  StmtResult Res = getDerived().TransformOMPExecutableDirective(D);
+  getDerived().getSema().EndOpenMPDSABlock(Res.get());
+  return Res;
+}
+
 template <typename Derived>
 StmtResult
 TreeTransform<Derived>::TransformOMPErrorDirective(OMPErrorDirective *D) {
diff --git a/clang/lib/Serialization/ASTReaderStmt.cpp b/clang/lib/Serialization/ASTReaderStmt.cpp
index b9d934983929933..bb63ca572a49b74 100644
--- a/clang/lib/Serialization/ASTReaderStmt.cpp
+++ b/clang/lib/Serialization/ASTReaderStmt.cpp
@@ -2460,6 +2460,11 @@ void ASTStmtReader::VisitOMPTaskwaitDirective(OMPTaskwaitDirective *D) {
   VisitOMPExecutableDirective(D);
 }
 
+void ASTStmtReader::VisitOMPTaskgraphDirective(OMPTaskgraphDirective *D) {
+  VisitStmt(D);
+  VisitOMPExecutableDirective(D);
+}
+
 void ASTStmtReader::VisitOMPErrorDirective(OMPErrorDirective *D) {
   VisitStmt(D);
   // The NumClauses field was read in ReadStmtFromStream.
@@ -3421,6 +3426,11 @@ Stmt *ASTReader::ReadStmtFromStream(ModuleFile &F) {
           Context, Record[ASTStmtReader::NumStmtFields], Empty);
       break;
 
+    case STMT_OMP_TASKGRAPH_DIRECTIVE:
+      S = OMPTaskgraphDirective::CreateEmpty(
+          Context, Record[ASTStmtReader::NumStmtFields], Empty);
+      break;
+
     case STMT_OMP_ERROR_DIRECTIVE:
       S = OMPErrorDirective::CreateEmpty(
           Context, Record[ASTStmtReader::NumStmtFields], Empty);
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index 65bee806d2c5571..237ffc502234d4d 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -4501,6 +4501,7 @@ void ASTWriter::AddToken(const Token &Tok, RecordDataImpl &Record) {
       break;
     }
     // Some annotation tokens do not use the PtrData field.
+    case tok::annot_pragma_openmp_extension:
     case tok::annot_pragma_openmp:
     case tok::annot_pragma_openmp_end:
     case tok::annot_pragma_unused:
diff --git a/clang/lib/Serialization/ASTWriterStmt.cpp b/clang/lib/Serialization/ASTWriterStmt.cpp
index 94d3f9430d27a0a..34059e60de32e96 100644
--- a/clang/lib/Serialization/ASTWriterStmt.cpp
+++ b/clang/lib/Serialization/ASTWriterStmt.cpp
@@ -2442,6 +2442,12 @@ void ASTStmtWriter::VisitOMPTaskwaitDirective(OMPTaskwaitDirective *D) {
   Code = serialization::STMT_OMP_TASKWAIT_DIRECTIVE;
 }
 
+void ASTStmtWriter::VisitOMPTaskgraphDirective(OMPTaskgraphDirective *D) {
+  VisitStmt(D);
+  VisitOMPExecutableDirective(D);
+  Code = serialization::STMT_OMP_TASKGRAPH_DIRECTIVE;
+}
+
 void ASTStmtWriter::VisitOMPErrorDirective(OMPErrorDirective *D) {
   VisitStmt(D);
   Record.push_back(D->getNumClauses());
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
index 0e2ac78f7089c55..e5cf9fc55d22f02 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -1759,6 +1759,7 @@ void ExprEngine::Visit(const Stmt *S, ExplodedNode *Pred,
     case Stmt::OMPTaskyieldDirectiveClass:
     case Stmt::OMPBarrierDirectiveClass:
     case Stmt::OMPTaskwaitDirectiveClass:
+    case Stmt::OMPTaskgraphDirectiveClass:
     case Stmt::OMPErrorDirectiveClass:
     case Stmt::OMPTaskgroupDirectiveClass:
     case Stmt::OMPFlushDirectiveClass:
diff --git a/clang/test/OpenMP/ompx_extensions_messages.cpp b/clang/test/OpenMP/ompx_extensions_messages.cpp
new file mode 100644
index 000000000000000..adf0c73aaf271e2
--- /dev/null
+++ b/clang/test/OpenMP/ompx_extensions_messages.cpp
@@ -0,0 +1,12 @@
+// RUN: %clang_cc1 -verify=expected -fopenmp -fno-openmp-extensions -ferror-limit 100 -o - -std=c++11 %s -Wuninitialized
+
+void bad() {
+  #pragma omp taskgraph //  expected-error {{Using extension directive 'taskgraph' in #pragma omp instead of #pragma ompx}}
+  {} 
+  #pragma ompx taskgraph //  expected-warning {{OpenMP Extensions not enabled. Ignoring OpenMP Extension Directive '#pragma ompx taskgraph'}}
+  {}
+  #pragma omp target ompx_attribute()  //  expected-warning {{OpenMP Extensions not enabled. Ignoring OpenMP Extension Clause 'ompx_attribute'}}
+  {} 
+  #pragma omp target ompx_dyn_cgroup_mem(1024) //  expected-warning {{OpenMP Extensions not enabled. Ignoring OpenMP Extension Clause 'ompx_dyn_cgroup_mem'}}
+  {} 
+}
diff --git a/clang/test/OpenMP/ompx_taskgraph_ast_print.cpp b/clang/test/OpenMP/ompx_taskgraph_ast_print.cpp
new file mode 100644
index 000000000000000..b3a9e2312b9ba13
--- /dev/null
+++ b/clang/test/OpenMP/ompx_taskgraph_ast_print.cpp
@@ -0,0 +1,34 @@
+// RUN: %clang_cc1 -verify -fopenmp -ast-print %s | FileCheck %s
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -emit-pch -o %t %s
+// RUN: %clang_cc1 -fopenmp -std=c++11 -include-pch %t -fsyntax-only -verify %s -ast-print | FileCheck %s
+
+// RUN: %clang_cc1 -verify -fopenmp-simd -ast-print %s | FileCheck %s
+// RUN: %clang_cc1 -fopenmp-simd -x c++ -std=c++11 -emit-pch -o %t %s
+// RUN: %clang_cc1 -fopenmp-simd -std=c++11 -include-pch %t -fsyntax-only -verify %s -ast-print | FileCheck %s
+// expected-no-diagnostics
+
+#ifndef HEADER
+#define HEADER
+
+int main() {
+// CHECK: #pragma ompx taskgraph
+#pragma ompx taskgraph
+{}
+// CHECK: int foo = 0;
+// CHECK-NEXT: #pragma ompx taskgraph
+  int foo = 0;
+#pragma ompx taskgraph
+{
+  foo++;
+}
+// CHECK: #pragma ompx taskgraph
+  for(int i = 0; i < 10; ++i)
+#pragma ompx taskgraph
+{
+  #pragma omp task
+    foo++;
+}
+  return 0;
+}
+
+#endif
\ No newline at end of file
diff --git a/clang/test/OpenMP/ompx_taskgraph_codegen.cpp b/clang/test/OpenMP/ompx_taskgraph_codegen.cpp
new file mode 100644
index 000000000000000..6ddb64153265b13
--- /dev/null
+++ b/clang/test/OpenMP/ompx_taskgraph_codegen.cpp
@@ -0,0 +1,29 @@
+// RUN: %clang_cc1 -fopenmp -x c++ -triple x86_64-unknown-unknown -emit-llvm -fexceptions -fcxx-exceptions -o - %s | FileCheck %s
+
+// RUN: %clang_cc1 -fopenmp-simd -x c++ -triple x86_64-unknown-unknown -emit-llvm -fexceptions -fcxx-exceptions -o - %s | FileCheck --check-prefix SIMD-ONLY0 %s
+// SIMD-ONLY0-NOT: {{__kmpc|__tgt}}
+
+int main() {
+// CHECK: call i32 @__kmpc_global_thread_num
+// CHECK: call void @__kmpc_taskgraph
+// CHECK: @taskgraph.omp_outlined.
+#pragma ompx taskgraph
+{}
+// CHECK: call void @__kmpc_taskgraph
+// CHECK: @taskgraph.omp_outlined..1
+  int foo = 0;
+#pragma ompx taskgraph
+{
+  foo++;
+}
+// CHECK: call void @__kmpc_taskgraph
+// CHECK: @taskgraph.omp_outlined..2
+  for(int i = 0; i < 10; ++i)
+#pragma ompx taskgraph
+{
+  #pragma omp task
+    foo++;
+}
+  return 0;
+}
+
diff --git a/clang/tools/libclang/CIndex.cpp b/clang/tools/libclang/CIndex.cpp
index f0c8ecfcb6264fb..6cd5f203963918b 100644
--- a/clang/tools/libclang/CIndex.cpp
+++ b/clang/tools/libclang/CIndex.cpp
@@ -5977,6 +5977,8 @@ CXString clang_getCursorKindSpelling(enum CXCursorKind Kind) {
     return cxstring::createRef("OMPBarrierDirective");
   case CXCursor_OMPTaskwaitDirective:
     return cxstring::createRef("OMPTaskwaitDirective");
+  case CXCursor_OMPTaskgraphDirective:
+    return cxstring::createRef("OMPTaskgraphDirective");
   case CXCursor_OMPErrorDirective:
     return cxstring::createRef("OMPErrorDirective");
   case CXCursor_OMPTaskgroupDirective:
diff --git a/clang/tools/libclang/CXCursor.cpp b/clang/tools/libclang/CXCursor.cpp
index fd03c48ba1a42aa..2d87a649f926a39 100644
--- a/clang/tools/libclang/CXCursor.cpp
+++ b/clang/tools/libclang/CXCursor.cpp
@@ -719,6 +719,9 @@ CXCursor cxcursor::MakeCXCursor(const Stmt *S, const Decl *Parent,
   case Stmt::OMPTaskwaitDirectiveClass:
     K = CXCursor_OMPTaskwaitDirective;
     break;
+  case Stmt::OMPTaskgraphDirectiveClass:
+    K = CXCursor_OMPTaskgraphDirective;
+    break;
   case Stmt::OMPErrorDirectiveClass:
     K = CXCursor_OMPErrorDirective;
     break;
diff --git a/llvm/include/llvm/Frontend/Directive/DirectiveBase.td b/llvm/include/llvm/Frontend/Directive/DirectiveBase.td
index 4269a966a988d77..7cb5e93e29e67c4 100644
--- a/llvm/include/llvm/Frontend/Directive/DirectiveBase.td
+++ b/llvm/include/llvm/Frontend/Directive/DirectiveBase.td
@@ -113,6 +113,9 @@ class Clause<string c> {
   // Set the prefix as optional.
   // `clause([prefix]: value)`
   bit isPrefixOptional = true;
+
+  // Is extension (i.e. ompx)
+  bit isExtension = false;
 }
 
 // Hold information about clause validity by version.
@@ -154,4 +157,7 @@ class Directive<string d> {
 
   // Set directive used by default when unknown.
   bit isDefault = false;
+
+  // Is extension (i.e. ompx)
+  bit isExtension = false;
 }
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index 1aaddcaf3969dcb..6d867e9923cc114 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -442,6 +442,7 @@ def OMPC_Bind : Clause<"bind"> {
 def OMPC_OMPX_DynCGroupMem : Clause<"ompx_dyn_cgroup_mem"> {
   let clangClass = "OMPXDynCGroupMemClause";
   let flangClass = "ScalarIntExpr";
+  let isExtension = true;
 }
 
 def OMPC_Doacross : Clause<"doacross"> {
@@ -450,6 +451,7 @@ def OMPC_Doacross : Clause<"doacross"> {
 
 def OMPC_OMPX_Attribute : Clause<"ompx_attribute"> {
   let clangClass = "OMPXAttributeClause";
+  let isExtension = true;
 }
 
 //===----------------------------------------------------------------------===//
@@ -595,6 +597,9 @@ def OMP_TaskWait : Directive<"taskwait"> {
     VersionedClause<OMPC_NoWait, 51>
   ];
 }
+def OMP_TaskGraph : Directive<"taskgraph"> {
+  let isExtension = true;
+}
 def OMP_TaskGroup : Directive<"taskgroup"> {
   let allowedClauses = [
     VersionedClause<OMPC_TaskReduction, 50>,
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPKinds.def b/llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
index c4218326280b2ba..e5e8f2541f0f5a4 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
@@ -350,6 +350,7 @@ __OMP_RTL(__kmpc_omp_task_alloc, false, /* kmp_task_t */ VoidPtr, IdentPtr,
           Int32, Int32, SizeTy, SizeTy, TaskRoutineEntryPtr)
 __OMP_RTL(__kmpc_omp_task, false, Int32, IdentPtr, Int32,
           /* kmp_task_t */ VoidPtr)
+__OMP_RTL(__kmpc_taskgraph, false, Void, IdentPtr, Int32, Int32, Int32, VoidPtr, VoidPtr)
 __OMP_RTL(__kmpc_end_taskgroup, false, Void, IdentPtr, Int32)
 __OMP_RTL(__kmpc_taskgroup, false, Void, IdentPtr, Int32)
 __OMP_RTL(__kmpc_omp_task_begin_if0, false, Void, IdentPtr, Int32,
diff --git a/llvm/include/llvm/TableGen/DirectiveEmitter.h b/llvm/include/llvm/TableGen/DirectiveEmitter.h
index 4bca4b13d729ab6..7caa7f9e1b38f90 100644
--- a/llvm/include/llvm/TableGen/DirectiveEmitter.h
+++ b/llvm/include/llvm/TableGen/DirectiveEmitter.h
@@ -93,6 +93,8 @@ class BaseRecord {
 
   bool isDefault() const { return Def->getValueAsBit("isDefault"); }
 
+  bool isExtension() const { return Def->getValueAsBit("isExtension"); }
+
   // Returns the record name.
   StringRef getRecordName() const { return Def->getName(); }
 
diff --git a/llvm/test/TableGen/directive1.td b/llvm/test/TableGen/directive1.td
index b249f2bf5fc6890..00d6c8fa5336c4f 100644
--- a/llvm/test/TableGen/directive1.td
+++ b/llvm/test/TableGen/directive1.td
@@ -98,6 +98,10 @@ def TDL_DirA : Directive<"dira"> {
 // CHECK-EMPTY:
 // CHECK-NEXT:  /// Return true if \p C is a valid clause for \p D in version \p Version.
 // CHECK-NEXT:  bool isAllowedClauseForDirective(Directive D, Clause C, unsigned Version);
+// CHECK-NEXT:  /// Return true if \p C is an extension clause
+// CHECK-NEXT:  bool isExtensionClause(Clause C);
+// CHECK-NEXT:  /// Return true if \p D is an extension directive
+// CHECK-NEXT:  bool isExtensionDirective(Directive D);
 // CHECK-EMPTY:
 // CHECK-NEXT:  AKind getAKind(StringRef);
 // CHECK-NEXT:  llvm::StringRef getTdlAKindName(AKind);
@@ -341,4 +345,22 @@ def TDL_DirA : Directive<"dira"> {
 // IMPL-NEXT:    llvm_unreachable("Invalid Tdl Directive kind");
 // IMPL-NEXT:  }
 // IMPL-EMPTY:
+// IMPL-NEXT:  bool llvm::tdl::isExtensionClause(Clause C) {
+// IMPL-NEXT:    assert(unsigned(C) <= llvm::tdl::Clause_enumSize);
+// IMPL-NEXT:    switch (C) {
+// IMPL-NEXT:      default:
+// IMPL-NEXT:          return false;
+// IMPL-NEXT:    }
+// IMPL-NEXT:    llvm_unreachable("Invalid Tdl Clause kind");
+// IMPL-NEXT:  }
+// IMPL-EMPTY:
+// IMPL-NEXT:  bool llvm::tdl::isExtensionDirective(Directive D) {
+// IMPL-NEXT:    assert(unsigned(D) <= llvm::tdl::Directive_enumSize);
+// IMPL-NEXT:    switch (D) {
+// IMPL-NEXT:      default:
+// IMPL-NEXT:          return false;
+// IMPL-NEXT:    }
+// IMPL-NEXT:    llvm_unreachable("Invalid Tdl Directive kind");
+// IMPL-NEXT:  }
+// IMPL-EMPTY:
 // IMPL-NEXT:  #endif // GEN_DIRECTIVES_IMPL
diff --git a/llvm/test/TableGen/directive2.td b/llvm/test/TableGen/directive2.td
index 154d1e86ffb1d66..52215084f1e8f6a 100644
--- a/llvm/test/TableGen/directive2.td
+++ b/llvm/test/TableGen/directive2.td
@@ -73,6 +73,10 @@ def TDL_DirA : Directive<"dira"> {
 // CHECK-EMPTY:
 // CHECK-NEXT:  /// Return true if \p C is a valid clause for \p D in version \p Version.
 // CHECK-NEXT:  bool isAllowedClauseForDirective(Directive D, Clause C, unsigned Version);
+// CHECK-NEXT:  /// Return true if \p C is an extension clause
+// CHECK-NEXT:  bool isExtensionClause(Clause C);
+// CHECK-NEXT:  /// Return true if \p D is an extension directive
+// CHECK-NEXT:  bool isExtensionDirective(Directive D);
 // CHECK-EMPTY:
 // CHECK-NEXT:  } // namespace tdl
 // CHECK-NEXT:  } // namespace llvm
@@ -271,4 +275,22 @@ def TDL_DirA : Directive<"dira"> {
 // IMPL-NEXT:    llvm_unreachable("Invalid Tdl Directive kind");
 // IMPL-NEXT:  }
 // IMPL-EMPTY:
+// IMPL-NEXT:  bool llvm::tdl::isExtensionClause(Clause C) {
+// IMPL-NEXT:    assert(unsigned(C) <= llvm::tdl::Clause_enumSize);
+// IMPL-NEXT:    switch (C) {
+// IMPL-NEXT:      default:
+// IMPL-NEXT:          return false;
+// IMPL-NEXT:    }
+// IMPL-NEXT:    llvm_unreachable("Invalid Tdl Clause kind");
+// IMPL-NEXT:  }
+// IMPL-EMPTY:
+// IMPL-NEXT:  bool llvm::tdl::isExtensionDirective(Directive D) {
+// IMPL-NEXT:    assert(unsigned(D) <= llvm::tdl::Directive_enumSize);
+// IMPL-NEXT:    switch (D) {
+// IMPL-NEXT:      default:
+// IMPL-NEXT:          return false;
+// IMPL-NEXT:    }
+// IMPL-NEXT:    llvm_unreachable("Invalid Tdl Directive kind");
+// IMPL-NEXT:  }
+// IMPL-EMPTY:
 // IMPL-NEXT:  #endif // GEN_DIRECTIVES_IMPL
diff --git a/llvm/utils/TableGen/DirectiveEmitter.cpp b/llvm/utils/TableGen/DirectiveEmitter.cpp
index 67033c6290ca0e6..1a99082f75f92ae 100644
--- a/llvm/utils/TableGen/DirectiveEmitter.cpp
+++ b/llvm/utils/TableGen/DirectiveEmitter.cpp
@@ -230,6 +230,10 @@ static void EmitDirectivesDecl(RecordKeeper &Records, raw_ostream &OS) {
      << "Version.\n";
   OS << "bool isAllowedClauseForDirective(Directive D, "
      << "Clause C, unsigned Version);\n";
+  OS << "/// Return true if \\p C is an extension clause\n";
+  OS << "bool isExtensionClause(Clause C);\n";
+  OS << "/// Return true if \\p D is an extension directive\n";
+  OS << "bool isExtensionDirective(Directive D);\n";
   OS << "\n";
   if (EnumHelperFuncs.length() > 0) {
     OS << EnumHelperFuncs;
@@ -381,6 +385,68 @@ GenerateCaseForVersionedClauses(const std::vector<Record *> &Clauses,
   }
 }
 
+// Generate the isExtensionClause function implementation.
+static void GenerateIsExtensionClause(const DirectiveLanguage &DirLang,
+                                      raw_ostream &OS) {
+  OS << "\n";
+  OS << "bool llvm::" << DirLang.getCppNamespace()
+     << "::isExtensionClause(Clause C) {\n";
+  OS << "  assert(unsigned(C) <= llvm::" << DirLang.getCppNamespace()
+     << "::Clause_enumSize);\n";
+
+  OS << "  switch (C) {\n";
+
+  bool anyExtensionClause = false;
+  for (const auto &C : DirLang.getClauses()) {
+    Clause Dir{C};
+    if (Dir.isExtension()) {
+      OS << "    case " << DirLang.getClausePrefix() << Dir.getFormattedName()
+         << ":\n";
+      anyExtensionClause = true;
+    }
+  }
+  if (anyExtensionClause) {
+    OS << "      return true;\n";
+  }
+  OS << "    default:\n";
+  OS << "      return false;\n";
+  OS << "  }\n"; // End of clauses switch
+  OS << "  llvm_unreachable(\"Invalid " << DirLang.getName()
+     << " Clause kind\");\n";
+  OS << "}\n"; // End of function isExtensionClause
+}
+
+// Generate the isExtensionDirective function implementation.
+static void GenerateIsExtensionDirective(const DirectiveLanguage &DirLang,
+                                         raw_ostream &OS) {
+  OS << "\n";
+  OS << "bool llvm::" << DirLang.getCppNamespace()
+     << "::isExtensionDirective(Directive D) {\n";
+  OS << "  assert(unsigned(D) <= llvm::" << DirLang.getCppNamespace()
+     << "::Directive_enumSize);\n";
+
+  OS << "  switch (D) {\n";
+
+  bool anyExtensionDirective = false;
+  for (const auto &D : DirLang.getDirectives()) {
+    Directive Dir{D};
+    if (Dir.isExtension()) {
+      OS << "    case " << DirLang.getDirectivePrefix()
+         << Dir.getFormattedName() << ":\n";
+      anyExtensionDirective = true;
+    }
+  }
+  if (anyExtensionDirective) {
+    OS << "      return true;\n";
+  }
+  OS << "    default:\n";
+  OS << "      return false;\n";
+  OS << "  }\n"; // End of clauses switch
+  OS << "  llvm_unreachable(\"Invalid " << DirLang.getName()
+     << " Directive kind\");\n";
+  OS << "}\n"; // End of function isExtensionDirective
+}
+
 // Generate the isAllowedClauseForDirective function implementation.
 static void GenerateIsAllowedClause(const DirectiveLanguage &DirLang,
                                     raw_ostream &OS) {
@@ -875,6 +941,12 @@ void EmitDirectivesBasicImpl(const DirectiveLanguage &DirLang,
 
   // isAllowedClauseForDirective(Directive D, Clause C, unsigned Version)
   GenerateIsAllowedClause(DirLang, OS);
+
+  // isExtensionClause
+  GenerateIsExtensionClause(DirLang, OS);
+
+  // isExtensionDirective
+  GenerateIsExtensionDirective(DirLang, OS);
 }
 
 // Generate the implemenation section for the enumeration in the directive
diff --git a/openmp/runtime/CMakeLists.txt b/openmp/runtime/CMakeLists.txt
index 4441c4babdc07c0..5ac536e629f7824 100644
--- a/openmp/runtime/CMakeLists.txt
+++ b/openmp/runtime/CMakeLists.txt
@@ -347,10 +347,6 @@ if(LIBOMP_OMPD_SUPPORT AND ((NOT LIBOMP_OMPT_SUPPORT) OR (NOT "${CMAKE_SYSTEM_NA
   set(LIBOMP_OMPD_SUPPORT FALSE)
 endif()
 
-# OMPX Taskgraph support
-# Whether to build with OMPX Taskgraph (e.g. task record & replay)
-set(LIBOMP_OMPX_TASKGRAPH FALSE CACHE BOOL "OMPX-taskgraph (task record & replay)?")
-
 # Error check hwloc support after config-ix has run
 if(LIBOMP_USE_HWLOC AND (NOT LIBOMP_HAVE_HWLOC))
   libomp_error_say("Hwloc requested but not available")
@@ -420,7 +416,6 @@ if(${OPENMP_STANDALONE_BUILD})
   libomp_say("Use Adaptive locks   -- ${LIBOMP_USE_ADAPTIVE_LOCKS}")
   libomp_say("Use quad precision   -- ${LIBOMP_USE_QUAD_PRECISION}")
   libomp_say("Use Hwloc library    -- ${LIBOMP_USE_HWLOC}")
-  libomp_say("Use OMPX-taskgraph   -- ${LIBOMP_OMPX_TASKGRAPH}")
 endif()
 
 add_subdirectory(src)
diff --git a/openmp/runtime/src/kmp.h b/openmp/runtime/src/kmp.h
index b931b7ba66416ec..8f9f4c40cc84d02 100644
--- a/openmp/runtime/src/kmp.h
+++ b/openmp/runtime/src/kmp.h
@@ -2529,7 +2529,6 @@ typedef struct {
   } ed;
 } kmp_event_t;
 
-#if OMPX_TASKGRAPH
 // Initial number of allocated nodes while recording
 #define INIT_MAPSIZE 50
 
@@ -2580,11 +2579,10 @@ typedef struct kmp_tdg_info {
 extern int __kmp_tdg_dot;
 extern kmp_int32 __kmp_max_tdgs;
 extern kmp_tdg_info_t **__kmp_global_tdgs;
-extern kmp_int32 __kmp_curr_tdg_idx;
+extern kmp_int32 __kmp_curr_tdg_id;
 extern kmp_int32 __kmp_successors_size;
 extern std::atomic<kmp_int32> __kmp_tdg_task_id;
 extern kmp_int32 __kmp_num_tdg;
-#endif
 
 #ifdef BUILD_TIED_TASK_STACK
 
@@ -2633,12 +2631,8 @@ typedef struct kmp_tasking_flags { /* Total struct must be exactly 32 bits */
   unsigned complete : 1; /* 1==complete, 0==not complete   */
   unsigned freed : 1; /* 1==freed, 0==allocated        */
   unsigned native : 1; /* 1==gcc-compiled task, 0==intel */
-#if OMPX_TASKGRAPH
   unsigned onced : 1; /* 1==ran once already, 0==never ran, record & replay purposes */
   unsigned reserved31 : 6; /* reserved for library use */
-#else
-  unsigned reserved31 : 7; /* reserved for library use */
-#endif
 
 } kmp_tasking_flags_t;
 
@@ -2688,10 +2682,8 @@ struct kmp_taskdata { /* aligned during dynamic allocation       */
 #if OMPT_SUPPORT
   ompt_task_info_t ompt_task_info;
 #endif
-#if OMPX_TASKGRAPH
   bool is_taskgraph = 0; // whether the task is within a TDG
   kmp_tdg_info_t *tdg; // used to associate task with a TDG
-#endif
   kmp_target_data_t td_target_data;
 }; // struct kmp_taskdata
 
@@ -4251,7 +4243,6 @@ KMP_EXPORT void __kmpc_init_nest_lock_with_hint(ident_t *loc, kmp_int32 gtid,
                                                 void **user_lock,
                                                 uintptr_t hint);
 
-#if OMPX_TASKGRAPH
 // Taskgraph's Record & Replay mechanism
 // __kmp_tdg_is_recording: check whether a given TDG is recording
 // status: the tdg's current status
@@ -4264,7 +4255,9 @@ KMP_EXPORT kmp_int32 __kmpc_start_record_task(ident_t *loc, kmp_int32 gtid,
                                               kmp_int32 tdg_id);
 KMP_EXPORT void __kmpc_end_record_task(ident_t *loc, kmp_int32 gtid,
                                        kmp_int32 input_flags, kmp_int32 tdg_id);
-#endif
+KMP_EXPORT void __kmpc_taskgraph(ident_t *loc_ref, kmp_int32 gtid,
+                                 kmp_int32 input_flags, kmp_uint32 tdg_id,
+                                 void (*entry)(void *), void *args);
 /* Interface to fast scalable reduce methods routines */
 
 KMP_EXPORT kmp_int32 __kmpc_reduce_nowait(
diff --git a/openmp/runtime/src/kmp_config.h.cmake b/openmp/runtime/src/kmp_config.h.cmake
index 58bf64112b1a7a7..91bb8a8312e0b98 100644
--- a/openmp/runtime/src/kmp_config.h.cmake
+++ b/openmp/runtime/src/kmp_config.h.cmake
@@ -46,8 +46,6 @@
 #define OMPT_SUPPORT LIBOMP_OMPT_SUPPORT
 #cmakedefine01 LIBOMP_OMPD_SUPPORT
 #define OMPD_SUPPORT LIBOMP_OMPD_SUPPORT
-#cmakedefine01 LIBOMP_OMPX_TASKGRAPH
-#define OMPX_TASKGRAPH LIBOMP_OMPX_TASKGRAPH
 #cmakedefine01 LIBOMP_PROFILING_SUPPORT
 #define OMP_PROFILING_SUPPORT LIBOMP_PROFILING_SUPPORT
 #cmakedefine01 LIBOMP_OMPT_OPTIONAL
diff --git a/openmp/runtime/src/kmp_global.cpp b/openmp/runtime/src/kmp_global.cpp
index 48097fb530d1c66..07e58850602a193 100644
--- a/openmp/runtime/src/kmp_global.cpp
+++ b/openmp/runtime/src/kmp_global.cpp
@@ -559,17 +559,15 @@ int __kmp_nesting_mode = 0;
 int __kmp_nesting_mode_nlevels = 1;
 int *__kmp_nesting_nth_level;
 
-#if OMPX_TASKGRAPH
 // TDG record & replay
 int __kmp_tdg_dot = 0;
 kmp_int32 __kmp_max_tdgs = 100;
 kmp_tdg_info_t **__kmp_global_tdgs = NULL;
-kmp_int32 __kmp_curr_tdg_idx =
+kmp_int32 __kmp_curr_tdg_id =
     0; // Id of the current TDG being recorded or executed
 kmp_int32 __kmp_num_tdg = 0;
 kmp_int32 __kmp_successors_size = 10; // Initial succesor size list for
                                       // recording
 std::atomic<kmp_int32> __kmp_tdg_task_id = 0;
-#endif
 // end of file //
 
diff --git a/openmp/runtime/src/kmp_settings.cpp b/openmp/runtime/src/kmp_settings.cpp
index e731bf45e8eee1f..3d546c818fca9f9 100644
--- a/openmp/runtime/src/kmp_settings.cpp
+++ b/openmp/runtime/src/kmp_settings.cpp
@@ -1252,7 +1252,6 @@ static void __kmp_stg_parse_num_threads(char const *name, char const *value,
   K_DIAG(1, ("__kmp_dflt_team_nth == %d\n", __kmp_dflt_team_nth));
 } // __kmp_stg_parse_num_threads
 
-#if OMPX_TASKGRAPH
 static void __kmp_stg_parse_max_tdgs(char const *name, char const *value,
                                      void *data) {
   __kmp_stg_parse_int(name, value, 0, INT_MAX, &__kmp_max_tdgs);
@@ -1272,7 +1271,6 @@ static void __kmp_stg_print_tdg_dot(kmp_str_buf_t *buffer, char const *name,
                                    void *data) {
   __kmp_stg_print_bool(buffer, name, __kmp_tdg_dot);
 } // __kmp_stg_print_tdg_dot
-#endif
 
 static void __kmp_stg_parse_num_hidden_helper_threads(char const *name,
                                                       char const *value,
@@ -5739,11 +5737,9 @@ static kmp_setting_t __kmp_stg_table[] = {
     {"LIBOMP_NUM_HIDDEN_HELPER_THREADS",
      __kmp_stg_parse_num_hidden_helper_threads,
      __kmp_stg_print_num_hidden_helper_threads, NULL, 0, 0},
-#if OMPX_TASKGRAPH
     {"KMP_MAX_TDGS", __kmp_stg_parse_max_tdgs, __kmp_std_print_max_tdgs, NULL,
      0, 0},
     {"KMP_TDG_DOT", __kmp_stg_parse_tdg_dot, __kmp_stg_print_tdg_dot, NULL, 0, 0},
-#endif
 
 #if OMPT_SUPPORT
     {"OMP_TOOL", __kmp_stg_parse_omp_tool, __kmp_stg_print_omp_tool, NULL, 0,
diff --git a/openmp/runtime/src/kmp_taskdeps.cpp b/openmp/runtime/src/kmp_taskdeps.cpp
index 3b39f503973635b..ce68dd03efe7508 100644
--- a/openmp/runtime/src/kmp_taskdeps.cpp
+++ b/openmp/runtime/src/kmp_taskdeps.cpp
@@ -218,7 +218,6 @@ static kmp_depnode_list_t *__kmp_add_node(kmp_info_t *thread,
 static inline void __kmp_track_dependence(kmp_int32 gtid, kmp_depnode_t *source,
                                           kmp_depnode_t *sink,
                                           kmp_task_t *sink_task) {
-#if OMPX_TASKGRAPH
   kmp_taskdata_t *task_source = KMP_TASK_TO_TASKDATA(source->dn.task);
   kmp_taskdata_t *task_sink = KMP_TASK_TO_TASKDATA(sink_task);
   if (source->dn.task && sink_task) {
@@ -255,7 +254,6 @@ static inline void __kmp_track_dependence(kmp_int32 gtid, kmp_depnode_t *source,
       sink_info->npredecessors++;
     }
   }
-#endif
 #ifdef KMP_SUPPORT_GRAPH_OUTPUT
   kmp_taskdata_t *task_source = KMP_TASK_TO_TASKDATA(source->dn.task);
   // do not use sink->dn.task as that is only filled after the dependences
@@ -294,7 +292,6 @@ __kmp_depnode_link_successor(kmp_int32 gtid, kmp_info_t *thread,
   // link node as successor of list elements
   for (kmp_depnode_list_t *p = plist; p; p = p->next) {
     kmp_depnode_t *dep = p->node;
-#if OMPX_TASKGRAPH
     kmp_tdg_status tdg_status = KMP_TDG_NONE;
     if (task) {
       kmp_taskdata_t *td = KMP_TASK_TO_TASKDATA(task);
@@ -303,13 +300,10 @@ __kmp_depnode_link_successor(kmp_int32 gtid, kmp_info_t *thread,
       if (__kmp_tdg_is_recording(tdg_status))
         __kmp_track_dependence(gtid, dep, node, task);
     }
-#endif
     if (dep->dn.task) {
       KMP_ACQUIRE_DEPNODE(gtid, dep);
       if (dep->dn.task) {
-#if OMPX_TASKGRAPH
         if (!(__kmp_tdg_is_recording(tdg_status)) && task)
-#endif
           __kmp_track_dependence(gtid, dep, node, task);
         dep->dn.successors = __kmp_add_node(thread, dep->dn.successors, node);
         KA_TRACE(40, ("__kmp_process_deps: T#%d adding dependence from %p to "
@@ -332,7 +326,6 @@ static inline kmp_int32 __kmp_depnode_link_successor(kmp_int32 gtid,
   if (!sink)
     return 0;
   kmp_int32 npredecessors = 0;
-#if OMPX_TASKGRAPH
   kmp_tdg_status tdg_status = KMP_TDG_NONE;
   kmp_taskdata_t *td = KMP_TASK_TO_TASKDATA(task);
   if (task) {
@@ -341,21 +334,17 @@ static inline kmp_int32 __kmp_depnode_link_successor(kmp_int32 gtid,
     if (__kmp_tdg_is_recording(tdg_status) && sink->dn.task)
       __kmp_track_dependence(gtid, sink, source, task);
   }
-#endif
   if (sink->dn.task) {
     // synchronously add source to sink' list of successors
     KMP_ACQUIRE_DEPNODE(gtid, sink);
     if (sink->dn.task) {
-#if OMPX_TASKGRAPH
       if (!(__kmp_tdg_is_recording(tdg_status)) && task)
-#endif
         __kmp_track_dependence(gtid, sink, source, task);
       sink->dn.successors = __kmp_add_node(thread, sink->dn.successors, source);
       KA_TRACE(40, ("__kmp_process_deps: T#%d adding dependence from %p to "
                     "%p\n",
                     gtid, KMP_TASK_TO_TASKDATA(sink->dn.task),
                     KMP_TASK_TO_TASKDATA(task)));
-#if OMPX_TASKGRAPH
       if (__kmp_tdg_is_recording(tdg_status)) {
         kmp_taskdata_t *tdd = KMP_TASK_TO_TASKDATA(sink->dn.task);
         if (tdd->is_taskgraph) {
@@ -367,7 +356,6 @@ static inline kmp_int32 __kmp_depnode_link_successor(kmp_int32 gtid,
             npredecessors--;
         }
       }
-#endif
       npredecessors++;
     }
     KMP_RELEASE_DEPNODE(gtid, sink);
@@ -672,7 +660,6 @@ kmp_int32 __kmpc_omp_task_with_deps(ident_t *loc_ref, kmp_int32 gtid,
   kmp_info_t *thread = __kmp_threads[gtid];
   kmp_taskdata_t *current_task = thread->th.th_current_task;
 
-#if OMPX_TASKGRAPH
   // record TDG with deps
   if (new_taskdata->is_taskgraph &&
       __kmp_tdg_is_recording(new_taskdata->tdg->tdg_status)) {
@@ -692,7 +679,7 @@ kmp_int32 __kmpc_omp_task_with_deps(ident_t *loc_ref, kmp_int32 gtid,
 
         __kmp_free(old_record);
 
-        for (kmp_int i = old_size; i < new_size; i++) {
+        for (kmp_uint i = old_size; i < new_size; i++) {
           kmp_int32 *successorsList = (kmp_int32 *)__kmp_allocate(
               __kmp_successors_size * sizeof(kmp_int32));
           new_record[i].task = nullptr;
@@ -713,7 +700,6 @@ kmp_int32 __kmpc_omp_task_with_deps(ident_t *loc_ref, kmp_int32 gtid,
         new_taskdata->td_parent;
     KMP_ATOMIC_INC(&tdg->num_tasks);
   }
-#endif
 #if OMPT_SUPPORT
   if (ompt_enabled.enabled) {
     if (!current_task->ompt_task_info.frame.enter_frame.ptr)
diff --git a/openmp/runtime/src/kmp_taskdeps.h b/openmp/runtime/src/kmp_taskdeps.h
index d2ab515158011a1..a5069e1651b9c5c 100644
--- a/openmp/runtime/src/kmp_taskdeps.h
+++ b/openmp/runtime/src/kmp_taskdeps.h
@@ -93,7 +93,6 @@ extern void __kmpc_give_task(kmp_task_t *ptask, kmp_int32 start);
 
 static inline void __kmp_release_deps(kmp_int32 gtid, kmp_taskdata_t *task) {
 
-#if OMPX_TASKGRAPH
   if (task->is_taskgraph && !(__kmp_tdg_is_recording(task->tdg->tdg_status))) {
     kmp_node_info_t *TaskInfo = &(task->tdg->record_map[task->td_task_id]);
 
@@ -107,7 +106,6 @@ static inline void __kmp_release_deps(kmp_int32 gtid, kmp_taskdata_t *task) {
     }
     return;
   }
-#endif
 
   kmp_info_t *thread = __kmp_threads[gtid];
   kmp_depnode_t *node = task->td_depnode;
@@ -137,10 +135,8 @@ static inline void __kmp_release_deps(kmp_int32 gtid, kmp_taskdata_t *task) {
                 gtid, task));
 
   KMP_ACQUIRE_DEPNODE(gtid, node);
-#if OMPX_TASKGRAPH
   if (!task->is_taskgraph ||
       (task->is_taskgraph && !__kmp_tdg_is_recording(task->tdg->tdg_status)))
-#endif
     node->dn.task =
         NULL; // mark this task as finished, so no new dependencies are generated
   KMP_RELEASE_DEPNODE(gtid, node);
diff --git a/openmp/runtime/src/kmp_tasking.cpp b/openmp/runtime/src/kmp_tasking.cpp
index e8eb6b02650377c..9da5582837d50e6 100644
--- a/openmp/runtime/src/kmp_tasking.cpp
+++ b/openmp/runtime/src/kmp_tasking.cpp
@@ -37,10 +37,9 @@ static void __kmp_alloc_task_deque(kmp_info_t *thread,
 static int __kmp_realloc_task_threads_data(kmp_info_t *thread,
                                            kmp_task_team_t *task_team);
 static void __kmp_bottom_half_finish_proxy(kmp_int32 gtid, kmp_task_t *ptask);
-#if OMPX_TASKGRAPH
+
 static kmp_tdg_info_t *__kmp_find_tdg(kmp_int32 tdg_id);
 int __kmp_taskloop_task(int gtid, void *ptask);
-#endif
 
 #ifdef BUILD_TIED_TASK_STACK
 
@@ -285,11 +284,7 @@ static bool __kmp_task_is_allowed(int gtid, const kmp_int32 is_constrained,
   }
   // Check mutexinoutset dependencies, acquire locks
   kmp_depnode_t *node = tasknew->td_depnode;
-#if OMPX_TASKGRAPH
   if (!tasknew->is_taskgraph && UNLIKELY(node && (node->dn.mtx_num_locks > 0))) {
-#else
-  if (UNLIKELY(node && (node->dn.mtx_num_locks > 0))) {
-#endif
     for (int i = 0; i < node->dn.mtx_num_locks; ++i) {
       KMP_DEBUG_ASSERT(node->dn.mtx_locks[i] != NULL);
       if (__kmp_test_lock(node->dn.mtx_locks[i], gtid))
@@ -896,17 +891,14 @@ static void __kmp_free_task(kmp_int32 gtid, kmp_taskdata_t *taskdata,
   task->data2.priority = 0;
 
   taskdata->td_flags.freed = 1;
-#if OMPX_TASKGRAPH
   // do not free tasks in taskgraph
   if (!taskdata->is_taskgraph) {
-#endif
 // deallocate the taskdata and shared variable blocks associated with this task
 #if USE_FAST_MEMORY
   __kmp_fast_free(thread, taskdata);
 #else /* ! USE_FAST_MEMORY */
   __kmp_thread_free(thread, taskdata);
 #endif
-#if OMPX_TASKGRAPH
   } else {
     taskdata->td_flags.complete = 0;
     taskdata->td_flags.started = 0;
@@ -922,7 +914,6 @@ static void __kmp_free_task(kmp_int32 gtid, kmp_taskdata_t *taskdata,
     // start at one because counts current task and children
     KMP_ATOMIC_ST_RLX(&taskdata->td_allocated_child_tasks, 1);
   }
-#endif
 
   KA_TRACE(20, ("__kmp_free_task: T#%d freed task %p\n", gtid, taskdata));
 }
@@ -1010,10 +1001,8 @@ static bool __kmp_track_children_task(kmp_taskdata_t *taskdata) {
         flags.detachable == TASK_DETACHABLE || flags.hidden_helper;
   ret = ret ||
         KMP_ATOMIC_LD_ACQ(&taskdata->td_parent->td_incomplete_child_tasks) > 0;
-#if OMPX_TASKGRAPH
   if (taskdata->td_taskgroup && taskdata->is_taskgraph)
     ret = ret || KMP_ATOMIC_LD_ACQ(&taskdata->td_taskgroup->count) > 0;
-#endif
   return ret;
 }
 
@@ -1033,10 +1022,8 @@ static void __kmp_task_finish(kmp_int32 gtid, kmp_task_t *task,
   kmp_info_t *thread = __kmp_threads[gtid];
   kmp_task_team_t *task_team =
       thread->th.th_task_team; // might be NULL for serial teams...
-#if OMPX_TASKGRAPH
   // to avoid seg fault when we need to access taskdata->td_flags after free when using vanilla taskloop
   bool is_taskgraph;
-#endif
 #if KMP_DEBUG
   kmp_int32 children = 0;
 #endif
@@ -1046,9 +1033,7 @@ static void __kmp_task_finish(kmp_int32 gtid, kmp_task_t *task,
 
   KMP_DEBUG_ASSERT(taskdata->td_flags.tasktype == TASK_EXPLICIT);
 
-#if OMPX_TASKGRAPH
   is_taskgraph = taskdata->is_taskgraph;
-#endif
 
 // Pop task from stack if tied
 #ifdef BUILD_TIED_TASK_STACK
@@ -1156,9 +1141,7 @@ static void __kmp_task_finish(kmp_int32 gtid, kmp_task_t *task,
 
   if (completed) {
     taskdata->td_flags.complete = 1; // mark the task as completed
-#if OMPX_TASKGRAPH
     taskdata->td_flags.onced = 1; // mark the task as ran once already
-#endif
 
 #if OMPT_SUPPORT
     // This is not a detached task, we are done here
@@ -1175,11 +1158,7 @@ static void __kmp_task_finish(kmp_int32 gtid, kmp_task_t *task,
 #endif
           KMP_ATOMIC_DEC(&taskdata->td_parent->td_incomplete_child_tasks);
       KMP_DEBUG_ASSERT(children >= 0);
-#if OMPX_TASKGRAPH
       if (taskdata->td_taskgroup && !taskdata->is_taskgraph)
-#else
-      if (taskdata->td_taskgroup)
-#endif
         KMP_ATOMIC_DEC(&taskdata->td_taskgroup->count);
     } else if (task_team && (task_team->tt.tt_found_proxy_tasks ||
                              task_team->tt.tt_hidden_helper_task_encountered)) {
@@ -1218,7 +1197,6 @@ static void __kmp_task_finish(kmp_int32 gtid, kmp_task_t *task,
   // KMP_DEBUG_ASSERT( resumed_task->td_flags.executing == 0 );
   resumed_task->td_flags.executing = 1; // resume previous task
 
-#if OMPX_TASKGRAPH
   if (is_taskgraph && __kmp_track_children_task(taskdata) &&
       taskdata->td_taskgroup) {
     // TDG: we only release taskgroup barrier here because
@@ -1229,7 +1207,6 @@ static void __kmp_task_finish(kmp_int32 gtid, kmp_task_t *task,
     // non-TDG implementation because we never reuse a task(data) structure
     KMP_ATOMIC_DEC(&taskdata->td_taskgroup->count);
   }
-#endif
 
   KA_TRACE(
       10, ("__kmp_task_finish(exit): T#%d finished task %p, resuming task %p\n",
@@ -1347,9 +1324,7 @@ void __kmp_init_implicit_task(ident_t *loc_ref, kmp_info_t *this_thr,
   task->td_flags.executing = 1;
   task->td_flags.complete = 0;
   task->td_flags.freed = 0;
-#if OMPX_TASKGRAPH
   task->td_flags.onced = 0;
-#endif
 
   task->td_depnode = NULL;
   task->td_last_tied = task;
@@ -1386,9 +1361,7 @@ void __kmp_finish_implicit_task(kmp_info_t *thread) {
   if (task->td_dephash) {
     int children;
     task->td_flags.complete = 1;
-#if OMPX_TASKGRAPH
     task->td_flags.onced = 1;
-#endif
     children = KMP_ATOMIC_LD_ACQ(&task->td_incomplete_child_tasks);
     kmp_tasking_flags_t flags_old = task->td_flags;
     if (children == 0 && flags_old.complete == 1) {
@@ -1618,9 +1591,7 @@ kmp_task_t *__kmp_task_alloc(ident_t *loc_ref, kmp_int32 gtid,
   taskdata->td_flags.executing = 0;
   taskdata->td_flags.complete = 0;
   taskdata->td_flags.freed = 0;
-#if OMPX_TASKGRAPH
   taskdata->td_flags.onced = 0;
-#endif
   KMP_ATOMIC_ST_RLX(&taskdata->td_incomplete_child_tasks, 0);
   // start at one because counts current task and children
   KMP_ATOMIC_ST_RLX(&taskdata->td_allocated_child_tasks, 1);
@@ -1656,15 +1627,13 @@ kmp_task_t *__kmp_task_alloc(ident_t *loc_ref, kmp_int32 gtid,
     }
   }
 
-#if OMPX_TASKGRAPH
-  kmp_tdg_info_t *tdg = __kmp_find_tdg(__kmp_curr_tdg_idx);
+  kmp_tdg_info_t *tdg = __kmp_find_tdg(__kmp_curr_tdg_id);
   if (tdg && __kmp_tdg_is_recording(tdg->tdg_status) &&
       (task_entry != (kmp_routine_entry_t)__kmp_taskloop_task)) {
     taskdata->is_taskgraph = 1;
-    taskdata->tdg = __kmp_global_tdgs[__kmp_curr_tdg_idx];
+    taskdata->tdg = tdg;
     taskdata->td_task_id = KMP_ATOMIC_INC(&__kmp_tdg_task_id);
   }
-#endif
   KA_TRACE(20, ("__kmp_task_alloc(exit): T#%d created task %p parent=%p\n",
                 gtid, taskdata, taskdata->td_parent));
 
@@ -2008,7 +1977,6 @@ kmp_int32 __kmp_omp_task(kmp_int32 gtid, kmp_task_t *new_task,
                          bool serialize_immediate) {
   kmp_taskdata_t *new_taskdata = KMP_TASK_TO_TASKDATA(new_task);
 
-#if OMPX_TASKGRAPH
   if (new_taskdata->is_taskgraph &&
       __kmp_tdg_is_recording(new_taskdata->tdg->tdg_status)) {
     kmp_tdg_info_t *tdg = new_taskdata->tdg;
@@ -2029,7 +1997,7 @@ kmp_int32 __kmp_omp_task(kmp_int32 gtid, kmp_task_t *new_task,
 
         __kmp_free(old_record);
 
-        for (kmp_int i = old_size; i < new_size; i++) {
+        for (kmp_uint i = old_size; i < new_size; i++) {
           kmp_int32 *successorsList = (kmp_int32 *)__kmp_allocate(
               __kmp_successors_size * sizeof(kmp_int32));
           new_record[i].task = nullptr;
@@ -2053,7 +2021,6 @@ kmp_int32 __kmp_omp_task(kmp_int32 gtid, kmp_task_t *new_task,
       KMP_ATOMIC_INC(&tdg->num_tasks);
     }
   }
-#endif
 
   /* Should we execute the new task or queue it? For now, let's just always try
      to queue it.  If the queue fills up, then we'll execute it.  */
@@ -2570,17 +2537,15 @@ the reduction either does not use omp_orig object, or the omp_orig is accessible
 without help of the runtime library.
 */
 void *__kmpc_task_reduction_init(int gtid, int num, void *data) {
-#if OMPX_TASKGRAPH
-  kmp_tdg_info_t *tdg = __kmp_find_tdg(__kmp_curr_tdg_idx);
+  kmp_tdg_info_t *tdg = __kmp_find_tdg(__kmp_curr_tdg_id);
   if (tdg && __kmp_tdg_is_recording(tdg->tdg_status)) {
-    kmp_tdg_info_t *this_tdg = __kmp_global_tdgs[__kmp_curr_tdg_idx];
+    kmp_tdg_info_t *this_tdg = __kmp_find_tdg(__kmp_curr_tdg_id);
     this_tdg->rec_taskred_data =
         __kmp_allocate(sizeof(kmp_task_red_input_t) * num);
     this_tdg->rec_num_taskred = num;
     KMP_MEMCPY(this_tdg->rec_taskred_data, data,
                sizeof(kmp_task_red_input_t) * num);
   }
-#endif
   return __kmp_task_reduction_init(gtid, num, (kmp_task_red_input_t *)data);
 }
 
@@ -2597,17 +2562,14 @@ Note: this entry supposes the optional compiler-generated initializer routine
 has two parameters, pointer to object to be initialized and pointer to omp_orig
 */
 void *__kmpc_taskred_init(int gtid, int num, void *data) {
-#if OMPX_TASKGRAPH
-  kmp_tdg_info_t *tdg = __kmp_find_tdg(__kmp_curr_tdg_idx);
+  kmp_tdg_info_t *tdg = __kmp_find_tdg(__kmp_curr_tdg_id);
   if (tdg && __kmp_tdg_is_recording(tdg->tdg_status)) {
-    kmp_tdg_info_t *this_tdg = __kmp_global_tdgs[__kmp_curr_tdg_idx];
-    this_tdg->rec_taskred_data =
+    tdg->rec_taskred_data =
         __kmp_allocate(sizeof(kmp_task_red_input_t) * num);
-    this_tdg->rec_num_taskred = num;
-    KMP_MEMCPY(this_tdg->rec_taskred_data, data,
+    tdg->rec_num_taskred = num;
+    KMP_MEMCPY(tdg->rec_taskred_data, data,
                sizeof(kmp_task_red_input_t) * num);
   }
-#endif
   return __kmp_task_reduction_init(gtid, num, (kmp_taskred_input_t *)data);
 }
 
@@ -2654,17 +2616,15 @@ void *__kmpc_task_reduction_get_th_data(int gtid, void *tskgrp, void *data) {
   kmp_int32 num = tg->reduce_num_data;
   kmp_int32 tid = thread->th.th_info.ds.ds_tid;
 
-#if OMPX_TASKGRAPH
   if ((thread->th.th_current_task->is_taskgraph) &&
       (!__kmp_tdg_is_recording(
-          __kmp_global_tdgs[__kmp_curr_tdg_idx]->tdg_status))) {
+          __kmp_find_tdg(__kmp_curr_tdg_id)->tdg_status))) {
     tg = thread->th.th_current_task->td_taskgroup;
     KMP_ASSERT(tg != NULL);
     KMP_ASSERT(tg->reduce_data != NULL);
     arr = (kmp_taskred_data_t *)(tg->reduce_data);
     num = tg->reduce_num_data;
   }
-#endif
 
   KMP_ASSERT(data != NULL);
   while (tg != NULL) {
@@ -4442,9 +4402,7 @@ static void __kmp_first_top_half_finish_proxy(kmp_taskdata_t *taskdata) {
   KMP_DEBUG_ASSERT(taskdata->td_flags.freed == 0);
 
   taskdata->td_flags.complete = 1; // mark the task as completed
-#if OMPX_TASKGRAPH
   taskdata->td_flags.onced = 1;
-#endif
 
   if (taskdata->td_taskgroup)
     KMP_ATOMIC_DEC(&taskdata->td_taskgroup->count);
@@ -4646,11 +4604,8 @@ void __kmp_fulfill_event(kmp_event_t *event) {
 // taskloop_recur: used only when dealing with taskgraph,
 //      indicating whether we need to update task->td_task_id
 // returns:  a pointer to the allocated kmp_task_t structure (task).
-kmp_task_t *__kmp_task_dup_alloc(kmp_info_t *thread, kmp_task_t *task_src
-#if OMPX_TASKGRAPH
-                                 , int taskloop_recur
-#endif
-) {
+kmp_task_t *__kmp_task_dup_alloc(kmp_info_t *thread, kmp_task_t *task_src,
+                                 int taskloop_recur) {
   kmp_task_t *task;
   kmp_taskdata_t *taskdata;
   kmp_taskdata_t *taskdata_src = KMP_TASK_TO_TASKDATA(task_src);
@@ -4678,15 +4633,11 @@ kmp_task_t *__kmp_task_dup_alloc(kmp_info_t *thread, kmp_task_t *task_src
   task = KMP_TASKDATA_TO_TASK(taskdata);
 
   // Initialize new task (only specific fields not affected by memcpy)
-#if OMPX_TASKGRAPH
   if (!taskdata->is_taskgraph || taskloop_recur)
     taskdata->td_task_id = KMP_GEN_TASK_ID();
   else if (taskdata->is_taskgraph &&
            __kmp_tdg_is_recording(taskdata_src->tdg->tdg_status))
     taskdata->td_task_id = KMP_ATOMIC_INC(&__kmp_tdg_task_id);
-#else
-  taskdata->td_task_id = KMP_GEN_TASK_ID();
-#endif
   if (task->shareds != NULL) { // need setup shareds pointer
     shareds_offset = (char *)task_src->shareds - (char *)taskdata_src;
     task->shareds = &((char *)taskdata)[shareds_offset];
@@ -4914,11 +4865,7 @@ void __kmp_taskloop_linear(ident_t *loc, int gtid, kmp_task_t *task,
       }
     }
 
-#if OMPX_TASKGRAPH
     next_task = __kmp_task_dup_alloc(thread, task, /* taskloop_recur */ 0);
-#else
-    next_task = __kmp_task_dup_alloc(thread, task); // allocate new task
-#endif
 
     kmp_taskdata_t *next_taskdata = KMP_TASK_TO_TASKDATA(next_task);
     kmp_taskloop_bounds_t next_task_bounds =
@@ -5116,12 +5063,8 @@ void __kmp_taskloop_recur(ident_t *loc, int gtid, kmp_task_t *task,
   lb1 = ub0 + st;
 
   // create pattern task for 2nd half of the loop
-#if OMPX_TASKGRAPH
   next_task = __kmp_task_dup_alloc(thread, task,
                                    /* taskloop_recur */ 1);
-#else
-  next_task = __kmp_task_dup_alloc(thread, task); // duplicate the task
-#endif
   // adjust lower bound (upper bound is not changed) for the 2nd half
   *(kmp_uint64 *)((char *)next_task + lower_offset) = lb1;
   if (ptask_dup != NULL) // construct firstprivates, etc.
@@ -5154,11 +5097,9 @@ void __kmp_taskloop_recur(ident_t *loc, int gtid, kmp_task_t *task,
   p->codeptr_ra = codeptr_ra;
 #endif
 
-#if OMPX_TASKGRAPH
   kmp_taskdata_t *new_task_data = KMP_TASK_TO_TASKDATA(new_task);
   new_task_data->tdg = taskdata->tdg;
   new_task_data->is_taskgraph = 0;
-#endif
 
 #if OMPT_SUPPORT
   // schedule new task with correct return address for OMPT events
@@ -5199,9 +5140,7 @@ static void __kmp_taskloop(ident_t *loc, int gtid, kmp_task_t *task, int if_val,
     __kmpc_taskgroup(loc, gtid);
   }
 
-#if OMPX_TASKGRAPH
   KMP_ATOMIC_DEC(&__kmp_tdg_task_id);
-#endif
   // =========================================================================
   // calculate loop parameters
   kmp_taskloop_bounds_t task_bounds(task, lb, ub);
@@ -5450,7 +5389,24 @@ bool __kmpc_omp_has_task_team(kmp_int32 gtid) {
   return taskdata->td_task_team != NULL;
 }
 
-#if OMPX_TASKGRAPH
+// __kmpc_taskgraph: record or replay taskgraph
+// loc_ref:     Location of TDG, not used yet
+// gtid:        Global Thread ID of the encountering thread
+// input_flags: Flags associated with the TDG
+// tdg_id:      ID of the TDG to record, for now, incremental integer
+// entry:       Pointer to the entry function
+// args:        Pointer to the function arguments
+void __kmpc_taskgraph(ident_t *loc_ref, kmp_int32 gtid, kmp_int32 input_flags,
+                      kmp_uint32 tdg_id, void (*entry)(void *), void *args) {
+  kmp_int32 res = __kmpc_start_record_task(loc_ref, gtid, input_flags, tdg_id);
+  // When res = 1, we either start recording or only execute tasks
+  // without recording. Need to execute entry function in both cases.
+  if (res)
+    entry(args);
+
+  __kmpc_end_record_task(loc_ref, gtid, input_flags, tdg_id);
+}
+
 // __kmp_find_tdg: identify a TDG through its ID
 // gtid:   Global Thread ID
 // tdg_id: ID of the TDG
@@ -5465,9 +5421,15 @@ static kmp_tdg_info_t *__kmp_find_tdg(kmp_int32 tdg_id) {
     __kmp_global_tdgs = (kmp_tdg_info_t **)__kmp_allocate(
         sizeof(kmp_tdg_info_t *) * __kmp_max_tdgs);
 
-  if ((__kmp_global_tdgs[tdg_id]) &&
-      (__kmp_global_tdgs[tdg_id]->tdg_status != KMP_TDG_NONE))
-    res = __kmp_global_tdgs[tdg_id];
+  for (kmp_int32 i = 0; i < __kmp_num_tdg; ++i) {
+    if ((__kmp_global_tdgs[i]) &&
+        (__kmp_global_tdgs[i]->tdg_id == tdg_id) &&
+        (__kmp_global_tdgs[i]->tdg_status != KMP_TDG_NONE)) {
+      res = __kmp_global_tdgs[i];
+      __kmp_curr_tdg_id = tdg_id;
+      break;
+    }
+  }
   return res;
 }
 
@@ -5475,7 +5437,7 @@ static kmp_tdg_info_t *__kmp_find_tdg(kmp_int32 tdg_id) {
 // tdg:    ID of the TDG
 void __kmp_print_tdg_dot(kmp_tdg_info_t *tdg) {
   kmp_int32 tdg_id = tdg->tdg_id;
-  KA_TRACE(10, ("__kmp_print_tdg_dot(enter): T#%d tdg_id=%d \n", gtid, tdg_id));
+  KA_TRACE(10, ("__kmp_print_tdg_dot(enter): T#%d tdg_id=%d \n", __kmp_get_gtid(), tdg_id));
 
   char file_name[20];
   sprintf(file_name, "tdg_%d.dot", tdg_id);
@@ -5501,10 +5463,10 @@ void __kmp_print_tdg_dot(kmp_tdg_info_t *tdg) {
     }
   }
   fprintf(tdg_file, "}");
-  KA_TRACE(10, ("__kmp_print_tdg_dot(exit): T#%d tdg_id=%d \n", gtid, tdg_id));
+  KA_TRACE(10, ("__kmp_print_tdg_dot(exit): T#%d tdg_id=%d \n", __kmp_get_gtid(), tdg_id));
 }
 
-// __kmp_start_record: launch the execution of a previous
+// __kmp_exec_tdg: launch the execution of a previous
 // recorded TDG
 // gtid:   Global Thread ID
 // tdg:    ID of the TDG
@@ -5566,7 +5528,7 @@ static inline void __kmp_start_record(kmp_int32 gtid,
                                       kmp_int32 tdg_id) {
   kmp_tdg_info_t *tdg =
       (kmp_tdg_info_t *)__kmp_allocate(sizeof(kmp_tdg_info_t));
-  __kmp_global_tdgs[__kmp_curr_tdg_idx] = tdg;
+  __kmp_global_tdgs[__kmp_num_tdg-1] = tdg;
   // Initializing the TDG structure
   tdg->tdg_id = tdg_id;
   tdg->map_size = INIT_MAPSIZE;
@@ -5591,7 +5553,7 @@ static inline void __kmp_start_record(kmp_int32 gtid,
     KMP_ATOMIC_ST_RLX(&this_record_map[i].npredecessors_counter, 0);
   }
 
-  __kmp_global_tdgs[__kmp_curr_tdg_idx]->record_map = this_record_map;
+  tdg->record_map = this_record_map;
 }
 
 // __kmpc_start_record_task: Wrapper around __kmp_start_record to mark
@@ -5625,10 +5587,14 @@ kmp_int32 __kmpc_start_record_task(ident_t *loc_ref, kmp_int32 gtid,
     __kmp_exec_tdg(gtid, tdg);
     res = 0;
   } else {
-    __kmp_curr_tdg_idx = tdg_id;
-    KMP_DEBUG_ASSERT(__kmp_curr_tdg_idx < __kmp_max_tdgs);
-    __kmp_start_record(gtid, flags, tdg_id);
-    __kmp_num_tdg++;
+    if (__kmp_num_tdg < __kmp_max_tdgs) {
+      __kmp_curr_tdg_id = tdg_id;
+      __kmp_num_tdg++;
+      KMP_DEBUG_ASSERT(__kmp_num_tdg <= __kmp_max_tdgs);
+      __kmp_start_record(gtid, flags, tdg_id);
+    }
+    // if no TDG found, need to execute the task
+    // even not recording
     res = 1;
   }
   KA_TRACE(10, ("__kmpc_start_record_task(exit): T#%d TDG %d starts to %s\n",
@@ -5701,5 +5667,4 @@ void __kmpc_end_record_task(ident_t *loc_ref, kmp_int32 gtid,
   KA_TRACE(10, ("__kmpc_end_record_task(exit): T#%d loc=%p finished recording"
                 " tdg=%d, its status is now READY\n",
                 gtid, loc_ref, tdg_id));
-}
-#endif
+}
\ No newline at end of file
diff --git a/openmp/runtime/test/CMakeLists.txt b/openmp/runtime/test/CMakeLists.txt
index a7790804542b7ee..05b517fb920fdc7 100644
--- a/openmp/runtime/test/CMakeLists.txt
+++ b/openmp/runtime/test/CMakeLists.txt
@@ -30,7 +30,6 @@ update_test_compiler_features()
 pythonize_bool(LIBOMP_USE_HWLOC)
 pythonize_bool(LIBOMP_OMPT_SUPPORT)
 pythonize_bool(LIBOMP_OMPT_OPTIONAL)
-pythonize_bool(LIBOMP_OMPX_TASKGRAPH)
 pythonize_bool(LIBOMP_HAVE_LIBM)
 pythonize_bool(LIBOMP_HAVE_LIBATOMIC)
 pythonize_bool(OPENMP_STANDALONE_BUILD)
diff --git a/openmp/runtime/test/lit.cfg b/openmp/runtime/test/lit.cfg
index 650d3853e851112..331be0431774660 100644
--- a/openmp/runtime/test/lit.cfg
+++ b/openmp/runtime/test/lit.cfg
@@ -103,9 +103,6 @@ if config.has_ompt:
     # for callback.h
     config.test_flags += " -I " + config.test_source_root + "/ompt"
 
-if config.has_ompx_taskgraph:
-    config.available_features.add("ompx_taskgraph")
-
 if 'Linux' in config.operating_system:
     config.available_features.add("linux")
 
diff --git a/openmp/runtime/test/lit.site.cfg.in b/openmp/runtime/test/lit.site.cfg.in
index d6c259280619be9..45a18b480130f6a 100644
--- a/openmp/runtime/test/lit.site.cfg.in
+++ b/openmp/runtime/test/lit.site.cfg.in
@@ -15,7 +15,6 @@ config.operating_system = "@CMAKE_SYSTEM_NAME@"
 config.hwloc_library_dir = "@LIBOMP_HWLOC_LIBRARY_DIR@"
 config.using_hwloc = @LIBOMP_USE_HWLOC@
 config.has_ompt = @LIBOMP_OMPT_SUPPORT@ and @LIBOMP_OMPT_OPTIONAL@
-config.has_ompx_taskgraph = @LIBOMP_OMPX_TASKGRAPH@
 config.has_libm = @LIBOMP_HAVE_LIBM@
 config.has_libatomic = @LIBOMP_HAVE_LIBATOMIC@
 config.is_standalone_build = @OPENMP_STANDALONE_BUILD@
diff --git a/openmp/runtime/test/tasking/omp_record_replay.cpp b/openmp/runtime/test/tasking/omp_record_replay.cpp
index 69ad98003a0d699..54e8090c486ad54 100644
--- a/openmp/runtime/test/tasking/omp_record_replay.cpp
+++ b/openmp/runtime/test/tasking/omp_record_replay.cpp
@@ -1,4 +1,3 @@
-// REQUIRES: ompx_taskgraph
 // RUN: %libomp-cxx-compile-and-run
 #include <iostream>
 #include <cassert>
@@ -29,14 +28,12 @@ int main() {
   #pragma omp parallel
   #pragma omp single
   for (int iter = 0; iter < NT; ++iter) {
-    int gtid = __kmpc_global_thread_num(nullptr);
-    int res =  __kmpc_start_record_task(nullptr, gtid, /* kmp_tdg_flags */ 0, /* tdg_id */0);
-    if (res) {
+    #pragma ompx taskgraph
+    {
       num_tasks++;
       #pragma omp task 
       func(&num_exec);
     }
-    __kmpc_end_record_task(nullptr, gtid, /* kmp_tdg_flags */0, /* tdg_id */0);
   }
 
   assert(num_tasks==1);
diff --git a/openmp/runtime/test/tasking/omp_record_replay_deps.cpp b/openmp/runtime/test/tasking/omp_record_replay_deps.cpp
index 9b6b370b30efc15..c370ad34b5528bf 100644
--- a/openmp/runtime/test/tasking/omp_record_replay_deps.cpp
+++ b/openmp/runtime/test/tasking/omp_record_replay_deps.cpp
@@ -1,4 +1,3 @@
-// REQUIRES: ompx_taskgraph
 // RUN: %libomp-cxx-compile-and-run
 #include <iostream>
 #include <cassert>
@@ -43,9 +42,8 @@ int main() {
   #pragma omp parallel
   #pragma omp single
   for (int iter = 0; iter < NT; ++iter) {
-    int gtid = __kmpc_global_thread_num(nullptr);
-    int res =  __kmpc_start_record_task(nullptr, gtid, /* kmp_tdg_flags */0, /* tdg_id */0);
-    if (res) {
+    #pragma ompx taskgraph
+    {
       #pragma omp task depend(out:y)
       add();
       #pragma omp task depend(out:x)
@@ -53,7 +51,6 @@ int main() {
       #pragma omp task depend(in:x,y)
       mult();
     }
-    __kmpc_end_record_task(nullptr, gtid, /* kmp_tdg_flags */0, /* tdg_id */0);
   }
   assert(val==0);
 
diff --git a/openmp/runtime/test/tasking/omp_record_replay_multiTDGs.cpp b/openmp/runtime/test/tasking/omp_record_replay_multiTDGs.cpp
index 03252843689c401..282625ddb47826c 100644
--- a/openmp/runtime/test/tasking/omp_record_replay_multiTDGs.cpp
+++ b/openmp/runtime/test/tasking/omp_record_replay_multiTDGs.cpp
@@ -1,4 +1,3 @@
-// REQUIRES: ompx_taskgraph
 // RUN: %libomp-cxx-compile-and-run
 #include <iostream>
 #include <cassert>
@@ -42,9 +41,8 @@ int main() {
   #pragma omp parallel
   #pragma omp single
   for (int iter = 0; iter < NT; ++iter) {
-    int gtid = __kmpc_global_thread_num(nullptr);
-    int res =  __kmpc_start_record_task(nullptr, gtid, /* kmp_tdg_flags */ 0, /* tdg_id */0);
-    if (res) {
+    #pragma ompx taskgraph
+    {
       num_tasks++;
       #pragma omp task depend(out:y)
       add();
@@ -53,9 +51,8 @@ int main() {
       #pragma omp task depend(in:x,y)
       mult();
     }
-    __kmpc_end_record_task(nullptr, gtid, /* kmp_tdg_flags */0, /* tdg_id */0);
-    res =  __kmpc_start_record_task(nullptr, gtid, /* kmp_tdg_flags */ 0, /* tdg_id */1);
-    if (res) {
+    #pragma ompx taskgraph
+    {
       num_tasks++;
       #pragma omp task depend(out:y)
       add();
@@ -64,7 +61,6 @@ int main() {
       #pragma omp task depend(in:x,y)
       mult();
     }
-    __kmpc_end_record_task(nullptr, gtid, /* kmp_tdg_flags */0, /* tdg_id */1);
   }
 
   assert(num_tasks==2);
diff --git a/openmp/runtime/test/tasking/omp_record_replay_print_dot.cpp b/openmp/runtime/test/tasking/omp_record_replay_print_dot.cpp
index 2fe55f081542903..522068c359e6a59 100644
--- a/openmp/runtime/test/tasking/omp_record_replay_print_dot.cpp
+++ b/openmp/runtime/test/tasking/omp_record_replay_print_dot.cpp
@@ -1,4 +1,3 @@
-// REQUIRES: ompx_taskgraph
 // RUN: %libomp-cxx-compile-and-run
 #include <iostream>
 #include <fstream>
@@ -26,7 +25,7 @@ void func(int *num_exec) {
 std::string tdg_string= "digraph TDG {\n"
 "   compound=true\n"
 "   subgraph cluster {\n"
-"      label=TDG_0\n"
+"      label=TDG_33263\n"
 "      0[style=bold]\n"
 "      1[style=bold]\n"
 "      2[style=bold]\n"
@@ -47,9 +46,8 @@ int main() {
   #pragma omp parallel
   #pragma omp single
   {
-    int gtid = __kmpc_global_thread_num(nullptr);
-    int res = __kmpc_start_record_task(nullptr, gtid, /* kmp_tdg_flags */ 0, /* tdg_id */ 0);
-    if (res) {
+    #pragma ompx taskgraph
+    {
       #pragma omp task depend(out : x)
       func(&num_exec);
       #pragma omp task depend(in : x) depend(out : y)
@@ -59,13 +57,11 @@ int main() {
       #pragma omp task depend(in : y)
       func(&num_exec);
     }
-
-    __kmpc_end_record_task(nullptr, gtid, /* kmp_tdg_flags */ 0, /* tdg_id */ 0);
   }
 
   assert(num_exec == 4);
 
-  std::ifstream tdg_file("tdg_0.dot");
+  std::ifstream tdg_file("tdg_33263.dot");
   assert(tdg_file.is_open());
 
   std::stringstream tdg_file_stream;
diff --git a/openmp/runtime/test/tasking/omp_record_replay_taskloop.cpp b/openmp/runtime/test/tasking/omp_record_replay_taskloop.cpp
index 3d88faeeb28eea1..dd814ff36e9e7a1 100644
--- a/openmp/runtime/test/tasking/omp_record_replay_taskloop.cpp
+++ b/openmp/runtime/test/tasking/omp_record_replay_taskloop.cpp
@@ -1,4 +1,3 @@
-// REQUIRES: ompx_taskgraph
 // RUN: %libomp-cxx-compile-and-run
 #include <iostream>
 #include <cassert>
@@ -30,16 +29,14 @@ int main() {
   #pragma omp parallel
   #pragma omp single
   for (int iter = 0; iter < NT; ++iter) {
-    int gtid = __kmpc_global_thread_num(nullptr);
-    int res =  __kmpc_start_record_task(nullptr, gtid, /* kmp_tdg_flags */0,  /* tdg_id */0);
-    if (res) {
+    #pragma ompx taskgraph
+    {
       num_tasks++;
       #pragma omp taskloop reduction(+:sum) num_tasks(4096)
       for (int i = 0; i < N; ++i) {
         sum += array[i];
       }
     }
-    __kmpc_end_record_task(nullptr, gtid, /* kmp_tdg_flags */0,  /* tdg_id */0);
   }
   assert(sum==N*NT);
   assert(num_tasks==1);



More information about the Openmp-commits mailing list