[Mlir-commits] [mlir] [mlir][Pass] Add new FileTreeIRPrinterConfig (PR #67840)

Mon Oct 2 13:56:43 PDT 2023

https://github.com/christopherbate updated https://github.com/llvm/llvm-project/pull/67840

From 6ab51980310c4a2c29b778d97af7002373f3082d Mon Sep 17 00:00:00 2001
From: Christopher Bate <cbate at nvidia.com>
Date: Fri, 29 Sep 2023 11:59:32 -0600
Subject: [PATCH] [Pass] Add new FileTreeIRPrinterConfig
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This change expands the existing instrumentation that prints the IR before/
after each pass to an output stream (usually stderr). It adds a new
configuration to the instrumentation that will print the output of each pass to
a separate file. The files will be organized into a directory tree rooted at a
specified directory. For existing tools, the command-line option
`-mlir-print-ir-tree-dir` is added to specify this directory and activate the
new printing config.

The created directory tree mirrors the nesting structure of the IR. For example,
if the IR is congruent to the pass-pipeline
"builtin.module(pass1,pass2,func.func(pass3,pass4))", and
`-mlir-print-ir-tree-dir=/tmp/pipeline_output`, then the file tree created
will look like:

```
/tmp/pipeline_output
├── builtin_module_the_symbol_name
│   ├── 0_pass1.mlir
│   ├── 1_pass2.mlir
│   ├── func_func_my_func_name
│   │   ├── 2_pass3.mlir
│   │   ├── 3_pass4.mlir
│   ├── func_func_my_other_func_name
│   │   ├── 4_pass3.mlir
│   │   ├── 5_pass4.mlir
```

The subdirectories are given names that reflect the parent operation name
and symbol name (if present). The output MLIR files are prefixed using an
atomic counter to indicate the order the passes were printed in and to
prevent any potential name collisions.
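
The naming scheme described above can be sketched in plain C++ without any
MLIR dependencies. This is only an illustration: `sanitizeOpName` and
`buildDumpPath` are hypothetical helpers that do not exist in the patch, and
the patch itself builds the path with LLVM utilities inside
`createTreePrinterOutputPath`.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the patch): replace '.' with '_' so that an
// op name can be used as a directory name, e.g. "func.func" -> "func_func".
std::string sanitizeOpName(std::string opName) {
  std::replace(opName.begin(), opName.end(), '.', '_');
  return opName;
}

// Hypothetical helper (not part of the patch): build
// "<root>/<op>_<symbol>/.../<counter>_<pass>.mlir" from (op name, symbol name)
// pairs ordered from the outermost op to the innermost op.
std::string buildDumpPath(
    const std::string &rootDir,
    const std::vector<std::pair<std::string, std::string>> &opAndSymbolNames,
    unsigned counter, const std::string &passArgument) {
  std::string path = rootDir;
  for (const auto &[opName, symbolName] : opAndSymbolNames)
    path += "/" + sanitizeOpName(opName) + "_" + symbolName;
  path += "/" + std::to_string(counter) + "_" + passArgument + ".mlir";
  return path;
}
```

For the nested modules used in the new test, the pairs
`("builtin.module", "top")`, `("builtin.module", "middle")`,
`("func.func", "func1")` with counter `0` and pass argument `cse` give
`<root>/builtin_module_top/builtin_module_middle/func_func_func1/0_cse.mlir`.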
---
 mlir/docs/PassManagement.md               |  29 +++++
 mlir/include/mlir/Pass/PassManager.h      |  38 +++++-
 mlir/lib/Pass/IRPrinting.cpp              | 144 ++++++++++++++++++++++
 mlir/lib/Pass/PassManagerOptions.cpp      |  11 ++
 mlir/test/Pass/ir-printing-file-tree.mlir |  28 +++++
 5 files changed, 249 insertions(+), 1 deletion(-)
 create mode 100644 mlir/test/Pass/ir-printing-file-tree.mlir

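The create-directories-then-open flow of `createTreePrinterOutputPath` can be
approximated with the standard library alone. This is a rough sketch under
stated assumptions, not the patch's implementation: `writeDumpFile` is a
hypothetical name, it uses `std::filesystem` and `std::ofstream` where the
patch uses `llvm::sys::fs::create_directory` and `llvm::ToolOutputFile`, and it
assumes `filePath` has a non-empty parent directory.

```cpp
#include <filesystem>
#include <fstream>
#include <iostream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not part of the patch): create the parent directories
// of `filePath`, tolerating ones that already exist, then write `contents` to
// the file. Reports the first failure to stderr and returns false.
bool writeDumpFile(const fs::path &filePath, const std::string &contents) {
  std::error_code ec;
  fs::create_directories(filePath.parent_path(), ec);
  if (ec) {
    std::cerr << "Error while creating directory " << filePath.parent_path()
              << ": " << ec.message() << "\n";
    return false;
  }
  std::ofstream os(filePath);
  if (!os) {
    std::cerr << "Error opening output file " << filePath << "\n";
    return false;
  }
  os << contents;
  os.flush();
  return os.good();
}
```

As in the patch, a failure at any step is reported and aborts that single dump
without throwing.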
diff --git a/mlir/docs/PassManagement.md b/mlir/docs/PassManagement.md
index 9a7cfd1f9bebc35..fc9dc4c21f32d34 100644
--- a/mlir/docs/PassManagement.md
+++ b/mlir/docs/PassManagement.md
@@ -1302,6 +1302,35 @@ func.func @simple_constant() -> (i32, i32) {
 }
 ```
 
+*   `mlir-print-ir-tree-dir=(directory path)`
+    *   Without setting this option, the IR printed by the instrumentation will
+        be printed to `stderr`. If you provide a directory using this option,
+        the output corresponding to each pass will be printed to a file in the
+        directory tree rooted at `(directory path)`. The path created for each
+        pass reflects the nesting structure of the IR and the pass pipeline.
+    *   The below example illustrates the file tree created by running a pass
+        pipeline on IR that has two `func.func` ops nested within a single
+        `builtin.module` op.
+    *   The subdirectories are given names that reflect the parent op names and
+        the symbol names for those ops (if present).
+
+```
+$ pipeline="builtin.module(pass1,pass2,func.func(pass3,pass4))"
+$ mlir-opt foo.mlir -pass-pipeline="$pipeline" -mlir-print-ir-tree-dir=/tmp/pipeline_output
+$ tree /tmp/pipeline_output
+
+/tmp/pipeline_output
+├── builtin_module_the_symbol_name
+│   ├── 0_pass1.mlir
+│   ├── 1_pass2.mlir
+│   ├── func_func_my_func_name
+│   │   ├── 2_pass3.mlir
+│   │   ├── 3_pass4.mlir
+│   ├── func_func_my_other_func_name
+│   │   ├── 4_pass3.mlir
+│   │   ├── 5_pass4.mlir
+```
+
 ## Crash and Failure Reproduction
 
 The [pass manager](#pass-manager) in MLIR contains a builtin mechanism to
diff --git a/mlir/include/mlir/Pass/PassManager.h b/mlir/include/mlir/Pass/PassManager.h
index d5f1ea0fe0350dd..1832800c5d137ad 100644
--- a/mlir/include/mlir/Pass/PassManager.h
+++ b/mlir/include/mlir/Pass/PassManager.h
@@ -18,8 +18,8 @@
 #include "llvm/Support/raw_ostream.h"
 
 #include <functional>
-#include <vector>
 #include <optional>
+#include <vector>
 
 namespace mlir {
 class AnalysisManager;
@@ -381,6 +381,42 @@ class PassManager : public OpPassManager {
       bool printAfterOnlyOnFailure = false, raw_ostream &out = llvm::errs(),
       OpPrintingFlags opPrintingFlags = OpPrintingFlags());
 
+  /// Similar to `enableIRPrinting` above, except that instead of printing
+  /// the IR to a single output stream, the instrumentation will print the
+  /// output of each pass to a separate file. The files will be organized into a
+  /// directory tree rooted at `printTreeDir`. The directories mirror the
+  /// nesting structure of the IR. For example, if the IR is congruent to the
+  /// pass-pipeline "builtin.module(pass1,pass2,func.func(pass3,pass4))", and
+  /// `printTreeDir=/tmp/pipeline_output`, then the file tree created will
+  /// look like:
+  ///
+  /// ```
+  /// /tmp/pipeline_output
+  /// ├── builtin_module_the_symbol_name
+  /// │   ├── 0_pass1.mlir
+  /// │   ├── 1_pass2.mlir
+  /// │   ├── func_func_my_func_name
+  /// │   │   ├── 2_pass3.mlir
+  /// │   │   ├── 3_pass4.mlir
+  /// │   ├── func_func_my_other_func_name
+  /// │   │   ├── 4_pass3.mlir
+  /// │   │   ├── 5_pass4.mlir
+  /// ```
+  ///
+  /// The subdirectories are given names that reflect the parent operation name
+  /// and symbol name (if present). The output MLIR files are prefixed using an
+  /// atomic counter to indicate the order the passes were printed in and to
+  /// prevent any potential name collisions.
+  void enableIRPrintingToFileTree(
+      std::function<bool(Pass *, Operation *)> shouldPrintBeforePass =
+          [](Pass *, Operation *) { return true; },
+      std::function<bool(Pass *, Operation *)> shouldPrintAfterPass =
+          [](Pass *, Operation *) { return true; },
+      bool printModuleScope = true, bool printAfterOnlyOnChange = true,
+      bool printAfterOnlyOnFailure = false,
+      llvm::StringRef printTreeDir = ".pass_manager_output",
+      OpPrintingFlags opPrintingFlags = OpPrintingFlags());
+
   //===--------------------------------------------------------------------===//
   // Pass Timing
 
diff --git a/mlir/lib/Pass/IRPrinting.cpp b/mlir/lib/Pass/IRPrinting.cpp
index ee52bf81847c232..2878a973110f12a 100644
--- a/mlir/lib/Pass/IRPrinting.cpp
+++ b/mlir/lib/Pass/IRPrinting.cpp
@@ -9,8 +9,12 @@
 #include "PassDetail.h"
 #include "mlir/IR/SymbolTable.h"
 #include "mlir/Pass/PassManager.h"
+#include "mlir/Support/FileUtilities.h"
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/Support/FileSystem.h"
 #include "llvm/Support/Format.h"
 #include "llvm/Support/FormatVariadic.h"
+#include "llvm/Support/ToolOutputFile.h"
 
 using namespace mlir;
 using namespace mlir::detail;
@@ -199,6 +203,133 @@ struct BasicIRPrinterConfig : public PassManager::IRPrinterConfig {
 };
 } // namespace
 
+/// Return pairs of (sanitized op name, symbol name) for `op` and all parent
+/// operations. Op names are sanitized by replacing periods with underscores.
+/// The pairs are returned in order of outer-most to inner-most (ancestors of
+/// `op` first, `op` last). This information is used to construct the directory
+/// tree for the `FileTreeIRPrinterConfig` below.
+static SmallVector<std::pair<std::string, StringRef>>
+getOpAndSymbolNames(Operation *op) {
+  SmallVector<std::pair<std::string, StringRef>> pathElements;
+  while (true) {
+    if (!op)
+      break;
+    StringAttr symbolName =
+        op->getAttrOfType<StringAttr>(SymbolTable::getSymbolAttrName());
+    std::string opName =
+        llvm::join(llvm::split(op->getName().getStringRef().str(), '.'), "_");
+    pathElements.emplace_back(opName, symbolName ? symbolName.strref()
+                                                 : "no-symbol-name");
+    op = op->getParentOp();
+  }
+  // Return in the order of top level (module) down to `op`.
+  std::reverse(pathElements.begin(), pathElements.end());
+  return pathElements;
+}
+
+static LogicalResult createDirectoryOrPrintErr(llvm::StringRef dirPath) {
+  if (std::error_code ec =
+          llvm::sys::fs::create_directory(dirPath, /*IgnoreExisting=*/true)) {
+    llvm::errs() << "Error while creating directory " << dirPath << ": "
+                 << ec.message() << "\n";
+    return failure();
+  }
+  return success();
+}
+
+/// Create the directories and file given by
+/// "[rootDir]/([ancestor_op_name]_[ancestor_symbol]/).../[counter]_[passName].mlir".
+static std::unique_ptr<llvm::ToolOutputFile>
+createTreePrinterOutputPath(Operation *op, llvm::StringRef passArgument,
+                            llvm::StringRef rootDir,
+                            std::atomic<unsigned> &counter) {
+  // Create the path. We will create a tree rooted at the given dump
+  // directory. The root directory will contain folders with the names of
+  // modules. Sub-directories within those folders mirror the nesting
+  // structure of the pass manager, using symbol names for directory names.
+  std::string fileName = llvm::formatv("{0}_{1}.mlir", counter++, passArgument);
+  SmallVector<std::pair<std::string, StringRef>> opAndSymbolNames =
+      getOpAndSymbolNames(op);
+
+  // Create all the directories, starting at the root. Abort early if we fail to
+  // create any directory.
+  std::string path = rootDir.str();
+  if (failed(createDirectoryOrPrintErr(path)))
+    return nullptr;
+
+  for (auto [opName, symbolName] : opAndSymbolNames) {
+    path = llvm::join_items("/", path, (opName + "_" + symbolName).str());
+    if (failed(createDirectoryOrPrintErr(path)))
+      return nullptr;
+  }
+
+  // Open the output file. The caller prints to it and calls keep() on success.
+  path = llvm::join_items("/", path, fileName);
+  std::string error;
+  std::unique_ptr<llvm::ToolOutputFile> file = openOutputFile(path, &error);
+  if (!file) {
+    llvm::errs() << "Error opening output file " << path << ": " << error
+                 << "\n";
+    return nullptr;
+  }
+  return file;
+}
+
+namespace {
+/// A configuration that prints the IR before/after each pass to a set of files
+/// in the specified directory. The files are organized into subdirectories that
+/// mirror the nesting structure of the IR.
+struct FileTreeIRPrinterConfig : public PassManager::IRPrinterConfig {
+  FileTreeIRPrinterConfig(
+      std::function<bool(Pass *, Operation *)> shouldPrintBeforePass,
+      std::function<bool(Pass *, Operation *)> shouldPrintAfterPass,
+      bool printModuleScope, bool printAfterOnlyOnChange,
+      bool printAfterOnlyOnFailure, OpPrintingFlags opPrintingFlags,
+      llvm::StringRef treeDir)
+      : IRPrinterConfig(printModuleScope, printAfterOnlyOnChange,
+                        printAfterOnlyOnFailure, opPrintingFlags),
+        shouldPrintBeforePass(std::move(shouldPrintBeforePass)),
+        shouldPrintAfterPass(std::move(shouldPrintAfterPass)),
+        treeDir(treeDir) {
+    assert((this->shouldPrintBeforePass || this->shouldPrintAfterPass) &&
+           "expected at least one valid filter function");
+  }
+
+  void printBeforeIfEnabled(Pass *pass, Operation *operation,
+                            PrintCallbackFn printCallback) final {
+    if (!shouldPrintBeforePass || !shouldPrintBeforePass(pass, operation))
+      return;
+    std::unique_ptr<llvm::ToolOutputFile> file = createTreePrinterOutputPath(
+        operation, pass->getArgument(), treeDir, counter);
+    if (!file)
+      return;
+    printCallback(file->os());
+    file->keep();
+  }
+
+  void printAfterIfEnabled(Pass *pass, Operation *operation,
+                           PrintCallbackFn printCallback) final {
+    if (!shouldPrintAfterPass || !shouldPrintAfterPass(pass, operation))
+      return;
+    std::unique_ptr<llvm::ToolOutputFile> file = createTreePrinterOutputPath(
+        operation, pass->getArgument(), treeDir, counter);
+    if (!file)
+      return;
+    printCallback(file->os());
+    file->keep();
+  }
+
+  /// Filter functions for before and after pass execution.
+  std::function<bool(Pass *, Operation *)> shouldPrintBeforePass;
+  std::function<bool(Pass *, Operation *)> shouldPrintAfterPass;
+
+  /// Root directory of the output file tree, and the file-name prefix counter.
+  std::string treeDir;
+  std::atomic<unsigned> counter = 0;
+};
+
+} // namespace
+
 /// Add an instrumentation to print the IR before and after pass execution,
 /// using the provided configuration.
 void PassManager::enableIRPrinting(std::unique_ptr<IRPrinterConfig> config) {
@@ -222,3 +353,16 @@ void PassManager::enableIRPrinting(
       printModuleScope, printAfterOnlyOnChange, printAfterOnlyOnFailure,
       opPrintingFlags, out));
 }
+
+/// Add an instrumentation to print the IR before and after pass execution.
+void PassManager::enableIRPrintingToFileTree(
+    std::function<bool(Pass *, Operation *)> shouldPrintBeforePass,
+    std::function<bool(Pass *, Operation *)> shouldPrintAfterPass,
+    bool printModuleScope, bool printAfterOnlyOnChange,
+    bool printAfterOnlyOnFailure, StringRef printTreeDir,
+    OpPrintingFlags opPrintingFlags) {
+  enableIRPrinting(std::make_unique<FileTreeIRPrinterConfig>(
+      std::move(shouldPrintBeforePass), std::move(shouldPrintAfterPass),
+      printModuleScope, printAfterOnlyOnChange, printAfterOnlyOnFailure,
+      opPrintingFlags, printTreeDir));
+}
diff --git a/mlir/lib/Pass/PassManagerOptions.cpp b/mlir/lib/Pass/PassManagerOptions.cpp
index ffc53b7e3ed0236..706a21a23ee3e8e 100644
--- a/mlir/lib/Pass/PassManagerOptions.cpp
+++ b/mlir/lib/Pass/PassManagerOptions.cpp
@@ -58,6 +58,10 @@ struct PassManagerOptions {
       llvm::cl::desc("When printing IR for print-ir-[before|after]{-all} "
                      "always print the top-level operation"),
       llvm::cl::init(false)};
+  llvm::cl::opt<std::string> printTreeDir{
+      "mlir-print-ir-tree-dir",
+      llvm::cl::desc("When printing the IR before/after a pass, print to a "
+                     "file tree rooted at this directory")};
 
   /// Add an IR printing instrumentation if enabled by any 'print-ir' flags.
   void addPrinterInstrumentation(PassManager &pm);
@@ -120,6 +124,13 @@ void PassManagerOptions::addPrinterInstrumentation(PassManager &pm) {
     return;
 
   // Otherwise, add the IR printing instrumentation.
+  if (!printTreeDir.empty()) {
+    pm.enableIRPrintingToFileTree(shouldPrintBeforePass, shouldPrintAfterPass,
+                                  printModuleScope, printAfterChange,
+                                  printAfterFailure, printTreeDir);
+    return;
+  }
+
   pm.enableIRPrinting(shouldPrintBeforePass, shouldPrintAfterPass,
                       printModuleScope, printAfterChange, printAfterFailure,
                       llvm::errs());
diff --git a/mlir/test/Pass/ir-printing-file-tree.mlir b/mlir/test/Pass/ir-printing-file-tree.mlir
new file mode 100644
index 000000000000000..4c53b25d62196c6
--- /dev/null
+++ b/mlir/test/Pass/ir-printing-file-tree.mlir
@@ -0,0 +1,28 @@
+// Test filtering by "before"
+// RUN: rm -rf %t || true
+// RUN: mlir-opt %s -mlir-print-ir-tree-dir=%t \
+// RUN:   -pass-pipeline='builtin.module(builtin.module(func.func(cse,canonicalize)))' \
+// RUN:   -mlir-print-ir-before=cse -mlir-disable-threading
+// RUN: test -f %t/builtin_module_top/builtin_module_middle/func_func_func1/0_cse.mlir
+// RUN: test -f %t/builtin_module_top/builtin_module_middle/func_func_func2/1_cse.mlir
+
+// Test printing after all.
+// RUN: rm -rf %t || true
+// RUN: mlir-opt %s -mlir-print-ir-tree-dir=%t \
+// RUN:   -pass-pipeline='builtin.module(builtin.module(func.func(cse,canonicalize)))' \
+// RUN:   -mlir-print-ir-after-all -mlir-disable-threading
+// RUN: test -f %t/builtin_module_top/builtin_module_middle/func_func_func1/0_cse.mlir
+// RUN: test -f %t/builtin_module_top/builtin_module_middle/func_func_func1/1_canonicalize.mlir
+// RUN: test -f %t/builtin_module_top/builtin_module_middle/func_func_func2/2_cse.mlir
+// RUN: test -f %t/builtin_module_top/builtin_module_middle/func_func_func2/3_canonicalize.mlir
+
+builtin.module @top {
+  builtin.module @middle {
+    func.func @func1() {
+      return
+    }
+    func.func @func2() {
+      return
+    }
+  }
+}