[Mlir-commits] [mlir] Add a tutorial on mlir-opt (PR #96105)

Mehdi Amini llvmlistbot at llvm.org
Fri Jul 26 16:14:48 PDT 2024


================
@@ -0,0 +1,432 @@
+# Using `mlir-opt`
+
+`mlir-opt` is the command-line entry point for running passes and lowerings on MLIR code.
+This tutorial explains how to use `mlir-opt` to run passes,
+and covers some details about MLIR's built-in dialects along the way.
+
+Prerequisites:
+
+- [Building MLIR from source](/getting_started/)
+
+[TOC]
+
+## Overview
+
+We start with a brief summary of context that helps to frame
+the uses of `mlir-opt` detailed in this article.
+For a deeper dive on motivation and design,
+see [the MLIR paper](https://arxiv.org/abs/2002.11054).
+
+Two of the central concepts in MLIR are *dialects* and *lowerings*.
+In traditional compilers, there is typically one "dialect,"
+called an *intermediate representation*, or IR,
+that is the textual or data-structural description of a program
+within the scope of the compiler's execution.
+For example, in GCC the IR is called GIMPLE,
+and in LLVM it's called LLVM-IR.
+Compilers typically convert an input program to their IR,
+run optimization passes,
+and then convert the optimized IR to machine code.
+
+MLIR's philosophy is to split the job into smaller steps.
+First, MLIR allows one to define many IRs called *dialects*,
+some considered "high level" and some "low level,"
+each with its own set of types, operations, metadata,
+and semantics that define what the operations do.
+Different dialects may coexist in the same program.
+Then, one writes a set of *lowering passes*
+that incrementally convert different parts of the program
+from higher-level dialects to progressively lower-level dialects,
+until the program reaches machine code
+(or, in many cases, LLVM-IR, from which LLVM finishes the job).
+Along the way,
+*optimizing passes* are run to make the code more efficient.
+The main point here is that the high-level dialects exist
+*so that* these important optimizing passes are easy to write.
+
+A central motivation for building MLIR
+was to support the `affine` dialect,
+which is designed to enable [polyhedral optimizations](https://polyhedral.info/)
+for loop transformations.
+Compiler engineers had previously implemented polyhedral optimizations
+in LLVM and GCC (without an `affine` dialect),
+and it was difficult because they had to reconstruct well-structured loop nests
+from a much more complicated set of low-level operations.
+A higher-level `affine` dialect preserves the loop nest structure
+at an abstraction layer where optimizations are easier to write,
+and that structure is then discarded during later lowering passes.
+
+The `mlir-opt` tool can run both
+optimization passes and lowerings,
+though the final code generation
+is performed by a different tool called `mlir-translate`.
+In particular, `mlir-opt` consumes MLIR as input and produces MLIR as output,
+while `mlir-translate` consumes MLIR as input
+and produces non-MLIR program representations as output.
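+
+As a rough sketch of how the two tools compose (the `"..."` pipeline is a
+placeholder, not a real pass list; the input must be fully lowered to the
+`llvm` dialect before `mlir-translate` can emit LLVM-IR):
+
+```bash
+build/bin/mlir-opt --pass-pipeline="..." input.mlir | \
+  build/bin/mlir-translate --mlir-to-llvmir
+```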
+
+## Two example programs
+
+Here are two MLIR programs.
+The first defines a function that counts the leading zeroes of a 32-bit integer (`i32`)
+using the [`math` dialect's](/docs/Dialects/MathOps/) `ctlz` operation.
+
+```mlir
+// mlir/test/Examples/mlir-opt/ctlz.mlir
+func.func @main(%arg0: i32) -> i32 {
+  %0 = math.ctlz %arg0 : i32
+  func.return %0 : i32
+}
+```
+
+This shows the basic structure of an MLIR operation
+([see here](https://mlir.llvm.org/docs/LangRef/#operations) for a more complete spec).
+Variable names are prefixed with `%`,
+function names with `@`,
+and each variable/value in a program has a type,
+often expressed after a colon.
+In this case all the types are `i32`,
+except for the function type, which is `(i32) -> i32`
+(not specified explicitly above, but you'll see it in the `func.call` later).
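+
+As a quick preview, a call site spells that function type out explicitly.
+This snippet is only for illustration and is not part of the example file:
+
+```mlir
+%result = func.call @main(%arg) : (i32) -> i32
+```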
+
+Each statement is anchored around an expression like `math.ctlz`,
+which specifies the dialect [`math`](https://mlir.llvm.org/docs/Dialects/MathOps/) via a namespace prefix,
+and the operation [`ctlz`](https://mlir.llvm.org/docs/Dialects/MathOps/#mathctlz-mathcountleadingzerosop) after the `.`.
+The rest of the operation's syntax
+is determined by a parser defined by the dialect,
+so different operations can have quite different syntaxes.
+In the case of `math.ctlz`,
+the sole argument is an integer whose leading zeros are to be counted,
+and the trailing ` : i32` denotes the output type storing the count.
+
+It's important to note that [`func`](https://mlir.llvm.org/docs/Dialects/Func/) is itself a dialect
+and [`func.func`](https://mlir.llvm.org/docs/Dialects/Func/#funcfunc-funcfuncop) is an operation,
+whose braces and enclosed function body are part of its syntax.
+In MLIR a list of operations within braces is called a [*region*](https://mlir.llvm.org/docs/LangRef/#regions),
+and an operation can have zero regions like `math.ctlz`,
+one region like `func.func`,
+or multiple regions like [`scf.if`](https://mlir.llvm.org/docs/Dialects/SCFDialect/#scfif-scfifop),
+which has a region for each of its two control flow branches.
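+
+For example, an `scf.if` with a result might look like the following sketch
+(not part of this tutorial's example programs), with one region per branch,
+each terminated by an `scf.yield`:
+
+```mlir
+%result = scf.if %cond -> (i32) {
+  %a = arith.constant 1 : i32
+  scf.yield %a : i32
+} else {
+  %b = arith.constant 0 : i32
+  scf.yield %b : i32
+}
+```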
+
+The second program is a sequence of loops
+that exhibits poor cache locality.
+
+```mlir
+// mlir/test/Examples/mlir-opt/loop_fusion.mlir
+func.func @producer_consumer_fusion(%arg0: memref<10xf32>, %arg1: memref<10xf32>) {
+  %0 = memref.alloc() : memref<10xf32>
+  %1 = memref.alloc() : memref<10xf32>
+  %cst = arith.constant 0.000000e+00 : f32
+  affine.for %arg2 = 0 to 10 {
+    affine.store %cst, %0[%arg2] : memref<10xf32>
+    affine.store %cst, %1[%arg2] : memref<10xf32>
+  }
+  affine.for %arg2 = 0 to 10 {
+    %2 = affine.load %0[%arg2] : memref<10xf32>
+    %3 = arith.addf %2, %2 : f32
+    affine.store %3, %arg0[%arg2] : memref<10xf32>
+  }
+  affine.for %arg2 = 0 to 10 {
+    %2 = affine.load %1[%arg2] : memref<10xf32>
+    %3 = arith.mulf %2, %2 : f32
+    affine.store %3, %arg1[%arg2] : memref<10xf32>
+  }
+  return
+}
+```
+
+This program introduces some additional dialects.
+The [`affine` dialect](https://mlir.llvm.org/docs/Dialects/Affine/) mentioned in the introduction
+represents well-structured loop nests;
+its [`affine.for` operation](https://mlir.llvm.org/docs/Dialects/Affine/#affinefor-affineaffineforop)
+has a region that corresponds to the loop's body.
+`affine.for` also showcases some custom-defined syntax
+to represent the loop bounds and loop induction variable.
+The [`memref` dialect](https://mlir.llvm.org/docs/Dialects/MemRef/)
+defines types and operations related to memory management
+with pointer semantics.
+Note also that while `memref` has its own store and load operations,
+`affine` provides its own variants that restrict which kinds of memory accesses are allowed,
+so as to preserve the well-structured form of the loop nest.
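+
+For instance, `affine.load` allows affine expressions of loop induction
+variables directly in the subscript, while `memref.load` takes an
+already-computed index value. This is only a sketch, not part of the example
+above:
+
+```mlir
+// Inside an affine.for with induction variable %i:
+%a = affine.load %buf[%i + 1] : memref<10xf32>
+// The memref dialect's load takes a plain index SSA value instead:
+%b = memref.load %buf[%idx] : memref<10xf32>
+```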
+
+## Running `mlir-opt`
+
+After building the MLIR project,
+the `mlir-opt` binary (located in `build/bin`)
+is the entry point for running passes and lowerings,
+as well as emitting debug and diagnostic data.
+
+Running `mlir-opt` with no flags will consume MLIR input
+from standard input, parse it and run verifiers on it,
+and write the MLIR back to standard output.
+This is a good way to check whether an input MLIR file is well-formed.
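+
+For example, to check that the earlier `ctlz` example parses and verifies
+(run from the `llvm-project` base directory after building):
+
+```bash
+build/bin/mlir-opt < mlir/test/Examples/mlir-opt/ctlz.mlir
+```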
+
+`mlir-opt --help` shows a complete list of flags
+(there are nearly 1000).
+Each pass gets its own flag.
+
+## Lowering `ctlz`
+
+Next we will show two of MLIR's lowering passes.
+The first, `convert-math-to-llvm`, converts the `ctlz` op
+to the `llvm` dialect's [`intr.ctlz` op](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrctlz-llvmcountleadingzerosop),
+which represents an LLVM intrinsic.
+Note that `llvm` here is MLIR's `llvm` dialect,
+which would still need to be processed through `mlir-translate`
+to generate LLVM-IR.
+
+Recall our `ctlz` program:
+
+```mlir
+// mlir/test/Examples/mlir-opt/ctlz.mlir
+func.func @main(%arg0: i32) -> i32 {
+  %0 = math.ctlz %arg0 : i32
+  func.return %0 : i32
+}
+```
+
+After building MLIR, and from the `llvm-project` base directory, run
+
+```bash
+build/bin/mlir-opt --convert-math-to-llvm mlir/test/Examples/mlir-opt/ctlz.mlir
----------------
joker-eph wrote:

> Perhaps you should consider updating mlir-opt to emit a warning or hard error when one tries to use a non-builtin.module-anchored pass outside of --pass-pipeline.

Unfortunately I think we have a default nesting behavior that makes it implicitly work on the immediately nested IR: that is, if you have a `FuncOp` pass and use the short syntax, it'll translate to `--pass-pipeline="builtin.module(func.func(passName))"`.
(This behavior comes from our initial implementation of MLIR, which had a hard-coded module/func structure without any deeper nesting: there was never any ambiguity at the time.)
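
For illustration only (the pass name below is a placeholder, not a real flag), the two invocations would be equivalent:

```bash
mlir-opt --some-func-anchored-pass input.mlir
mlir-opt --pass-pipeline="builtin.module(func.func(some-func-anchored-pass))" input.mlir
```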

https://github.com/llvm/llvm-project/pull/96105

