[Mlir-commits] [mlir] Add a tutorial on mlir-opt (PR #96105)

Thu Jun 20 05:39:43 PDT 2024

================
@@ -0,0 +1,432 @@
+# Using `mlir-opt`
+
+`mlir-opt` is the command-line entry point for running passes and lowerings on MLIR code.
+This tutorial will explain how to use `mlir-opt` to run passes, and explain
+some details about MLIR's built-in dialects along the way.
+
+Prerequisites:
+
+- [Building MLIR from source](/getting_started/)
+
+[TOC]
+
+## Overview
+
+We start with a brief summary of context that helps to frame
+the uses of `mlir-opt` detailed in this article.
+For a deeper dive on motivation and design,
+see [the MLIR paper](https://arxiv.org/abs/2002.11054).
+
+Two of the central concepts in MLIR are *dialects* and *lowerings*.
+In traditional compilers, there is typically one "dialect,"
+called an *intermediate representation*, or IR,
+that is the textual or data-structural description of a program
+within the scope of the compiler's execution.
+For example, in GCC the IR is called GIMPLE,
+and in LLVM it's called LLVM-IR.
+Compilers typically convert an input program to their IR,
+run optimization passes,
+and then convert the optimized IR to machine code.
+
+MLIR's philosophy is to split the job into smaller steps.
+First, MLIR allows one to define many IRs called *dialects*,
+some considered "high level" and some "low level,"
+but each with a set of types, operations, metadata,
+and semantics that defines what the operations do.
+Different dialects may coexist in the same program.
+Then, one writes a set of *lowering passes*
+that incrementally converts different parts of the program
+from higher level dialects to lower and lower dialects
+until you get to machine code
+(or, in many cases, LLVM, which finishes the job).
+Along the way,
+*optimizing passes* are run to make the code more efficient.
+The main point here is that the high level dialects exist
+*so that* they make it easy to write these important optimizing passes.
+
+A central motivation for building MLIR
+was to build the `affine` dialect,
+which is designed to enable [polyhedral optimizations](https://polyhedral.info/)
+for loop transformations.
+Compiler engineers had previously implemented polyhedral optimizations
+in LLVM and GCC (without an `affine` dialect),
+and it was difficult because they had to reconstruct well-structured loop nests
+from a much more complicated set of low-level operations.
+Having a higher level `affine` dialect preserves the loop nest structure
+at an abstraction layer that makes it easier to write optimizations,
+and then discards it during lowering passes.
+
+The `mlir-opt` tool can run both
+optimization passes and lowerings,
+though the final code generation
+is performed by a different tool called `mlir-translate`.
+In particular, `mlir-opt` consumes MLIR as input and produce MLIR as output,
+while `mlir-translate` consumes MLIR as input
+and produces non-MLIR program representations as output.
+
+## Two example programs
+
+Here are two MLIR programs.
+The first defines a function that counts the leading zeroes of a 32-bit integer (`i32`)
+using the [`math` dialect's](/docs/Dialects/MathOps/) `ctlz` operation.
+
+```mlir
+// mlir/test/Examples/mlir-opt/ctlz.mlir
+func.func @main(%arg0: i32) -> i32 {
+  %0 = math.ctlz %arg0 : i32
+  func.return %0 : i32
+}
+```
+
+This shows the basic structure of an MLIR operation
+([see here](https://mlir.llvm.org/docs/LangRef/#operations) for a more complete spec).
+Variable names are prefixed with `%`,
+functions by `@`,
+and each variable/value in a program has a type,
+often expressed after a colon.
+In this case all the types are `i32`,
+except for the function type which is `(i32) -> i32`
+(not specified explicitly above, but you'll see it in the `func.call` later).
+
+Each statement is anchored around an expression like `math.ctlz`
+which specifies the dialect [`math`](https://mlir.llvm.org/docs/Dialects/MathOps/) via a namespace,
+and the operation [`ctlz`](https://mlir.llvm.org/docs/Dialects/MathOps/#mathctlz-mathcountleadingzerosop) after the `.`.
+The rest of the syntax of the operation
+is determined by a parser defined by the dialect,
+and so many operations will have different syntaxes.
+In the case of `math.ctlz`,
+the sole argument is an integer whose leading zeros are to be counted,
+and the trailing ` : i32` denotes the output type storing the count.
+
+It's important to note that [`func`](https://mlir.llvm.org/docs/Dialects/Func/) is itself a dialect,
+and [`func.func`](https://mlir.llvm.org/docs/Dialects/Func/#funcfunc-funcfuncop) is an operation,
+where the braces and the function's body is part of the syntax.
+In MLIR a list of operations within braces is called a [*region*](https://mlir.llvm.org/docs/LangRef/#regions),
+and an operation can have zero regions like `math.ctlz`,
+one region like `func.func`,
+or multiple regions like [`scf.if`](https://mlir.llvm.org/docs/Dialects/SCFDialect/#scfif-scfifop),
+which has a region for each of its two control flow branches.
+
+The second program is a sequence of loops
+that exhibits poor cache locality.
+
+```mlir
+// mlir/test/Examples/mlir-opt/loop_fusion.mlir
+func.func @producer_consumer_fusion(%arg0: memref<10xf32>, %arg1: memref<10xf32>) {
+  %0 = memref.alloc() : memref<10xf32>
+  %1 = memref.alloc() : memref<10xf32>
+  %cst = arith.constant 0.000000e+00 : f32
+  affine.for %arg2 = 0 to 10 {
+    affine.store %cst, %0[%arg2] : memref<10xf32>
+    affine.store %cst, %1[%arg2] : memref<10xf32>
+  }
+  affine.for %arg2 = 0 to 10 {
+    %2 = affine.load %0[%arg2] : memref<10xf32>
+    %3 = arith.addf %2, %2 : f32
+    affine.store %3, %arg0[%arg2] : memref<10xf32>
+  }
+  affine.for %arg2 = 0 to 10 {
+    %2 = affine.load %1[%arg2] : memref<10xf32>
+    %3 = arith.mulf %2, %2 : f32
+    affine.store %3, %arg1[%arg2] : memref<10xf32>
+  }
+  return
+}
+```
+
+This program introduces some additional dialects.
+The [`affine` dialect](https://mlir.llvm.org/docs/Dialects/Affine/) mentioned in the introduction
+represents well-structured loop nests,
+and the [`affine.for` operation](https://mlir.llvm.org/docs/Dialects/Affine/#affinefor-affineaffineforop)
+whose region corresponds to the loop's body.
+`affine.for` also showcases some custom-defined syntax
+to represent the loop bounds and loop induction variable.
+The [`memref` dialect](https://mlir.llvm.org/docs/Dialects/MemRef/)
+defines types and operations related to memory management
+with pointer semantics.
+Note also that while `memref` has store and load operations,
+`affine` has its own that limit what types of memory accesses are allowed,
+so as to ensure the well-structuredness of the loop nest.
+
+## Running `mlir-opt`
+
+After building the MLIR project,
+the `mlir-opt` binary (located in `build/bin`)
+is the entry point for running passes and lowerings,
+as well as emitting debug and diagnostic data.
+
+Running `mlir-opt` with no flags will consume MLIR input
+from standard in, parse and run verifiers on it,
+and write the MLIR back to standard out.
+This is a good way to test if an input MLIR is well-formed.
+
+`mlir-opt --help` shows a complete list of flags
+(there are nearly 1000).
+Each pass gets its own flag.
+
+## Lowering `ctlz`
+
+Next we will show two of MLIR's lowering passes.
+The first, `convert-math-to-llvm`, converts the `ctlz` op
+to the `llvm` dialect's [`intr.ctlz` op](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrctlz-llvmcountleadingzerosop)
+which is an LLVM intrinsic.
+Note that `llvm` here is MLIR's `llvm` dialect,
+which would still need to be processed through `mlir-translate`
+to generate LLVM-IR.
+
+Recall our ctlz program:
+
+```mlir
+// mlir/test/Examples/mlir-opt/ctlz.mlir
+func.func @main(%arg0: i32) -> i32 {
+  %0 = math.ctlz %arg0 : i32
+  func.return %0 : i32
+}
+```
+
+After building MLIR, and from the `llvm-project` base directory, run
+
+```bash
+build/bin/mlir-opt --convert-math-to-llvm mlir/test/Examples/mlir-opt/ctlz.mlir
+```
+
+which produces
+
+```mlir
+module {
+  func.func @main(%arg0: i32) -> i32 {
+    %0 = "llvm.intr.ctlz"(%arg0) <{is_zero_poison = false}> : (i32) -> i32
+    return %0 : i32
+  }
+}
+```
+
+As you can see, the `math` dialect was the only thing that changed
+due to the lowering.
+The `func` dialect was left alone,
+even though it also needs to be converted to the `llvm` dialect
+to generate LLVM-IR.
----------------
joker-eph wrote:

Why is it more advanced to explain? It's just a pass like another one in this context. For example you don't feel like explaining the "dialect conversion framework" because it is used by the pass you're using here.

We get into weird details here about one dialect being converted but not another one, and it just seems confusing for no reason here.

https://github.com/llvm/llvm-project/pull/96105