[Mlir-commits] [mlir] [mlir] update transform dialect tutorials (PR #81199)

llvmlistbot at llvm.org llvmlistbot at llvm.org
Thu Feb 8 14:23:53 PST 2024


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-mlir

Author: Oleksandr "Alex" Zinenko (ftynse)

<details>
<summary>Changes</summary>

Use the "main" transform-interpreter pass instead of the test pass. This, along with the previously introduced debug extension, now allows tutorials to no longer depend on test passes and extensions.

---

Patch is 98.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/81199.diff


22 Files Affected:

- (modified) mlir/docs/Tutorials/transform/Ch1.md (+182-165) 
- (modified) mlir/docs/Tutorials/transform/Ch2.md (+109-93) 
- (modified) mlir/docs/Tutorials/transform/Ch3.md (+6-6) 
- (modified) mlir/docs/Tutorials/transform/Ch4.md (+1-1) 
- (modified) mlir/examples/transform/Ch2/transform-opt/transform-opt.cpp (+4-18) 
- (modified) mlir/examples/transform/Ch3/transform-opt/transform-opt.cpp (+4-22) 
- (modified) mlir/examples/transform/Ch4/transform-opt/transform-opt.cpp (-12) 
- (modified) mlir/include/mlir/Dialect/Transform/Transforms/Passes.td (+4) 
- (modified) mlir/include/mlir/Dialect/Transform/Transforms/TransformInterpreterUtils.h (+5) 
- (modified) mlir/include/mlir/Dialect/Transform/Utils/RaggedArray.h (+3) 
- (modified) mlir/lib/Dialect/Transform/Transforms/InterpreterPass.cpp (+23-1) 
- (modified) mlir/lib/Dialect/Transform/Transforms/TransformInterpreterUtils.cpp (+30-6) 
- (modified) mlir/test/Examples/transform/Ch1/invalidation-1.mlir (+39-36) 
- (modified) mlir/test/Examples/transform/Ch1/invalidation-2.mlir (+9-9) 
- (modified) mlir/test/Examples/transform/Ch1/sequence.mlir (+53-52) 
- (modified) mlir/test/Examples/transform/Ch2/invalid.mlir (+5-5) 
- (modified) mlir/test/Examples/transform/Ch2/ops.mlir (+8-7) 
- (modified) mlir/test/Examples/transform/Ch2/sequence.mlir (+50-49) 
- (modified) mlir/test/Examples/transform/Ch3/invalid.mlir (+5-5) 
- (modified) mlir/test/Examples/transform/Ch3/ops.mlir (+15-13) 
- (modified) mlir/test/Examples/transform/Ch3/sequence.mlir (+57-56) 
- (modified) mlir/test/Examples/transform/ChH/full.mlir (+3-3) 


``````````diff
diff --git a/mlir/docs/Tutorials/transform/Ch1.md b/mlir/docs/Tutorials/transform/Ch1.md
index 7a299a48600b8..b0fdf085854c7 100644
--- a/mlir/docs/Tutorials/transform/Ch1.md
+++ b/mlir/docs/Tutorials/transform/Ch1.md
@@ -6,7 +6,7 @@ The Transform dialect allows one to precisely target transformations at specific
 
 Transform IR operations operate on values that may be associated with payload IR operations, values or attributes. We call the first two kinds of values operation and value handles, respectively. We call the last kind of values parameters.
 
-The application of transform IR always starts from one top-level operation. In the C++ API, this operation is passed to the `applyTransforms` function. This top-level operation specifies if other transformations should be performed and how. The most common top-level operation merely applies other transform operations listed in its body one after the other.
+The application of transform IR always starts from one top-level operation. In the C++ API, this operation is passed to the `applyTransforms` function. This top-level operation specifies if other transformations should be performed and how. The most common top-level operation, `transform.named_sequence`, merely applies other transform operations listed in its body one after the other, similarly to a function or a macro.
 
 Let us illustrate this with a simple sequence of transformations on the common “fully connected + bias + ReLU” ML layer, which boils down to performing a matrix multiplication, followed by an (elementwise) matrix addition and taking an elementwise maximum with 0. This can be expressed using the following IR:
 
@@ -14,7 +14,7 @@ Let us illustrate this with a simple sequence of transformations on the common 
 func.func @fc_relu(%lhs: tensor<512x512xf32>, %rhs: tensor<512x512xf32>,
                    %bias: tensor<512x512xf32>, %output: tensor<512x512xf32>)
                    -> tensor<512x512xf32> {
-  // Matrix-matrix multiplication.  
+  // Matrix-matrix multiplication.
   %matmul = linalg.matmul ins(%lhs, %rhs: tensor<512x512xf32>, tensor<512x512xf32>)
                           outs(%output: tensor<512x512xf32>) -> tensor<512x512xf32>
 
@@ -22,7 +22,7 @@ func.func @fc_relu(%lhs: tensor<512x512xf32>, %rhs: tensor<512x512xf32>,
   %biased = linalg.elemwise_binary { fun = #linalg.binary_fn<add> }
     ins(%matmul, %bias : tensor<512x512xf32>, tensor<512x512xf32>)
     outs(%output : tensor<512x512xf32>) -> tensor<512x512xf32>
-  
+
   // Elementwise max with 0 (ReLU).
   %c0f = arith.constant 0.0 : f32
   %relued = linalg.elemwise_binary { fun = #linalg.binary_fn<max_signed> }
@@ -37,30 +37,34 @@ func.func @fc_relu(%lhs: tensor<512x512xf32>, %rhs: tensor<512x512xf32>,
 For performance reasons, we would like to tile and fuse these operations to exploit cache locality. This is a sequence of transformations that need to be performed one after another, so we naturally start with the corresponding top-level transform operation.
 
 ```mlir
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  transform.yield
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+      %arg0: !transform.any_op,
+      %arg1: !transform.op<"linalg.matmul">,
+      %arg2: !transform.op<"linalg.elemwise_binary">) {
+    transform.yield
+  }
 }
 ```
 
 There are several aspects worth noticing in this operation.
 
-The first entry block argument is mandatory for top-level transform operations and is associated with the top-level payload operation that sequence is applied to, for example, a module or a function. This operation is specified when calling `applyTransforms`.
+Its special name, `@__transform_main`, and the first argument are mandated by the interpreter pass, similarly to how the entry point of C programs needs to be called `main` and may have the `int (int argc, char** argv)` signature. This argument will be associated with the top-level payload operation, most often the operation that the pass is applied to. Note that none of this is required when applying the transformation _programmatically_ via `applyTransforms` or `applyNamedSequence`.
 
 The remaining entry block arguments are optional and can be associated with payload attributes, operations or values that are useful in the sequence. These are also specified when calling `applyTransforms`. In our case, we are interested in the matrix multiplication and elementwise operations that we are going to tile and fuse.
 
 All value handles have Transform dialect types. These types specify certain properties of the payload IR entities associated with them. In this example, `transform.any_op` indicates that the handle is associated with arbitrary payload operations. On the contrary, `transform.op<"X">` indicates that the handle is associated _only_ with payload operations of kind `X`. These constraints are verified when the handle/payload association is created. For entry block arguments of top-level transform operations, this happens early in the `applyTransforms` function. If the constraints are not satisfied, the transform application fails and produces diagnostics for the user.
 
+Finally, the operation is wrapped in a module with the `transform.with_named_sequence` attribute that triggers all necessary verifications if multiple named sequences exist.
+
 ## Failure Propagation
 
-Speaking about diagnostics, the `sequence` operation itself has a mandatory attribute specifying the failure propagation mode. There are two options:
+The Transform dialect infrastructure has a particular mechanism for handling diagnostics that supports recoverable errors. It is best understood by considering the (unnamed) sequence operation that has a mandatory attribute specifying the failure propagation mode. There are two options:
 
 *   “propagate” makes the sequence transformation fail if any of the nested transformation fails;
 *   “suppress” makes the sequence succeed even if one of the nested transformations fails, but without attempting to perform the transformations following the failed one in the sequence.
 
-This latter allows the transformation to continue despite (recoverable) errors. As we are only building the transformation, it is preferable to propagate failures so we know when something did not apply.
+The latter allows the transformation script surrounding the sequence to continue despite errors within the sequence, assuming they are recoverable. As we are only building the transformation script, it is preferable to propagate failures so we know when something did not apply.
 
 To check or debug a transform sequence, it is possible to print various entities associated with the transform IR values. For example, we can print the operations associated with the handles:
 
@@ -83,27 +87,26 @@ Since we don’t want to recompile the compiler every time we change a transform
 
 
 ```sh
-$ mlir-opt matmul.mlir --pass-pipeline="
-    builtin.module(test-transform-dialect-interpreter{
-        bind-first-extra-to-ops=linalg.matmul
-        bind-second-extra-to-ops=linalg.elemwise_binary})"
+$ mlir-opt sequence.mlir --pass-pipeline="
+    builtin.module(transform-interpreter{
+        debug-bind-trailing-args=linalg.matmul,linalg.elemwise_binary})"
 ```
 
-The `matmul.mlir` file contains _both_ the payload IR function _and_ the transform IR sequence nested in the same module. The transform interpreter will find the first top-level transform operation in the root operation of the pass (the module in our case) and apply it to that root operation. In our case, we also asked the interpreter pass to associate the two extra arguments of the top-level sequence with all `linalg.matmul` and `linalg.elemwise_binary` payload operations through the respective pass options. Running this pass results in the expected remarks:
+The `sequence.mlir` file contains _both_ the payload IR function _and_ the transform IR sequence nested in the same module. The transform interpreter pass will apply the `@__transform_main` named sequence to the anchor operation of the pass. In our case, we also asked the interpreter pass to associate the two extra arguments of the top-level sequence with all `linalg.matmul` and `linalg.elemwise_binary` payload operations through the respective pass options. Running this pass results in the expected remarks:
 
 ```sh
-matmul.mlir:7:13: remark: matmul
+sequence.mlir:7:13: remark: matmul
   %matmul = linalg.matmul ins(%lhs, %rhs: tensor<512x512xf32>, tensor<512x512xf32>)
             ^
-matmul.mlir:7:13: note: see current operation: %0 = linalg.matmul ins(%arg0, %arg1 : tensor<512x512xf32>, tensor<512x512xf32>) outs(%arg3 : tensor<512x512xf32>) -> tensor<512x512xf32>
-matmul.mlir:10:13: remark: elemwise_binaries
+sequence.mlir:7:13: note: see current operation: %0 = linalg.matmul ins(%arg0, %arg1 : tensor<512x512xf32>, tensor<512x512xf32>) outs(%arg3 : tensor<512x512xf32>) -> tensor<512x512xf32>
+sequence.mlir:10:13: remark: elemwise_binaries
   %biased = linalg.elemwise_binary { fun = #linalg.binary_fn<add> }
             ^
-matmul.mlir:10:13: note: see current operation: %1 = linalg.elemwise_binary {fun = #linalg.binary_fn<add>} ins(%0, %arg2 : tensor<512x512xf32>, tensor<512x512xf32>) outs(%arg3 : tensor<512x512xf32>) -> tensor<512x512xf32>
-matmul.mlir:14:13: remark: elemwise_binaries
+sequence.mlir:10:13: note: see current operation: %1 = linalg.elemwise_binary {fun = #linalg.binary_fn<add>} ins(%0, %arg2 : tensor<512x512xf32>, tensor<512x512xf32>) outs(%arg3 : tensor<512x512xf32>) -> tensor<512x512xf32>
+sequence.mlir:14:13: remark: elemwise_binaries
   %relued = linalg.elemwise_binary { fun = #linalg.binary_fn<max_signed> }
             ^
-matmul.mlir:14:13: note: see current operation: %2 = linalg.elemwise_binary {fun = #linalg.binary_fn<max_signed>} ins(%1, %cst : tensor<512x512xf32>, f32) outs(%arg3 : tensor<512x512xf32>) -> tensor<512x512xf32>
+sequence.mlir:14:13: note: see current operation: %2 = linalg.elemwise_binary {fun = #linalg.binary_fn<max_signed>} ins(%1, %cst : tensor<512x512xf32>, f32) outs(%arg3 : tensor<512x512xf32>) -> tensor<512x512xf32>
 ```
 
 Note that `%arg2` is associated with both elementwise payload operations. Any handle is associated with a list of entities. Individual transformations may or may not care about the order of elements in that list.
@@ -114,26 +117,33 @@ Note that `%arg2` is associated with both elementwise payload operations. Any ha
 Now that we have handles to the operations we want to transform, we are ready to apply the transformations. Let us first try tiling the matmul operation itself.
 
 ```mlir
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  // The actual tiling transformation takes tile sizes as attributes.
-  %loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
-    : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
-  transform.yield
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+       %arg0: !transform.any_op,
+       %arg1: !transform.op<"linalg.matmul">,
+       %arg2: !transform.op<"linalg.elemwise_binary">) {
+    // The actual tiling transformation takes tile sizes as attributes.
+    %loop, %tiled = transform.structured.tile_using_forall %arg1
+                    tile_sizes [4, 32]
+      : (!transform.op<"linalg.matmul">)
+     -> (!transform.any_op, !transform.any_op)
+    transform.yield
+  }
 }
 ```
 
-The transformation returns two handles, as indicated in its [documentation](https://mlir.llvm.org/docs/Dialects/Transform/#transformstructuredtile_using_forall-transformtiletoforallop):
+The transformation returns two handles, as indicated in its [documentation](https://mlir.llvm.org/docs/Dialects/Transform/#transformstructuredtile_using_forall-transformtileusingforallop):
 
-*   A handle to the `scf.forall` “multi-for” loop around tensors.
 *   A handle to `linalg.generic` operating on the subset of the original data.
+*   A handle to the `scf.forall` “multi-for” loop around tensors.
 
 Running this transformation with the same command as above expectedly produces the tiled code.
 
 ```mlir
-func.func @fc_relu(%arg0: tensor<512x512xf32>, %arg1: tensor<512x512xf32>, %arg2: tensor<512x512xf32>, %arg3: tensor<512x512xf32>) -> tensor<512x512xf32> {
+func.func @fc_relu(%arg0: tensor<512x512xf32>,
+                   %arg1: tensor<512x512xf32>,
+                   %arg2: tensor<512x512xf32>,
+                   %arg3: tensor<512x512xf32>) -> tensor<512x512xf32> {
   %cst = arith.constant 0.000000e+00 : f32
   %0 = scf.forall (%arg4, %arg5) in (128, 16) shared_outs(%arg6 = %arg3) -> (tensor<512x512xf32>) {
     %3 = affine.apply affine_map<(d0) -> (d0 * 4)>(%arg4)
@@ -144,7 +154,7 @@ func.func @fc_relu(%arg0: tensor<512x512xf32>, %arg1: tensor<512x512xf32>, %arg2
                        : tensor<512x512xf32> to tensor<512x32xf32>
     %extracted_slice_1 = tensor.extract_slice %arg6[%3, %4] [4, 32] [1, 1]
                       : tensor<512x512xf32> to tensor<4x32xf32>
-    %5 = linalg.matmul 
+    %5 = linalg.matmul
          ins(%extracted_slice, %extracted_slice_0
              : tensor<4x512xf32>, tensor<512x32xf32>)
          outs(%extracted_slice_1 : tensor<4x32xf32>) -> tensor<4x32xf32>
@@ -168,78 +178,79 @@ Besides producing new handles, the tiling transform operation _consumes_ the ope
 
 ## Handle Invalidation and Expensive Checks Mode
 
-Undefined behavior is difficult to grapple with when it does happen, so the Transform dialect interpreter provides a set of additional expensive checks that detect most undefined behavior in the transform IR. For example, if we wanted to  use the `%arg1` handle after it is consumed, it would cause undefined behavior that manifests as an assertion in the debug build, and likely as a segmentation fault in the release mode.
+Undefined behavior is difficult to grapple with when it does happen, so the Transform dialect interpreter defaults to performing a set of additional, potentially expensive, checks that detect most undefined behavior in the transform IR. For example, if we wanted to use the `%arg1` handle after it is consumed, it would cause undefined behavior that manifests as an assertion in the debug build, and likely as a segmentation fault in the release mode.
 
 ```mlir
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  // The actual tiling transformation takes tile sizes as attributes.
-  %loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
-      : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
-
-  // This is trying to use an invalidated handle leading to undefined behavior.
-  transform.debug.emit_remark_at %arg1, "remark" : !transform.op<"linalg.matmul">
-  transform.yield
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+       %arg0: !transform.any_op,
+       %arg1: !transform.op<"linalg.matmul">,
+       %arg2: !transform.op<"linalg.elemwise_binary">) {
+    // The actual tiling transformation takes tile sizes as attributes.
+    %loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
+        : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
+
+    // This is trying to use an invalidated handle leading to undefined behavior.
+    transform.debug.emit_remark_at %arg1, "remark" : !transform.op<"linalg.matmul">
+    transform.yield
+  }
 }
 ```
 
 However, with the expensive checks enabled in the interpreter, a nice diagnostic is produced:
 
 ```sh
-$ mlir-opt matmul.mlir --pass-pipeline="
-    builtin.module(test-transform-dialect-interpreter{
-        bind-first-extra-to-ops=linalg.matmul
-        bind-second-extra-to-ops=linalg.elemwise_binary
-        enable-expensive-checks})"
-```
-
-```sh
-matmul.mlir:28:3: error: op uses a handle invalidated by a previously executed transform op
+sequence.mlir:28:3: error: op uses a handle invalidated by a previously executed transform op
   transform.debug.emit_remark_at %mm, "elemwise_binaries" : !transform.any_op
   ^
-matmul.mlir:26:9: note: handle to invalidated ops
+sequence.mlir:26:9: note: handle to invalidated ops
   %mm = transform.cast %matmul : !transform.op<"linalg.matmul"> to !transform.any_op
         ^
-matmul.mlir:27:19: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
+sequence.mlir:27:19: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
   %loop, %tiled = transform.structured.tile_using_forall %mm tile_sizes [4, 32]
 ```
 
-One may observe that some operations such as `transform.cast` do not consume the operand (because they don’t erase the corresponding operation). So what would happen if we tried to use that operand instead? 
+When compile-time performance is a concern, and the transformation sequence is sufficiently stable, it is possible to disable expensive checks in the interpreter for improved performance by providing the `disable-expensive-checks` option to the pass or by setting the corresponding flag in the `TransformOptions` passed into `applyTransforms`.
+
+One may observe that some operations such as `transform.cast` do not consume the operand (because they don’t erase the corresponding operation). So what would happen if we tried to use that operand instead?
 
 ```mlir
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  // We can cast one type to another as long as operations are compatible
-  // with both types. This creates "aliasing" handles.
-  %casted = transform.cast %arg1 : !transform.op<"linalg.matmul">
-      to !transform.any_op
-
-  // The actual tiling transformation takes tile sizes as attributes.
-  %loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
-    : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
-
-  // Consuming an operand invalidates the consumed handle and any other handle that is
-  // associated with the same payload operations, or payload operations nested in them.
-  transform.debug.emit_remark_at %casted, "remark"
-    : !transform.any_op
-  transform.yield
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+       %arg0: !transform.any_op,
+       %arg1: !transform.op<"linalg.matmul">,
+       %arg2: !transform.op<"linalg.elemwise_binary">) {
+    // We can cast one type to another as long as operations are compatible
+    // with both types. This creates "aliasing" handles.
+    %casted = transform.cast %arg1 : !transform.op<"linalg.matmul">
+        to !transform.any_op
+
+    // The actual tiling transformation takes tile sizes as attributes.
+    %loop, %tiled = transform.structured.tile_using_forall %arg1
+                    tile_sizes [4, 32]
+      : (!transform.op<"linalg.matmul">)
+     -> (!transform.any_op, !transform.any_op)
+
+    // Consuming an operand invalidates the consumed handle and any other handle
+    // that is associated with the same payload operations, or payload
+    // operations nested in them.
+    transform.debug.emit_remark_at %casted, "remark"
+      : !transform.any_op
+    transform.yield
+  }
 }
 ```
 
 Both `%arg1` and `%casted` reference the same payload operation. Extending the reference analogy, these references alias. Naturally, when the payload operation is erased, all references to it become dangling. This is also the case for handles. In fact, consuming an operand invalidates the operand handle as well as any other handle that is associated with any of the same payload operations. The payload IR consideration is recursive: a handle associated with a payload operation _nested_ in the erased one is also invalidated (because e...
[truncated]

``````````
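The two failure propagation modes described in the patched Ch1.md text can be illustrated with a toy model. This is only a sketch in Python, not the Transform dialect implementation: `run_sequence` and the callables it takes are hypothetical names, each "transform" is reduced to a function returning `True` on success, and only recoverable failures are modeled.

```python
from typing import Callable, List, Tuple


def run_sequence(transforms: List[Callable[[], bool]], mode: str) -> Tuple[bool, int]:
    """Toy model of a sequence with a failure propagation mode.

    Each transform is a hypothetical callable returning True on success.
    Returns (sequence_succeeded, number_of_transforms_attempted).
    """
    if mode not in ("propagate", "suppress"):
        raise ValueError(f"unknown failure propagation mode: {mode}")
    attempted = 0
    for transform in transforms:
        attempted += 1
        if not transform():
            # In both modes the transforms after the failed one are skipped.
            # "propagate" reports the failure; "suppress" reports success.
            return (mode == "suppress", attempted)
    return (True, attempted)


if __name__ == "__main__":
    succeed = lambda: True
    fail = lambda: False
    print(run_sequence([succeed, fail, succeed], "propagate"))
    print(run_sequence([succeed, fail, succeed], "suppress"))
```

In both modes the third transform is never attempted; the modes differ only in what the enclosing script observes.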

</details>
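The handle invalidation rule discussed in the patch (consuming an operand invalidates that handle, every aliasing handle, and every handle to payload operations nested in the consumed ones) can be sketched with a small toy model. The classes `PayloadOp` and `ToyInterpreter` below are hypothetical and exist only for illustration; they do not correspond to any MLIR API, and the error text merely echoes the diagnostic quoted in the diff.

```python
class PayloadOp:
    """Toy payload operation with an optional parent (enclosing) operation."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def is_same_or_nested_in(self, other):
        # Walk up the parent chain looking for `other`.
        op = self
        while op is not None:
            if op is other:
                return True
            op = op.parent
        return False


class ToyInterpreter:
    """Tracks handle-to-payload associations and invalidated handles."""

    def __init__(self):
        self.handles = {}      # handle name -> list of payload ops
        self.invalid = set()   # names of invalidated handles

    def bind(self, handle, ops):
        self.handles[handle] = list(ops)

    def use(self, handle):
        if handle in self.invalid:
            raise RuntimeError(
                f"op uses a handle invalidated by a previously "
                f"executed transform op: {handle}")
        return self.handles[handle]

    def consume(self, handle):
        consumed = self.use(handle)
        self.invalid.add(handle)
        # Invalidate aliasing handles and handles to nested payload ops.
        for other, ops in self.handles.items():
            if any(op.is_same_or_nested_in(c) for op in ops for c in consumed):
                self.invalid.add(other)


if __name__ == "__main__":
    func = PayloadOp("func.func")
    matmul = PayloadOp("linalg.matmul", parent=func)
    interp = ToyInterpreter()
    interp.bind("%arg0", [func])
    interp.bind("%arg1", [matmul])
    interp.bind("%casted", [matmul])  # aliases %arg1
    interp.consume("%arg1")
    try:
        interp.use("%casted")
    except RuntimeError as err:
        print(err)
```

Note that `%arg0` stays valid in this model: it points to the enclosing function, which is not nested in the consumed matmul.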


https://github.com/llvm/llvm-project/pull/81199

