[Mlir-commits] [mlir] [mlir] update transform dialect tutorials (PR #81199)

Oleksandr Alex Zinenko llvmlistbot at llvm.org
Thu Feb 8 14:23:23 PST 2024


https://github.com/ftynse created https://github.com/llvm/llvm-project/pull/81199

Use the "main" transform-interpreter pass instead of the test pass. This, along with the previously introduced debug extension, allows the tutorials to no longer depend on test passes and extensions.
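For readers skimming the patch, the net effect on the tutorial command lines is small: the test-only pass with its two ad-hoc binding options is replaced by the main interpreter pass with a single debug option. A minimal before/after sketch, using the file and op names from the Ch1 changes in this patch:

```sh
# Before: test-only interpreter pass with per-argument binding options.
$ mlir-opt sequence.mlir --pass-pipeline="
    builtin.module(test-transform-dialect-interpreter{
        bind-first-extra-to-ops=linalg.matmul
        bind-second-extra-to-ops=linalg.elemwise_binary})"

# After: the "main" interpreter pass; the trailing entry-block arguments of
# @__transform_main are bound through one debug option instead.
$ mlir-opt sequence.mlir --pass-pipeline="
    builtin.module(transform-interpreter{
        debug-bind-trailing-args=linalg.matmul,linalg.elemwise_binary})"
```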

From 8193e2f4896edfdd874bb0257201aeccb156dbac Mon Sep 17 00:00:00 2001
From: Alex Zinenko <zinenko at google.com>
Date: Thu, 8 Feb 2024 22:21:47 +0000
Subject: [PATCH] [mlir] update transform dialect tutorials

Use the "main" transform-interpreter pass instead of the test pass.
This, along with the previously introduced debug extension, allows
tutorials to no longer depend on test passes and extensions.
---
 mlir/docs/Tutorials/transform/Ch1.md          | 347 +++++++++---------
 mlir/docs/Tutorials/transform/Ch2.md          | 202 +++++-----
 mlir/docs/Tutorials/transform/Ch3.md          |  12 +-
 mlir/docs/Tutorials/transform/Ch4.md          |   2 +-
 .../Ch2/transform-opt/transform-opt.cpp       |  22 +-
 .../Ch3/transform-opt/transform-opt.cpp       |  26 +-
 .../Ch4/transform-opt/transform-opt.cpp       |  12 -
 .../Dialect/Transform/Transforms/Passes.td    |   4 +
 .../Transforms/TransformInterpreterUtils.h    |   5 +
 .../Dialect/Transform/Utils/RaggedArray.h     |   3 +
 .../Transform/Transforms/InterpreterPass.cpp  |  24 +-
 .../Transforms/TransformInterpreterUtils.cpp  |  36 +-
 .../transform/Ch1/invalidation-1.mlir         |  75 ++--
 .../transform/Ch1/invalidation-2.mlir         |  18 +-
 .../test/Examples/transform/Ch1/sequence.mlir | 105 +++---
 mlir/test/Examples/transform/Ch2/invalid.mlir |  10 +-
 mlir/test/Examples/transform/Ch2/ops.mlir     |  15 +-
 .../test/Examples/transform/Ch2/sequence.mlir |  99 ++---
 mlir/test/Examples/transform/Ch3/invalid.mlir |  10 +-
 mlir/test/Examples/transform/Ch3/ops.mlir     |  28 +-
 .../test/Examples/transform/Ch3/sequence.mlir | 113 +++---
 mlir/test/Examples/transform/ChH/full.mlir    |   6 +-
 22 files changed, 615 insertions(+), 559 deletions(-)

diff --git a/mlir/docs/Tutorials/transform/Ch1.md b/mlir/docs/Tutorials/transform/Ch1.md
index 7a299a48600b8f..b0fdf085854c7f 100644
--- a/mlir/docs/Tutorials/transform/Ch1.md
+++ b/mlir/docs/Tutorials/transform/Ch1.md
@@ -6,7 +6,7 @@ The Transform dialect allows one to precisely target transformations at specific
 
 Transform IR operations operate on values that may be associated with payload IR operations, values or attributes. We call the first two kinds of values operation and value handles, respectively. We call the last kind of values parameters.
 
-The application of transform IR always starts from one top-level operation. In the C++ API, this operation is passed to the `applyTransforms` function. This top-level operation specifies if other transformations should be performed and how. The most common top-level operation merely applies other transform operations listed in its body one after the other.
+The application of transform IR always starts from one top-level operation. In the C++ API, this operation is passed to the `applyTransforms` function. This top-level operation specifies if other transformations should be performed and how. The most common top-level operation, `transform.named_sequence`, merely applies other transform operations listed in its body one after the other, similarly to a function or a macro.
 
 Let us illustrate this with a simple sequence of transformations on the common “fully connected + bias + ReLU” ML layer, which boils down to performing a matrix multiplication, followed by an (elementwise) matrix addition and taking an elementwise maximum with 0. This can be expressed using the following IR:
 
@@ -14,7 +14,7 @@ Let us illustrate this with a simple sequence of transformations on the common 
 func.func @fc_relu(%lhs: tensor<512x512xf32>, %rhs: tensor<512x512xf32>,
                    %bias: tensor<512x512xf32>, %output: tensor<512x512xf32>)
                    -> tensor<512x512xf32> {
-  // Matrix-matrix multiplication.  
+  // Matrix-matrix multiplication.
   %matmul = linalg.matmul ins(%lhs, %rhs: tensor<512x512xf32>, tensor<512x512xf32>)
                           outs(%output: tensor<512x512xf32>) -> tensor<512x512xf32>
 
@@ -22,7 +22,7 @@ func.func @fc_relu(%lhs: tensor<512x512xf32>, %rhs: tensor<512x512xf32>,
   %biased = linalg.elemwise_binary { fun = #linalg.binary_fn<add> }
     ins(%matmul, %bias : tensor<512x512xf32>, tensor<512x512xf32>)
     outs(%output : tensor<512x512xf32>) -> tensor<512x512xf32>
-  
+
   // Elementwise max with 0 (ReLU).
   %c0f = arith.constant 0.0 : f32
   %relued = linalg.elemwise_binary { fun = #linalg.binary_fn<max_signed> }
@@ -37,30 +37,34 @@ func.func @fc_relu(%lhs: tensor<512x512xf32>, %rhs: tensor<512x512xf32>,
 For performance reasons, we would like to tile and fuse these operations to exploit cache locality. This is a sequence of transformations that need to be performed one after another, so we naturally start with the corresponding top-level transform operation.
 
 ```mlir
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  transform.yield
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+      %arg0: !transform.any_op,
+      %arg1: !transform.op<"linalg.matmul">,
+      %arg2: !transform.op<"linalg.elemwise_binary">) {
+    transform.yield
+  }
 }
 ```
 
 There are several aspects worth noticing in this operation.
 
-The first entry block argument is mandatory for top-level transform operations and is associated with the top-level payload operation that sequence is applied to, for example, a module or a function. This operation is specified when calling `applyTransforms`.
+Its special name, `@__transform_main`, and the first argument are mandated by the interpreter pass, similarly to how the entry point of C programs needs to be called `main` and may have the `int (int argc, char** argv)` signature. This argument will be associated with the top-level payload operation, most often the operation that the pass is applied to. Note that none of this is required when applying the transformation _programmatically_ via `applyTransforms` or `applyNamedSequence`.
 
 The remaining entry block arguments are optional and can be associated with payload attributes, operations or values that are useful in the sequence. These are also specified when calling `applyTransforms`. In our case, we are interested in the matrix multiplication and elementwise operations that we are going to tile and fuse.
 
 All value handles have Transform dialect types. These types specify certain properties of the payload IR entities associated with them. In this example, `transform.any_op` indicates that the handle is associated with arbitrary payload operations. On the contrary, `transform.op<"X">` indicates that the handle is associated _only_ with payload operations of kind `X`. These constraints are verified when the handle/payload association is created. For entry block arguments of top-level transform operations, this happens early in the `applyTransforms` function. If the constraints are not satisfied, the transform application fails and produces diagnostics for the user.
 
+Finally, the operation is wrapped in a module with the `transform.with_named_sequence` attribute that triggers all necessary verifications if multiple named sequences exist.
+
 ## Failure Propagation
 
-Speaking about diagnostics, the `sequence` operation itself has a mandatory attribute specifying the failure propagation mode. There are two options:
+The Transform dialect infrastructure has a particular mechanism for handling diagnostics that supports recoverable errors. It is best understood by considering the (unnamed) sequence operation that has a mandatory attribute specifying the failure propagation mode. There are two options:
 
 *   “propagate” makes the sequence transformation fail if any of the nested transformations fails;
 *   “suppress” makes the sequence succeed even if one of the nested transformations fails, but without attempting to perform the transformations following the failed one in the sequence.
 
-This latter allows the transformation to continue despite (recoverable) errors. As we are only building the transformation, it is preferable to propagate failures so we know when something did not apply.
+The latter allows the transformation script surrounding the sequence to continue despite errors within the sequence, assuming they are recoverable. As we are only building the transformation script, it is preferable to propagate failures so that we know when something did not apply.
 
 To check or debug a transform sequence, it is possible to print various entities associated with the transform IR values. For example, we can print the operations associated with the handles:
 
@@ -83,27 +87,26 @@ Since we don’t want to recompile the compiler every time we change a transform
 
 
 ```sh
-$ mlir-opt matmul.mlir --pass-pipeline="
-    builtin.module(test-transform-dialect-interpreter{
-        bind-first-extra-to-ops=linalg.matmul
-        bind-second-extra-to-ops=linalg.elemwise_binary})"
+$ mlir-opt sequence.mlir --pass-pipeline="
+    builtin.module(transform-interpreter{
+        debug-bind-trailing-args=linalg.matmul,linalg.elemwise_binary})"
 ```
 
-The `matmul.mlir` file contains _both_ the payload IR function _and_ the transform IR sequence nested in the same module. The transform interpreter will find the first top-level transform operation in the root operation of the pass (the module in our case) and apply it to that root operation. In our case, we also asked the interpreter pass to associate the two extra arguments of the top-level sequence with all `linalg.matmul` and `linalg.elemwise_binary` payload operations through the respective pass options. Running this pass results in the expected remarks:
+The `sequence.mlir` file contains _both_ the payload IR function _and_ the transform IR sequence nested in the same module. The transform interpreter pass will apply the `@__transform_main` named sequence to the anchor operation of the pass. In our case, we also asked the interpreter pass to associate the two extra arguments of the top-level sequence with all `linalg.matmul` and `linalg.elemwise_binary` payload operations through the respective pass options. Running this pass results in the expected remarks:
 
 ```sh
-matmul.mlir:7:13: remark: matmul
+sequence.mlir:7:13: remark: matmul
   %matmul = linalg.matmul ins(%lhs, %rhs: tensor<512x512xf32>, tensor<512x512xf32>)
             ^
-matmul.mlir:7:13: note: see current operation: %0 = linalg.matmul ins(%arg0, %arg1 : tensor<512x512xf32>, tensor<512x512xf32>) outs(%arg3 : tensor<512x512xf32>) -> tensor<512x512xf32>
-matmul.mlir:10:13: remark: elemwise_binaries
+sequence.mlir:7:13: note: see current operation: %0 = linalg.matmul ins(%arg0, %arg1 : tensor<512x512xf32>, tensor<512x512xf32>) outs(%arg3 : tensor<512x512xf32>) -> tensor<512x512xf32>
+sequence.mlir:10:13: remark: elemwise_binaries
   %biased = linalg.elemwise_binary { fun = #linalg.binary_fn<add> }
             ^
-matmul.mlir:10:13: note: see current operation: %1 = linalg.elemwise_binary {fun = #linalg.binary_fn<add>} ins(%0, %arg2 : tensor<512x512xf32>, tensor<512x512xf32>) outs(%arg3 : tensor<512x512xf32>) -> tensor<512x512xf32>
-matmul.mlir:14:13: remark: elemwise_binaries
+sequence.mlir:10:13: note: see current operation: %1 = linalg.elemwise_binary {fun = #linalg.binary_fn<add>} ins(%0, %arg2 : tensor<512x512xf32>, tensor<512x512xf32>) outs(%arg3 : tensor<512x512xf32>) -> tensor<512x512xf32>
+sequence.mlir:14:13: remark: elemwise_binaries
   %relued = linalg.elemwise_binary { fun = #linalg.binary_fn<max_signed> }
             ^
-matmul.mlir:14:13: note: see current operation: %2 = linalg.elemwise_binary {fun = #linalg.binary_fn<max_signed>} ins(%1, %cst : tensor<512x512xf32>, f32) outs(%arg3 : tensor<512x512xf32>) -> tensor<512x512xf32>
+sequence.mlir:14:13: note: see current operation: %2 = linalg.elemwise_binary {fun = #linalg.binary_fn<max_signed>} ins(%1, %cst : tensor<512x512xf32>, f32) outs(%arg3 : tensor<512x512xf32>) -> tensor<512x512xf32>
 ```
 
 Note that `%arg2` is associated with both elementwise payload operations. Any handle is associated with a list of entities. Individual transformations may or may not care about the order of elements in that list.
@@ -114,26 +117,33 @@ Note that `%arg2` is associated with both elementwise payload operations. Any ha
 Now that we have handles to the operations we want to transform, we are ready to apply the transformations. Let us first try tiling the matmul operation itself.
 
 ```mlir
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  // The actual tiling transformation takes tile sizes as attributes.
-  %loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
-    : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
-  transform.yield
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+       %arg0: !transform.any_op,
+       %arg1: !transform.op<"linalg.matmul">,
+       %arg2: !transform.op<"linalg.elemwise_binary">) {
+    // The actual tiling transformation takes tile sizes as attributes.
+    %tiled, %loop = transform.structured.tile_using_forall %arg1
+                    tile_sizes [4, 32]
+      : (!transform.op<"linalg.matmul">)
+     -> (!transform.any_op, !transform.any_op)
+    transform.yield
+  }
 }
 ```
 
-The transformation returns two handles, as indicated in its [documentation](https://mlir.llvm.org/docs/Dialects/Transform/#transformstructuredtile_using_forall-transformtiletoforallop):
+The transformation returns two handles, as indicated in its [documentation](https://mlir.llvm.org/docs/Dialects/Transform/#transformstructuredtile_using_forall-transformtileusingforallop):
 
-*   A handle to the `scf.forall` “multi-for” loop around tensors.
 *   A handle to `linalg.generic` operating on the subset of the original data.
+*   A handle to the `scf.forall` “multi-for” loop around tensors.
 
 Running this transformation with the same command as above produces the tiled code, as expected.
 
 ```mlir
-func.func @fc_relu(%arg0: tensor<512x512xf32>, %arg1: tensor<512x512xf32>, %arg2: tensor<512x512xf32>, %arg3: tensor<512x512xf32>) -> tensor<512x512xf32> {
+func.func @fc_relu(%arg0: tensor<512x512xf32>,
+                   %arg1: tensor<512x512xf32>,
+                   %arg2: tensor<512x512xf32>,
+                   %arg3: tensor<512x512xf32>) -> tensor<512x512xf32> {
   %cst = arith.constant 0.000000e+00 : f32
   %0 = scf.forall (%arg4, %arg5) in (128, 16) shared_outs(%arg6 = %arg3) -> (tensor<512x512xf32>) {
     %3 = affine.apply affine_map<(d0) -> (d0 * 4)>(%arg4)
@@ -144,7 +154,7 @@ func.func @fc_relu(%arg0: tensor<512x512xf32>, %arg1: tensor<512x512xf32>, %arg2
                        : tensor<512x512xf32> to tensor<512x32xf32>
     %extracted_slice_1 = tensor.extract_slice %arg6[%3, %4] [4, 32] [1, 1]
                       : tensor<512x512xf32> to tensor<4x32xf32>
-    %5 = linalg.matmul 
+    %5 = linalg.matmul
          ins(%extracted_slice, %extracted_slice_0
              : tensor<4x512xf32>, tensor<512x32xf32>)
          outs(%extracted_slice_1 : tensor<4x32xf32>) -> tensor<4x32xf32>
@@ -168,78 +178,79 @@ Besides producing new handles, the tiling transform operation _consumes_ the ope
 
 ## Handle Invalidation and Expensive Checks Mode
 
-Undefined behavior is difficult to grapple with when it does happen, so the Transform dialect interpreter provides a set of additional expensive checks that detect most undefined behavior in the transform IR. For example, if we wanted to  use the `%arg1` handle after it is consumed, it would cause undefined behavior that manifests as an assertion in the debug build, and likely as a segmentation fault in the release mode.
+Undefined behavior is difficult to grapple with when it does happen, so the Transform dialect interpreter defaults to performing a set of additional, potentially expensive, checks that detect most undefined behavior in the transform IR. For example, if we wanted to use the `%arg1` handle after it is consumed, it would cause undefined behavior that manifests as an assertion in the debug build, and likely as a segmentation fault in release mode.
 
 ```mlir
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  // The actual tiling transformation takes tile sizes as attributes.
-  %loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
-      : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
-
-  // This is trying to use an invalidated handle leading to undefined behavior.
-  transform.debug.emit_remark_at %arg1, "remark" : !transform.op<"linalg.matmul">
-  transform.yield
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+       %arg0: !transform.any_op,
+       %arg1: !transform.op<"linalg.matmul">,
+       %arg2: !transform.op<"linalg.elemwise_binary">) {
+    // The actual tiling transformation takes tile sizes as attributes.
+    %tiled, %loop = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
+        : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
+
+    // This is trying to use an invalidated handle leading to undefined behavior.
+    transform.debug.emit_remark_at %arg1, "remark" : !transform.op<"linalg.matmul">
+    transform.yield
+  }
 }
 ```
 
 However, with the expensive checks enabled in the interpreter, a nice diagnostic is produced:
 
 ```sh
-$ mlir-opt matmul.mlir --pass-pipeline="
-    builtin.module(test-transform-dialect-interpreter{
-        bind-first-extra-to-ops=linalg.matmul
-        bind-second-extra-to-ops=linalg.elemwise_binary
-        enable-expensive-checks})"
-```
-
-```sh
-matmul.mlir:28:3: error: op uses a handle invalidated by a previously executed transform op
+sequence.mlir:28:3: error: op uses a handle invalidated by a previously executed transform op
   transform.debug.emit_remark_at %mm, "elemwise_binaries" : !transform.any_op
   ^
-matmul.mlir:26:9: note: handle to invalidated ops
+sequence.mlir:26:9: note: handle to invalidated ops
   %mm = transform.cast %matmul : !transform.op<"linalg.matmul"> to !transform.any_op
         ^
-matmul.mlir:27:19: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
+sequence.mlir:27:19: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
   %loop, %tiled = transform.structured.tile_using_forall %mm tile_sizes [4, 32]
 ```
 
-One may observe that some operations such as `transform.cast` do not consume the operand (because they don’t erase the corresponding operation). So what would happen if we tried to use that operand instead? 
+When compile-time performance is a concern, and the transformation sequence is sufficiently stable, it is possible to disable expensive checks in the interpreter for improved performance by providing the `disable-expensive-checks` option to the pass or by setting the corresponding flag in the `TransformOptions` passed into `applyTransforms`.
+
+One may observe that some operations such as `transform.cast` do not consume the operand (because they don’t erase the corresponding operation). So what would happen if we tried to use that operand instead?
 
 ```mlir
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  // We can cast one type to another as long as operations are compatible
-  // with both types. This creates "aliasing" handles.
-  %casted = transform.cast %arg1 : !transform.op<"linalg.matmul">
-      to !transform.any_op
-
-  // The actual tiling transformation takes tile sizes as attributes.
-  %loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
-    : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
-
-  // Consuming an operand invalidates the consumed handle and any other handle that is
-  // associated with the same payload operations, or payload operations nested in them.
-  transform.debug.emit_remark_at %casted, "remark"
-    : !transform.any_op
-  transform.yield
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+       %arg0: !transform.any_op,
+       %arg1: !transform.op<"linalg.matmul">,
+       %arg2: !transform.op<"linalg.elemwise_binary">) {
+    // We can cast one type to another as long as operations are compatible
+    // with both types. This creates "aliasing" handles.
+    %casted = transform.cast %arg1 : !transform.op<"linalg.matmul">
+        to !transform.any_op
+
+    // The actual tiling transformation takes tile sizes as attributes.
+    %loop, %tiled = transform.structured.tile_using_forall %arg1
+                    tile_sizes [4, 32]
+      : (!transform.op<"linalg.matmul">)
+     -> (!transform.any_op, !transform.any_op)
+
+    // Consuming an operand invalidates the consumed handle and any other handle
+    // that is associated with the same payload operations, or payload
+    // operations nested in them.
+    transform.debug.emit_remark_at %casted, "remark"
+      : !transform.any_op
+    transform.yield
+  }
 }
 ```
 
 Both `%arg1` and `%casted` reference the same payload operation. Extending the reference analogy, these references alias. Naturally, when the payload operation is erased, all references to it become dangling. This is also the case for handles. In fact, consuming an operand invalidates the operand handle as well as any other handle that is associated with any of the same payload operations. The payload IR consideration is recursive: a handle associated with a payload operation _nested_ in the erased one is also invalidated (because erasing the operation also erases its regions and all contained operations). The expensive-checks mode can also handle this case.
 
 ```sh
-matmul.mlir:28:3: error: op uses a handle invalidated by a previously executed transform op
+sequence.mlir:28:3: error: op uses a handle invalidated by a previously executed transform op
   transform.debug.emit_remark_at %matmul, "elemwise_binaries" : !transform.op<"linalg.matmul">
   ^
-matmul.mlir:21:29: note: handle to invalidated ops
+sequence.mlir:21:29: note: handle to invalidated ops
 ^bb0(%root: !transform.any_op, %matmul: !transform.op<"linalg.matmul">, %elemwise: !transform.op<"linalg.elemwise_binary">):
                             ^
-matmul.mlir:27:19: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
+sequence.mlir:27:19: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
   %loop, %tiled = transform.structured.tile_using_forall %mm tile_sizes [4, 32]
 ```
 
@@ -248,39 +259,41 @@ matmul.mlir:27:19: note: invalidated by this transform op that consumes its oper
 Going back to the transformation sequence, we have tiled the matrix multiplication, but we also want to tile and fuse the elementwise operations. The typical way of doing this in the structured operations paradigm is to tile the last operation in some acyclic dataflow graph, and then progressively fuse the operations that produce its operands. This removes the need to explicitly tile all operations as fusion can adapt their sizes and inject recomputation if desired. So instead of tiling the matmul operation, we are going to tile the last operation in the chain, and then fuse the preceding operations into the loops produced by tiling.
 
 ```mlir
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  // Since the %arg2 handle is associated with both elementwise operations,
-  // we need to split it into two handles so we can target only the second
-  // elementwise operation.
-  %add, %max = transform.split_handle %arg2
-      : (!transform.op<"linalg.elemwise_binary">)
-      -> (!transform.any_op, !transform.any_op)
-
-  // The actual tiling transformation takes tile sizes as attributes. It
-  // produces a handle to the loop generated during tiling.
-  %tiled_max, %loop =
-      transform.structured.tile_using_forall %max tile_sizes [8, 32]
-        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-
-  // We can now fuse the other operations into the loop. Here, we fuse
-  // operations one by one. This requires the operation that is being fused to
-  // define the value used within the loop, so the order of such fusions is
-  // important. We could also use "transform.merge_handles" to obtain a single
-  // handle to all operations and give it to `fuse_into_containing_op` that
-  // would take care of the ordering in this case.
-  %add_fused, %loop_0 =
-      transform.structured.fuse_into_containing_op %add into %loop
-        : (!transform.any_op, !transform.any_op)
-          -> (!transform.any_op, !transform.any_op)
-  %matmul_fused, %loop_1 =
-      transform.structured.fuse_into_containing_op %arg1 into %loop_0
-        : (!transform.op<"linalg.matmul">, !transform.any_op)
-          -> (!transform.any_op, !transform.any_op)
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+       %arg0: !transform.any_op,
+       %arg1: !transform.op<"linalg.matmul">,
+       %arg2: !transform.op<"linalg.elemwise_binary">) {
+    // Since the %arg2 handle is associated with both elementwise operations,
+    // we need to split it into two handles so we can target only the second
+    // elementwise operation.
+    %add, %max = transform.split_handle %arg2
+        : (!transform.op<"linalg.elemwise_binary">)
+        -> (!transform.any_op, !transform.any_op)
 
-  transform.yield
+    // The actual tiling transformation takes tile sizes as attributes. It
+    // produces a handle to the loop generated during tiling.
+    %tiled_max, %loop =
+        transform.structured.tile_using_forall %max tile_sizes [8, 32]
+          : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+
+    // We can now fuse the other operations into the loop. Here, we fuse
+    // operations one by one. This requires the operation that is being fused to
+    // define the value used within the loop, so the order of such fusions is
+    // important. We could also use "transform.merge_handles" to obtain a single
+    // handle to all operations and give it to `fuse_into_containing_op` that
+    // would take care of the ordering in this case.
+    %add_fused, %loop_0 =
+        transform.structured.fuse_into_containing_op %add into %loop
+          : (!transform.any_op, !transform.any_op)
+            -> (!transform.any_op, !transform.any_op)
+    %matmul_fused, %loop_1 =
+        transform.structured.fuse_into_containing_op %arg1 into %loop_0
+          : (!transform.op<"linalg.matmul">, !transform.any_op)
+            -> (!transform.any_op, !transform.any_op)
+
+    transform.yield
+  }
 }
 ```
 
@@ -291,64 +304,68 @@ This achieves the desired tiling and fusion.
 Finally, let us assume there exists an efficient microkernel, or a hardware instruction expressed as an intrinsic function, for a 4x4 matrix multiplication. For this purpose, we need to tile the fused operation to the desired size, and then outline it. The resulting function call can then be replaced with a call to the microkernel.
 
 ```mlir
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  // Since the %arg2 handle is associated with both elementwise operations,
-  // we need to split it into two handles so we can target only the second
-  // elementwise operation.
-  %add, %max = transform.split_handle %arg2
-      : (!transform.op<"linalg.elemwise_binary">)
-        -> (!transform.any_op, !transform.any_op)
-
-  // The actual tiling transformation takes tile sizes as attributes. It
-  // produces a handle to the loop generated during tiling.
-  %tiled, %loop  = transform.structured.tile_using_forall %max tile_sizes [8, 32]
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-
-  // We can now fuse the other operations into the loop. Here, we fuse
-  // operations one by one. This requires the operation that is being fused to
-  // define the value used within the loop, so the order of such fusions is
-  // important. We could also use "transform.merge_handles" to obtain a single
-  // handle to all operations and give it to `fuse_into_containing_op` that
-  // would take care of the ordering in this case.
-  %add_fused, %loop_0 =
-      transform.structured.fuse_into_containing_op %add into %loop
-        : (!transform.any_op, !transform.any_op)
-          -> (!transform.any_op, !transform.any_op)
-  %matmul_fused, %loop_1 =
-      transform.structured.fuse_into_containing_op %arg1 into %loop_0
-        : (!transform.op<"linalg.matmul">, !transform.any_op)
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+       %arg0: !transform.any_op,
+       %arg1: !transform.op<"linalg.matmul">,
+       %arg2: !transform.op<"linalg.elemwise_binary">) {
+    // Since the %arg2 handle is associated with both elementwise operations,
+    // we need to split it into two handles so we can target only the second
+    // elementwise operation.
+    %add, %max = transform.split_handle %arg2
+        : (!transform.op<"linalg.elemwise_binary">)
           -> (!transform.any_op, !transform.any_op)
 
-  // Tile again to get the desired size. Note that this time this tiles the
-  // "add" operation and fuses matmul into the loop, but doesn't affect the
-  // "max" operation. This illustrates the precise targeting with the transform
-  // dialect. Otherwise, it is difficult to differentiate "add" and "max", both
-  // of which having the same kind.
-  %tiled_2, %loop_2 =
-      transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
+    // The actual tiling transformation takes tile sizes as attributes. It
+    // produces a handle to the loop generated during tiling.
+    %tiled, %loop = transform.structured.tile_using_forall %max
+                    tile_sizes [8, 32]
         : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-  %matmul_fused_2, %loop_3 =
-      transform.structured.fuse_into_containing_op %matmul_fused into %loop_2
-        : (!transform.any_op, !transform.any_op)
-          -> (!transform.any_op, !transform.any_op)
 
-  // Since outlining is currently only implemented for region-holding operations
-  // such as loops, use tiling to size 1 to materialize the outer loop that is
-  // going to be outlined.
-  %_, %outline_target =
-      transform.structured.tile_using_forall %tiled_2 tile_sizes [1]
-        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-  transform.structured.fuse_into_containing_op %matmul_fused_2
-      into %outline_target
-        : (!transform.any_op, !transform.any_op)
-          -> (!transform.any_op, !transform.any_op)
-  %func, %call = transform.loop.outline %outline_target {func_name = "outlined"}
-      : (!transform.any_op) -> (!transform.any_op, !transform.op<"func.call">)
-
-  transform.yield
+    // We can now fuse the other operations into the loop. Here, we fuse
+    // operations one by one. This requires the operation that is being fused to
+    // define the value used within the loop, so the order of such fusions is
+    // important. We could also use "transform.merge_handles" to obtain a single
+    // handle to all operations and give it to `fuse_into_containing_op` that
+    // would take care of the ordering in this case.
+    %add_fused, %loop_0 =
+        transform.structured.fuse_into_containing_op %add into %loop
+          : (!transform.any_op, !transform.any_op)
+            -> (!transform.any_op, !transform.any_op)
+    %matmul_fused, %loop_1 =
+        transform.structured.fuse_into_containing_op %arg1 into %loop_0
+          : (!transform.op<"linalg.matmul">, !transform.any_op)
+            -> (!transform.any_op, !transform.any_op)
+
+    // Tile again to get the desired size. Note that this time the tiling
+    // targets the "add" operation and fuses matmul into the loop, but doesn't
+    // affect the "max" operation. This illustrates the precise targeting with
+    // the transform dialect. Otherwise, it is difficult to differentiate "add"
+    // and "max", both of which have the same kind.
+    %tiled_2, %loop_2 =
+        transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
+          : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+    %matmul_fused_2, %loop_3 =
+        transform.structured.fuse_into_containing_op %matmul_fused into %loop_2
+          : (!transform.any_op, !transform.any_op)
+            -> (!transform.any_op, !transform.any_op)
+
+    // Since outlining is currently only implemented for region-holding
+    // operations such as loops, use tiling to size 1 to materialize the outer
+    // loop that is going to be outlined.
+    %_, %outline_target =
+        transform.structured.tile_using_forall %tiled_2 tile_sizes [1]
+          : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+    transform.structured.fuse_into_containing_op %matmul_fused_2
+        into %outline_target
+          : (!transform.any_op, !transform.any_op)
+            -> (!transform.any_op, !transform.any_op)
+    %func, %call = transform.loop.outline %outline_target
+                   {func_name = "outlined"}
+        : (!transform.any_op) -> (!transform.any_op, !transform.op<"func.call">)
+
+    transform.yield
+  }
 }
 ```
 
diff --git a/mlir/docs/Tutorials/transform/Ch2.md b/mlir/docs/Tutorials/transform/Ch2.md
index ac6d7d42523e41..1aaefd2f2c3075 100644
--- a/mlir/docs/Tutorials/transform/Ch2.md
+++ b/mlir/docs/Tutorials/transform/Ch2.md
@@ -10,37 +10,40 @@ The Transform dialect uses the dialect extension mechanism to allow additional o
 // In MyExtension.cpp.
 #include "mlir/Dialect/Transform/IR/TransformDialect.h"
 
-// Define a new Transform dialect extension. This uses the CRTP idiom to identify
-// extensions.
+// Define a new Transform dialect extension. This uses the CRTP idiom to
+// identify extensions.
 class MyExtension : public ::mlir::transform::TransformDialectExtension<MyExtension> {
 public:
   // The extension must derive the base constructor.
   using Base::Base;
 
-  // This function initializes the extension, similarly to `initialize` in dialect 
-  // definitions. List individual operations and dependent dialects here.
+  // This function initializes the extension, similarly to `initialize` in
+  // dialect definitions. List individual operations and dependent dialects
+  // here.
   void init();
 };
 
 void MyExtension::init() {
-  // Similarly to dialects, an extension can declare a dependent dialect. This dialect 
-  // will be loaded along with the extension and, therefore, along with the Transform 
-  // dialect. Only declare as dependent the dialects that contain the attributes or 
-  // types used by transform operations. Do NOT declare as dependent the dialects 
-  // produced during the transformation.
+  // Similarly to dialects, an extension can declare a dependent dialect. This
+  // dialect will be loaded along with the extension and, therefore, along with
+  // the Transform dialect. Only declare as dependent the dialects that contain
+  // the attributes or types used by transform operations. Do NOT declare as
+  // dependent the dialects produced during the transformation.
+  //
   // declareDependentDialect<MyDialect>();
 
-  // When transformations are applied, they may produce new operations from previously
-  // unloaded dialects. Typically, a pass would need to declare itself dependent on
-  // the dialects containing such new operations. To avoid confusion with the dialects
-  // the extension itself depends on, the Transform dialects differentiates between:
+  // When transformations are applied, they may produce new operations from
+  // previously unloaded dialects. Typically, a pass would need to declare
+  // itself dependent on the dialects containing such new operations. To avoid
+  // confusion with the dialects the extension itself depends on, the Transform
+  // dialect differentiates between:
   //   - dependent dialects, which are used by the transform operations, and
-  //   - generated dialects, which contain the entities (attributes, operations, 
-  //     types) that may be produced by applying the transformation even when not
-  //     present in the original payload IR.
-  // In the following chapter, we will be add operations that generate function calls
-  // and structured control flow operations, so let's declare the corresponding
-  // dialects as generated.
+  //   - generated dialects, which contain the entities (attributes, operations,
+  //     types) that may be produced by applying the transformation even when
+  //     not present in the original payload IR.
+  // In the following chapter, we will add operations that generate function
+  // calls and structured control flow operations, so let's declare the
+  // corresponding dialects as generated.
   declareGeneratedDialect<::mlir::scf::SCFDialect>();
   declareGeneratedDialect<::mlir::func::FuncDialect>();
 
@@ -89,7 +92,7 @@ mlir_tablegen(MyExtension.cpp.inc -gen-op-defs)
 # Add a CMakeTarget we can depend on to ensure the generation happens before the compilation.
 add_public_tablegen_target(MyExtensionIncGen)
 
-# Don't forget to generate the documentation, this will produce a MyExtension.md under 
+# Don't forget to generate the documentation, this will produce a MyExtension.md under
 # Dialects.
 add_mlir_doc(MyExtension MyExtension Dialects/ -gen-op-doc)
 ```
@@ -103,7 +106,8 @@ add_mlir_library(
   # Built from the following source files.
   MyExtension.cpp
 
-  # Make sure ODS declaration and definitions are generated before compiling this.
+  # Make sure ODS declaration and definitions are generated before compiling
+  # this.
   DEPENDS
   MyExtensionIncGen
 
@@ -136,10 +140,10 @@ This will generate two files, `MyExtension.h.inc` and `MyExtension.cpp.inc`, tha
 void MyExtension::init() {
   // …
 
-  // Finally, we register the additional transform operations with the dialect. List all 
-  // operations generated from ODS. This call will perform additional checks that the 
-  // operations implement the transform and memory effect interfaces required by the 
-  // dialect interpreter and assert if they do not.
+  // Finally, we register the additional transform operations with the dialect.
+  // List all operations generated from ODS. This call will perform additional
+  // checks that the operations implement the transform and memory effect
+  // interfaces required by the dialect interpreter and assert if they do not.
   registerTransformOps<
 #define GET_OP_LIST
 #include "MyExtension.cpp.inc"
@@ -154,34 +158,36 @@ With this setup, we are now ready to define the new transform operation to rewri
 ```tablegen
 // In MyExtension.td.
 
-// Define the new operation. By convention, prefix its name with the name of the dialect 
-// extension, "my.". The full operation name will be further prefixed with "transform.".
+// Define the new operation. By convention, prefix its name with the name of the
+// dialect extension, "my.". The full operation name will be further prefixed
+// with "transform.".
 def ChangeCallTargetOp : Op<Transform_Dialect, "my.change_call_target",
-    // Indicate that the operation implements the required TransformOpInterface and
-    // MemoryEffectsOpInterface.
+    // Indicate that the operation implements the required TransformOpInterface
+    // and MemoryEffectsOpInterface.
     [DeclareOpInterfaceMethods<TransformOpInterface>,
      DeclareOpInterfaceMethods<MemoryEffectsOpInterface>]> {
-  // Provide a brief and a full description. It is recommended that the latter describes 
-  // the effects on the operands and how the operation processes various failure modes.
+  // Provide a brief and a full description. It is recommended that the latter
+  // describes the effects on the operands and how the operation processes
+  // various failure modes.
   let summary = "Changes the callee of a call operation to the specified one";
   let description = [{
-    For each `func.call` payload operation associated with the handle, changes its 
-    callee to be the symbol whose name is provided as an attribute to this operation.
+    For each `func.call` payload operation associated with the handle, changes
+    its callee to be the symbol whose name is provided as an attribute to this
+    operation.
 
-    Generates a silenceable failure if the operand is associated with payload operations 
-    that are not `func.call`.
-    Only reads the operand.
+    Generates a silenceable failure if the operand is associated with payload
+    operations that are not `func.call`. Only reads the operand.
   }];
 
-  // The arguments include the handle to the payload operations and the attribute that 
-  // specifies the new callee. The handle must implement TransformHandleTypeInterface.   
-  // We use a string attribute as the symbol may not exist in the transform IR so the 
-  // verification may fail. 
+  // The arguments include the handle to the payload operations and the
+  // attribute that specifies the new callee. The handle must implement
+  // TransformHandleTypeInterface.
+  // We use a string attribute as the symbol may not exist in the transform IR,
+  // in which case the verification would fail.
   let arguments = (ins
     TransformHandleTypeInterface:$call,
     StrAttr:$new_target);
 
-  // The results are empty as the transformation does not produce any new payload.
+  // The results are empty as the transformation does not produce any new
+  // payload.
   let results = (outs);
 
   // Provide nice syntax.
@@ -224,8 +230,8 @@ must be modified with the provided rewriter.
     // It can also carry additional user-defined state.
     ::mlir::transform::TransformState &state) {
 
-  // First, we need to obtain the list of payload operations that are associated with
-  // the operand handle.
+  // First, we need to obtain the list of payload operations that are associated
+  // with the operand handle.
   auto payload = state.getPayloadOps(getCall());
 
   // Then, we iterate over the list of operands and call the actual IR-mutating
@@ -280,56 +286,66 @@ void registerMyExtension(::mlir::DialectRegistry &registry) {
 After registering the extension, it becomes possible to use our new operation in the Transform dialect interpreter. The upstream testing pass can be used as is.
 
 ```mlir
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  // Since the %arg2 handle is associated with both elementwise operations,
-  // we need to split it into two handles so we can target only the second
-  // elementwise operation.
-  %add, %max = transform.split_handle %arg2 : (!transform.op<"linalg.elemwise_binary">)
-      -> (!transform.any_op, !transform.any_op)
-
-  // The actual tiling transformation takes tile sizes as attributes. It produces a
-  // handle to the loop generated during tiling.
-  %loop, %tiled = transform.structured.tile_using_forall %max tile_sizes [8, 32]
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-
-  // We can now fuse the other operations into the loop. Here, we fuse
-  // operations one-by-one. This requires the operation that is being fused
-  // to define the value used within the loop, so the order of such fusions
-  // is important. We could also use "transform.merge_handles" to obtain
-  // a single handle to all operations and give it to `fuse_into_containing_op`
-  // that would take care of the ordering in this case.
-  %add_fused = transform.structured.fuse_into_containing_op %add into %loop
-      : (!transform.any_op, !transform.any_op) -> !transform.any_op
-  %matmul_fused = transform.structured.fuse_into_containing_op %arg1 into %loop
-      : (!transform.op<"linalg.matmul">, !transform.any_op) -> !transform.any_op
-
-  // Tile again to get the desired size. Note that this time this tiles the
-  // "add" operation and fuses matmul into the loop, but doesn't affect the
-  // "max" operation. This illustrates the precise targeting with the transform
-  // dialect. Otherwise, it is difficult to differentiate "add" and "max", both
-  // of which having the same kind.
-  %loop_2, %tiled_2 = transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-  %matmul_fused_2 = transform.structured.fuse_into_containing_op %matmul_fused into %loop_2
-      : (!transform.any_op, !transform.any_op) -> !transform.any_op
-
-  // Since outlining is currently only implemented for region-holding operations
-  // such as loops, use tiling to size 1 to materialize the outer loop that is
-  // going to be outlined.
-  %outline_target, %_ = transform.structured.tile_using_forall %tiled_2 tile_sizes [1]
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-  transform.structured.fuse_into_containing_op %matmul_fused_2 into %outline_target
-      : (!transform.any_op, !transform.any_op) -> !transform.any_op
-  %func, %call = transform.loop.outline %outline_target {func_name = "outlined"}
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-
-  // Rewrite the call target.
-  transform.my.change_call_target %call, "microkernel" : !transform.any_op
-
-  transform.yield
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+      %arg0: !transform.any_op,
+      %arg1: !transform.op<"linalg.matmul">,
+      %arg2: !transform.op<"linalg.elemwise_binary">) {
+    // Since the %arg2 handle is associated with both elementwise operations,
+    // we need to split it into two handles so we can target only the second
+    // elementwise operation.
+    %add, %max = transform.split_handle %arg2
+        : (!transform.op<"linalg.elemwise_binary">)
+        -> (!transform.any_op, !transform.any_op)
+
+    // The actual tiling transformation takes tile sizes as attributes. It
+    // produces a handle to the loop generated during tiling.
+    %loop, %tiled = transform.structured.tile_using_forall %max
+                    tile_sizes [8, 32]
+        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+
+    // We can now fuse the other operations into the loop. Here, we fuse
+    // operations one by one. This requires the operation that is being fused
+    // to define the value used within the loop, so the order of such fusions
+    // is important. We could also use "transform.merge_handles" to obtain
+    // a single handle to all operations and give it to
+    // `fuse_into_containing_op` that would take care of the ordering in this
+    // case.
+    %add_fused = transform.structured.fuse_into_containing_op %add into %loop
+        : (!transform.any_op, !transform.any_op) -> !transform.any_op
+    %matmul_fused = transform.structured.fuse_into_containing_op %arg1
+                    into %loop
+        : (!transform.op<"linalg.matmul">, !transform.any_op)
+        -> !transform.any_op
+
+    // Tile again to get the desired size. Note that this time the tiling
+    // targets the "add" operation and fuses matmul into the loop, but doesn't
+    // affect the "max" operation. This illustrates the precise targeting with
+    // the transform dialect. Otherwise, it is difficult to differentiate "add"
+    // and "max", both of which have the same kind.
+    %loop_2, %tiled_2 = transform.structured.tile_using_forall %add_fused
+                        tile_sizes [4, 4]
+        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+    %matmul_fused_2 = transform.structured.fuse_into_containing_op %matmul_fused
+                      into %loop_2
+        : (!transform.any_op, !transform.any_op) -> !transform.any_op
+
+    // Since outlining is currently only implemented for region-holding
+    // operations such as loops, use tiling to size 1 to materialize the outer
+    // loop that is going to be outlined.
+    %outline_target, %_ = transform.structured.tile_using_forall %tiled_2
+                          tile_sizes [1]
+        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+    transform.structured.fuse_into_containing_op %matmul_fused_2
+        into %outline_target
+        : (!transform.any_op, !transform.any_op) -> !transform.any_op
+    %func, %call = transform.loop.outline %outline_target
+                   {func_name = "outlined"}
+        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+
+    // Rewrite the call target.
+    transform.my.change_call_target %call, "microkernel" : !transform.any_op
+
+    transform.yield
+  }
 }
 ```
 
diff --git a/mlir/docs/Tutorials/transform/Ch3.md b/mlir/docs/Tutorials/transform/Ch3.md
index 84251df383d83f..fa788d13e2055e 100644
--- a/mlir/docs/Tutorials/transform/Ch3.md
+++ b/mlir/docs/Tutorials/transform/Ch3.md
@@ -79,7 +79,7 @@ def CallOpInterfaceHandle
       // The type must implement `TransformHandleTypeInterface`.
       [DeclareTypeInterfaceMethods<TransformHandleTypeInterface>]> {
 
-  // The usual components of a type such as description, mnemonic and assembly format 
+  // The usual components of a type such as description, mnemonic and assembly format
   // should be provided.
   let summary = "handle to payload operations implementing CallOpInterface";
   let mnemonic = "my.call_op_interface";
@@ -87,7 +87,7 @@ def CallOpInterfaceHandle
 }
 ```
 
-We will omit the generation of declaration and definitions using Tablegen for brevity as it is identical to the regular case. 
+We will omit the generation of declarations and definitions using Tablegen for brevity as it is identical to the regular case.
 
 To finalize the definition of a transform type, one must implement the interface methods.
 
@@ -109,9 +109,9 @@ mlir::transform::CallOpInterfaceHandleType::checkPayload(
     if (llvm::isa<mlir::CallOpInterface>(op))
       continue;
 
-    // By convention, these verifiers always emit a silenceable failure since they are 
+    // By convention, these verifiers always emit a silenceable failure since they are
     // checking a precondition.
-    DiagnosedSilenceableFailure diag = emitSilenceableError(loc) 
+    DiagnosedSilenceableFailure diag = emitSilenceableError(loc)
         << "expected the payload operation to implement CallOpInterface";
     diag.attachNote(op->getLoc()) << "offending operation";
     return diag;
@@ -129,8 +129,8 @@ Additional attributes and types need to be registered in the extension, next to
 // In MyExtension.cpp.
 
 void MyExtension::init() {
-  // …
-  
+  // ...
+
   registerTypes<
 #define GET_TYPEDEF_LIST
 #include "MyExtensionTypes.cpp.inc"
diff --git a/mlir/docs/Tutorials/transform/Ch4.md b/mlir/docs/Tutorials/transform/Ch4.md
index 9c9aba19d5745c..ad5221c6f6ccaf 100644
--- a/mlir/docs/Tutorials/transform/Ch4.md
+++ b/mlir/docs/Tutorials/transform/Ch4.md
@@ -205,7 +205,7 @@ transform.named_sequence @__transform_main(
     %root: !transform.any_op {transform.readonly}) {
   // Collect groups of operations that match the criteria specified in the
   // named sequence.
-  %matmul, %el1, %el2 = transform.collect_matching @match_matmul_elemwise in %root 
+  %matmul, %el1, %el2 = transform.collect_matching @match_matmul_elemwise in %root
     : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op)
   %elemwise = transform.merge_handles %el1, %el2 : !transform.any_op
 
diff --git a/mlir/examples/transform/Ch2/transform-opt/transform-opt.cpp b/mlir/examples/transform/Ch2/transform-opt/transform-opt.cpp
index 3a975313f93ff0..874ad78c7e8377 100644
--- a/mlir/examples/transform/Ch2/transform-opt/transform-opt.cpp
+++ b/mlir/examples/transform/Ch2/transform-opt/transform-opt.cpp
@@ -12,6 +12,7 @@
 
 #include "MyExtension.h"
 
+#include "mlir/Dialect/Transform/Transforms/Passes.h"
 #include "mlir/IR/DialectRegistry.h"
 #include "mlir/IR/MLIRContext.h"
 #include "mlir/InitAllDialects.h"
@@ -20,14 +21,6 @@
 #include "mlir/Transforms/Passes.h"
 #include <cstdlib>
 
-// Forward declarations of test passes that used in this chapter for
-// illustrative purposes. Test passes are not directly exposed for use in
-// binaries other than mlir-opt, which is too big to serve as an example.
-namespace mlir::test {
-void registerTestTransformDialectEraseSchedulePass();
-void registerTestTransformDialectInterpreterPass();
-} // namespace mlir::test
-
 namespace test {
 void registerTestTransformDialectExtension(mlir::DialectRegistry &);
 } // namespace test
@@ -39,22 +32,15 @@ int main(int argc, char **argv) {
   mlir::registerAllExtensions(registry);
   registerMyExtension(registry);
 
+  // Register transform interpreter pass.
+  mlir::transform::registerInterpreterPass();
+
   // Register a handful of cleanup passes that we can run to make the output IR
   // look nicer.
   mlir::registerCanonicalizerPass();
   mlir::registerCSEPass();
   mlir::registerSymbolDCEPass();
 
-  // Register the test passes.
-#ifdef MLIR_INCLUDE_TESTS
-  mlir::test::registerTestTransformDialectEraseSchedulePass();
-  mlir::test::registerTestTransformDialectInterpreterPass();
-  test::registerTestTransformDialectExtension(registry);
-#else
-  llvm::errs() << "warning: MLIR built without test passes, interpreter "
-                  "testing will not be available\n";
-#endif // MLIR_INCLUDE_TESTS
-
   // Delegate to the MLIR utility for parsing and pass management.
   return mlir::MlirOptMain(argc, argv, "transform-opt-ch2", registry)
                  .succeeded()
diff --git a/mlir/examples/transform/Ch3/transform-opt/transform-opt.cpp b/mlir/examples/transform/Ch3/transform-opt/transform-opt.cpp
index 3c348c663abad4..c9150c64a7163d 100644
--- a/mlir/examples/transform/Ch3/transform-opt/transform-opt.cpp
+++ b/mlir/examples/transform/Ch3/transform-opt/transform-opt.cpp
@@ -12,6 +12,7 @@
 
 #include "MyExtension.h"
 
+#include "mlir/Dialect/Transform/Transforms/Passes.h"
 #include "mlir/IR/DialectRegistry.h"
 #include "mlir/IR/MLIRContext.h"
 #include "mlir/InitAllDialects.h"
@@ -20,18 +21,6 @@
 #include "mlir/Transforms/Passes.h"
 #include <cstdlib>
 
-// Forward declarations of test passes that used in this chapter for
-// illustrative purposes. Test passes are not directly exposed for use in
-// binaries other than mlir-opt, which is too big to serve as an example.
-namespace mlir::test {
-void registerTestTransformDialectEraseSchedulePass();
-void registerTestTransformDialectInterpreterPass();
-} // namespace mlir::test
-
-namespace test {
-void registerTestTransformDialectExtension(mlir::DialectRegistry &);
-} // namespace test
-
 int main(int argc, char **argv) {
   // Register all "core" dialects and our transform dialect extension.
   mlir::DialectRegistry registry;
@@ -39,22 +28,15 @@ int main(int argc, char **argv) {
   mlir::registerAllExtensions(registry);
   registerMyExtension(registry);
 
+  // Register the interpreter pass.
+  mlir::transform::registerInterpreterPass();
+
   // Register a handful of cleanup passes that we can run to make the output IR
   // look nicer.
   mlir::registerCanonicalizerPass();
   mlir::registerCSEPass();
   mlir::registerSymbolDCEPass();
 
-  // Register the test passes.
-#ifdef MLIR_INCLUDE_TESTS
-  mlir::test::registerTestTransformDialectEraseSchedulePass();
-  mlir::test::registerTestTransformDialectInterpreterPass();
-  test::registerTestTransformDialectExtension(registry);
-#else
-  llvm::errs() << "warning: MLIR built without test passes, interpreter "
-                  "testing will not be available\n";
-#endif // MLIR_INCLUDE_TESTS
-
   // Delegate to the MLIR utility for parsing and pass management.
   return mlir::MlirOptMain(argc, argv, "transform-opt-ch3", registry)
                  .succeeded()
diff --git a/mlir/examples/transform/Ch4/transform-opt/transform-opt.cpp b/mlir/examples/transform/Ch4/transform-opt/transform-opt.cpp
index 10190664b51cdf..03c84bdbccb8ba 100644
--- a/mlir/examples/transform/Ch4/transform-opt/transform-opt.cpp
+++ b/mlir/examples/transform/Ch4/transform-opt/transform-opt.cpp
@@ -21,10 +21,6 @@
 #include "mlir/Transforms/Passes.h"
 #include <cstdlib>
 
-namespace test {
-void registerTestTransformDialectExtension(mlir::DialectRegistry &);
-} // namespace test
-
 int main(int argc, char **argv) {
   // Register all "core" dialects and our transform dialect extension.
   mlir::DialectRegistry registry;
@@ -39,14 +35,6 @@ int main(int argc, char **argv) {
   mlir::registerSymbolDCEPass();
   mlir::transform::registerInterpreterPass();
 
-  // Register the test passes.
-#ifdef MLIR_INCLUDE_TESTS
-  test::registerTestTransformDialectExtension(registry);
-#else
-  llvm::errs() << "warning: MLIR built without test extension, interpreter "
-                  "testing will not be available\n";
-#endif // MLIR_INCLUDE_TESTS
-
   // Delegate to the MLIR utility for parsing and pass management.
   return mlir::MlirOptMain(argc, argv, "transform-opt-ch4", registry)
                  .succeeded()
diff --git a/mlir/include/mlir/Dialect/Transform/Transforms/Passes.td b/mlir/include/mlir/Dialect/Transform/Transforms/Passes.td
index c3436fd6824270..1d6eb24156e334 100644
--- a/mlir/include/mlir/Dialect/Transform/Transforms/Passes.td
+++ b/mlir/include/mlir/Dialect/Transform/Transforms/Passes.td
@@ -75,6 +75,10 @@ def InterpreterPass : Pass<"transform-interpreter"> {
            "Select the operation with 'transform.target_tag' attribute having "
            "the given value as payload IR root. If empty select the pass "
            "anchor operation as the payload IR root.">,
+    ListOption<"debugBindTrailingArgs", "debug-bind-trailing-args",
+               "std::string",
+               "Binds trailing arguments of the entry point to the payload "
+               "operations with specified names.">,
     Option<"disableExpensiveChecks", "disable-expensive-checks", "bool",
            "false",
            "Disable expensive checks in the interpreter for a faster run.">,
diff --git a/mlir/include/mlir/Dialect/Transform/Transforms/TransformInterpreterUtils.h b/mlir/include/mlir/Dialect/Transform/Transforms/TransformInterpreterUtils.h
index 1737d72838e9b3..738e0c533c6b51 100644
--- a/mlir/include/mlir/Dialect/Transform/Transforms/TransformInterpreterUtils.h
+++ b/mlir/include/mlir/Dialect/Transform/Transforms/TransformInterpreterUtils.h
@@ -84,6 +84,11 @@ LogicalResult applyTransformNamedSequence(Operation *payload,
                                           ModuleOp transformModule,
                                           const TransformOptions &options);
 
+LogicalResult applyTransformNamedSequence(RaggedArray<MappedValue> bindings,
+                                          TransformOpInterface transformRoot,
+                                          ModuleOp transformModule,
+                                          const TransformOptions &options);
+
 } // namespace transform
 } // namespace mlir
 
diff --git a/mlir/include/mlir/Dialect/Transform/Utils/RaggedArray.h b/mlir/include/mlir/Dialect/Transform/Utils/RaggedArray.h
index 0ee23914fa4e1d..3d4083bdd073e4 100644
--- a/mlir/include/mlir/Dialect/Transform/Utils/RaggedArray.h
+++ b/mlir/include/mlir/Dialect/Transform/Utils/RaggedArray.h
@@ -150,6 +150,9 @@ class RaggedArray {
     slices.resize(slices.size() + num, std::pair<size_t, size_t>(-1, 0));
   }
 
+  /// Removes the first subarray in-place. Invalidates iterators to all rows.
+  void removeFront() { slices.erase(slices.begin()); }
+
 private:
   /// Appends the given elements to the storage and returns an ArrayRef
   /// pointing to them in the storage.
diff --git a/mlir/lib/Dialect/Transform/Transforms/InterpreterPass.cpp b/mlir/lib/Dialect/Transform/Transforms/InterpreterPass.cpp
index c875519945b921..5073234a7e35e9 100644
--- a/mlir/lib/Dialect/Transform/Transforms/InterpreterPass.cpp
+++ b/mlir/lib/Dialect/Transform/Transforms/InterpreterPass.cpp
@@ -7,6 +7,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "mlir/Dialect/Transform/IR/TransformDialect.h"
+#include "mlir/Dialect/Transform/IR/TransformInterfaces.h"
 #include "mlir/Dialect/Transform/Transforms/Passes.h"
 #include "mlir/Dialect/Transform/Transforms/TransformInterpreterUtils.h"
 
@@ -64,6 +65,20 @@ class InterpreterPass
         transform::detail::getPreloadedTransformModule(context);
     Operation *payloadRoot =
         findPayloadRoot(getOperation(), debugPayloadRootTag);
+    if (!payloadRoot)
+      return signalPassFailure();
+    auto debugBindNames = llvm::map_to_vector(
+        debugBindTrailingArgs,
+        [&](const std::string &name) { return OperationName(name, context); });
+    SmallVector<SmallVector<Operation *>, 2> trailingBindings;
+    trailingBindings.resize(debugBindNames.size());
+    payloadRoot->walk([&](Operation *payload) {
+      for (auto &&[position, name] : llvm::enumerate(debugBindNames)) {
+        if (payload->getName() == name)
+          trailingBindings[position].push_back(payload);
+      }
+    });
+
     Operation *transformEntryPoint = transform::detail::findTransformEntryPoint(
         getOperation(), transformModule, entryPoint);
     if (!transformEntryPoint) {
@@ -73,8 +88,15 @@ class InterpreterPass
       return signalPassFailure();
     }
 
+    RaggedArray<transform::MappedValue> bindings;
+    bindings.push_back(ArrayRef<Operation *>{payloadRoot});
+    for (SmallVector<Operation *> &trailing : trailingBindings)
+      bindings.push_back(std::move(trailing));
+
     if (failed(transform::applyTransformNamedSequence(
-            payloadRoot, transformEntryPoint, transformModule,
+            bindings,
+            cast<transform::TransformOpInterface>(transformEntryPoint),
+            transformModule,
             options.enableExpensiveChecks(!disableExpensiveChecks)))) {
       return signalPassFailure();
     }
diff --git a/mlir/lib/Dialect/Transform/Transforms/TransformInterpreterUtils.cpp b/mlir/lib/Dialect/Transform/Transforms/TransformInterpreterUtils.cpp
index 2f74b76f07b77b..8a9cd7c52d82c9 100644
--- a/mlir/lib/Dialect/Transform/Transforms/TransformInterpreterUtils.cpp
+++ b/mlir/lib/Dialect/Transform/Transforms/TransformInterpreterUtils.cpp
@@ -191,22 +191,46 @@ LogicalResult transform::detail::assembleTransformLibraryFromPaths(
 LogicalResult transform::applyTransformNamedSequence(
     Operation *payload, Operation *transformRoot, ModuleOp transformModule,
     const TransformOptions &options) {
+  RaggedArray<MappedValue> bindings;
+  bindings.push_back(ArrayRef<Operation *>{payload});
+  return applyTransformNamedSequence(bindings,
+                                     cast<TransformOpInterface>(transformRoot),
+                                     transformModule, options);
+}
+
+LogicalResult transform::applyTransformNamedSequence(
+    RaggedArray<MappedValue> bindings, TransformOpInterface transformRoot,
+    ModuleOp transformModule, const TransformOptions &options) {
+  if (bindings.empty()) {
+    return transformRoot.emitError()
+           << "expected at least one binding for the root";
+  }
+  if (bindings.at(0).size() != 1) {
+    return transformRoot.emitError()
+           << "expected one payload to be bound to the first argument, got "
+           << bindings.at(0).size();
+  }
+  auto *payloadRoot = bindings.at(0).front().dyn_cast<Operation *>();
+  if (!payloadRoot) {
+    return transformRoot->emitError() << "expected the object bound to the "
+                                         "first argument to be an operation";
+  }
+
+  bindings.removeFront();
+
   // `transformModule` may not be modified.
   if (transformModule && !transformModule->isAncestor(transformRoot)) {
     OwningOpRef<Operation *> clonedTransformModule(transformModule->clone());
     if (failed(detail::mergeSymbolsInto(
             SymbolTable::getNearestSymbolTable(transformRoot),
             std::move(clonedTransformModule)))) {
-      return payload->emitError() << "failed to merge symbols";
+      return payloadRoot->emitError() << "failed to merge symbols";
     }
   }
 
   LLVM_DEBUG(DBGS() << "Apply\n" << *transformRoot << "\n");
-  LLVM_DEBUG(DBGS() << "To\n" << *payload << "\n");
+  LLVM_DEBUG(DBGS() << "To\n" << *payloadRoot << "\n");
 
-  // Apply the transform to the IR, do not enforce top-level constraints.
-  RaggedArray<MappedValue> noExtraMappings;
-  return applyTransforms(payload, cast<TransformOpInterface>(transformRoot),
-                         noExtraMappings, options,
+  return applyTransforms(payloadRoot, transformRoot, bindings, options,
                          /*enforceToplevelTransformOp=*/false);
 }
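
For reviewers following along: the entry point that the `transform-interpreter` pass locates is a `transform.named_sequence` called `__transform_main` inside a module carrying the `transform.with_named_sequence` attribute. With the new `debug-bind-trailing-args=linalg.matmul,linalg.elemwise_binary` option, the interpreter binds the payload root to the first argument and the listed payload ops, in order, to the trailing arguments. A minimal skeleton (mirroring the migrated tests below; argument names are illustrative) looks like:

```mlir
module attributes {transform.with_named_sequence} {
  transform.named_sequence @__transform_main(
      // The payload root is always bound to the first argument.
      %root: !transform.any_op,
      // Trailing arguments receive the payload ops named in
      // debug-bind-trailing-args, in the order they were listed.
      %matmuls: !transform.op<"linalg.matmul">,
      %binaries: !transform.op<"linalg.elemwise_binary">) {
    transform.yield
  }
}
```
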
diff --git a/mlir/test/Examples/transform/Ch1/invalidation-1.mlir b/mlir/test/Examples/transform/Ch1/invalidation-1.mlir
index 69b10aed49e31a..2264ade7f9b77c 100644
--- a/mlir/test/Examples/transform/Ch1/invalidation-1.mlir
+++ b/mlir/test/Examples/transform/Ch1/invalidation-1.mlir
@@ -1,8 +1,7 @@
 // RUN: mlir-opt %s \
-// RUN:   --pass-pipeline="builtin.module(test-transform-dialect-interpreter{ \
-// RUN:        bind-first-extra-to-ops=linalg.matmul \
-// RUN:        bind-second-extra-to-ops=linalg.elemwise_binary \
-// RUN:        enable-expensive-checks},canonicalize,cse,symbol-dce)" \
+// RUN:   --pass-pipeline="builtin.module(transform-interpreter{ \
+// RUN:        debug-bind-trailing-args=linalg.matmul,linalg.elemwise_binary},\
+// RUN:        canonicalize,cse,symbol-dce)" \
 // RUN:   --split-input-file --verify-diagnostics
 
 // ****************************** IMPORTANT NOTE ******************************
@@ -12,20 +11,22 @@
 //
 // ****************************************************************************
 
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     // expected-note @below {{handle to invalidated ops}}
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  // The actual tiling transformation takes tile sizes as attributes.
-  // expected-note @below {{invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them}}
-  %tiled, %loop = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
-      : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+      %arg0: !transform.any_op,
+      // expected-note @below {{handle to invalidated ops}}
+      %arg1: !transform.op<"linalg.matmul">,
+      %arg2: !transform.op<"linalg.elemwise_binary">) {
+    // The actual tiling transformation takes tile sizes as attributes.
+    // expected-note @below {{invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them}}
+    %tiled, %loop = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
+        : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
 
-  // This is trying to use an invalidated handle leading to undefined behavior.
-  // expected-error @below {{uses a handle invalidated by a previously executed transform op}}
-  transform.debug.emit_remark_at %arg1, "remark" : !transform.op<"linalg.matmul">
-  transform.yield
+    // Attempting to use an invalidated handle leads to undefined behavior.
+    // expected-error @below {{uses a handle invalidated by a previously executed transform op}}
+    transform.debug.emit_remark_at %arg1, "remark" : !transform.op<"linalg.matmul">
+    transform.yield
+  }
 }
 
 // Original function to optimize.
@@ -52,27 +53,29 @@ func.func @fc_relu(%lhs: tensor<512x512xf32>, %rhs: tensor<512x512xf32>,
 
 // -----
 
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  // We can cast one type to another as long as operations are compatible
-  // with both types. This creates "aliasing" handles.
-  // expected-note @below {{handle to invalidated ops}}
-  %casted = transform.cast %arg1 : !transform.op<"linalg.matmul"> to
-      !transform.any_op
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+      %arg0: !transform.any_op,
+      %arg1: !transform.op<"linalg.matmul">,
+      %arg2: !transform.op<"linalg.elemwise_binary">) {
+    // We can cast one type to another as long as operations are compatible
+    // with both types. This creates "aliasing" handles.
+    // expected-note @below {{handle to invalidated ops}}
+    %casted = transform.cast %arg1 : !transform.op<"linalg.matmul"> to
+        !transform.any_op
 
-  // The actual tiling transformation takes tile sizes as attributes.
-  // expected-note @below {{invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them}}
-  %tiled, %loop = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
-    : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
+    // The actual tiling transformation takes tile sizes as attributes.
+    // expected-note @below {{invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them}}
+    %tiled, %loop = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
+      : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
 
-  // Consuming an operand invalidates the consumed handle and any other handle that is
-  // associated with the same payload operations, or payload operations nested in them.
-  // expected-error @below {{uses a handle invalidated by a previously executed transform op}}
-  transform.debug.emit_remark_at %casted, "remark"
-    : !transform.any_op
-  transform.yield
+    // Consuming an operand invalidates the consumed handle and any other handle that is
+    // associated with the same payload operations, or payload operations nested in them.
+    // expected-error @below {{uses a handle invalidated by a previously executed transform op}}
+    transform.debug.emit_remark_at %casted, "remark"
+      : !transform.any_op
+    transform.yield
+  }
 }
 
 // Original function to optimize.
diff --git a/mlir/test/Examples/transform/Ch1/invalidation-2.mlir b/mlir/test/Examples/transform/Ch1/invalidation-2.mlir
index c4a2f1eea46c08..0a84a5cb6c68a5 100644
--- a/mlir/test/Examples/transform/Ch1/invalidation-2.mlir
+++ b/mlir/test/Examples/transform/Ch1/invalidation-2.mlir
@@ -1,10 +1,8 @@
 // RUN: mlir-opt %s \
-// RUN:   --pass-pipeline="builtin.module(test-transform-dialect-interpreter{ \
-// RUN:        bind-first-extra-to-ops=linalg.matmul \
-// RUN:        bind-second-extra-to-ops=linalg.elemwise_binary \
-// RUN:        enable-expensive-checks},canonicalize,cse,symbol-dce)" \
+// RUN:   --pass-pipeline="builtin.module(transform-interpreter{ \
+// RUN:        debug-bind-trailing-args=linalg.matmul,linalg.elemwise_binary},\
+// RUN:        canonicalize,cse,symbol-dce)" \
 // RUN:   --split-input-file --verify-diagnostics
-
 // ****************************** IMPORTANT NOTE ******************************
 //
 // If you are changing this file, you may also need to change
@@ -45,10 +43,11 @@ func.func private @microkernel(
     %init: tensor<4x4xf32>,
     %output: tensor<4x4xf32>) -> tensor<4x4xf32>
 
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+      %arg0: !transform.any_op,
+      %arg1: !transform.op<"linalg.matmul">,
+      %arg2: !transform.op<"linalg.elemwise_binary">) {
   // Since the %arg2 handle is associated with both elementwise operations,
   // we need to split it into two handles so we can target only the second
   // elementwise operation.
@@ -99,4 +98,5 @@ transform.sequence failures(propagate) {
   transform.debug.emit_remark_at %f, "fused" : !transform.any_op
 
   transform.yield
+  }
 }
diff --git a/mlir/test/Examples/transform/Ch1/sequence.mlir b/mlir/test/Examples/transform/Ch1/sequence.mlir
index 5de6e6e096f482..3107adccf78fdc 100644
--- a/mlir/test/Examples/transform/Ch1/sequence.mlir
+++ b/mlir/test/Examples/transform/Ch1/sequence.mlir
@@ -1,8 +1,7 @@
 // RUN: mlir-opt %s \
-// RUN:   --pass-pipeline="builtin.module(test-transform-dialect-interpreter{ \
-// RUN:        bind-first-extra-to-ops=linalg.matmul \
-// RUN:        bind-second-extra-to-ops=linalg.elemwise_binary \
-// RUN:        enable-expensive-checks},canonicalize,cse,symbol-dce)" |\
+// RUN:   --pass-pipeline="builtin.module(transform-interpreter{ \
+// RUN:        debug-bind-trailing-args=linalg.matmul,linalg.elemwise_binary},\
+// RUN:        canonicalize,cse,symbol-dce)" |\
 // RUN: FileCheck %s
 
 // ****************************** IMPORTANT NOTE ******************************
@@ -60,52 +59,54 @@ func.func private @microkernel(
     %init: tensor<4x4xf32>,
     %output: tensor<4x4xf32>) -> tensor<4x4xf32>
 
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  // Since the %arg2 handle is associated with both elementwise operations,
-  // we need to split it into two handles so we can target only the second
-  // elementwise operation.
-  %add, %max = transform.split_handle %arg2 : (!transform.op<"linalg.elemwise_binary">)
-      -> (!transform.any_op, !transform.any_op)
-
-  // The actual tiling transformation takes tile sizes as attributes. It produces a
-  // handle to the loop generated during tiling.
-  %tiled, %loop = transform.structured.tile_using_forall %max tile_sizes [8, 32]
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-
-  // We can now fuse the other operations into the loop. Here, we fuse
-  // operations one-by-one. This requires the operation that is being fused
-  // to define the value used within the loop, so the order of such fusions
-  // is important. We could also use "transform.merge_handles" to obtain
-  // a single handle to all operations and give it to `fuse_into_containing_op`
-  // that would take care of the ordering in this case.
-  %add_fused, %loop2 = transform.structured.fuse_into_containing_op %add into %loop
-      : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
-  %matmul_fused, %loop3 = transform.structured.fuse_into_containing_op %arg1 into %loop2
-      : (!transform.op<"linalg.matmul">, !transform.any_op) -> (!transform.any_op, !transform.any_op)
-
-  // Tile again to get the desired size. Note that this time this tiles the
-  // "add" operation and fuses matmul into the loop, but doesn't affect the
-  // "max" operation. This illustrates the precise targeting with the transform
-  // dialect. Otherwise, it is difficult to differentiate "add" and "max", both
-  // of which having the same kind.
-  %tiled_second, %loop_second = transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-  %matmul_fused_2, %loop_second_2 =
-      transform.structured.fuse_into_containing_op %matmul_fused into %loop_second
-      : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
-
-  // Since outlining is currently only implemented for region-holding operations
-  // such as loops, use tiling to size 1 to materialize the outer loop that is
-  // going to be outlined.
-  %_0, %loop_third = transform.structured.tile_using_forall %tiled_second tile_sizes [1]
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-  %_1, %outline_target = transform.structured.fuse_into_containing_op %matmul_fused_2 into %loop_third
-      : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
-  %func, %call = transform.loop.outline %outline_target {func_name = "outlined"}
-      : (!transform.any_op) -> (!transform.any_op, !transform.op<"func.call">)
-
-  transform.yield
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+      %arg0: !transform.any_op,
+      %arg1: !transform.op<"linalg.matmul">,
+      %arg2: !transform.op<"linalg.elemwise_binary">) {
+    // Since the %arg2 handle is associated with both elementwise operations,
+    // we need to split it into two handles so we can target only the second
+    // elementwise operation.
+    %add, %max = transform.split_handle %arg2 : (!transform.op<"linalg.elemwise_binary">)
+        -> (!transform.any_op, !transform.any_op)
+
+    // The actual tiling transformation takes tile sizes as attributes. It produces a
+    // handle to the loop generated during tiling.
+    %tiled, %loop = transform.structured.tile_using_forall %max tile_sizes [8, 32]
+        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+
+    // We can now fuse the other operations into the loop. Here, we fuse
+    // operations one-by-one. This requires the operation that is being fused
+    // to define the value used within the loop, so the order of such fusions
+    // is important. We could also use "transform.merge_handles" to obtain
+    // a single handle to all operations and give it to `fuse_into_containing_op`
+    // that would take care of the ordering in this case.
+    %add_fused, %loop2 = transform.structured.fuse_into_containing_op %add into %loop
+        : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
+    %matmul_fused, %loop3 = transform.structured.fuse_into_containing_op %arg1 into %loop2
+        : (!transform.op<"linalg.matmul">, !transform.any_op) -> (!transform.any_op, !transform.any_op)
+
+    // Tile again to get the desired size. Note that this time this tiles the
+    // "add" operation and fuses matmul into the loop, but doesn't affect the
+    // "max" operation. This illustrates the precise targeting with the transform
+    // dialect. Otherwise, it is difficult to differentiate "add" and "max", both
+    // of which have the same kind.
+    %tiled_second, %loop_second = transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
+        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+    %matmul_fused_2, %loop_second_2 =
+        transform.structured.fuse_into_containing_op %matmul_fused into %loop_second
+        : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
+
+    // Since outlining is currently only implemented for region-holding operations
+    // such as loops, use tiling to size 1 to materialize the outer loop that is
+    // going to be outlined.
+    %_0, %loop_third = transform.structured.tile_using_forall %tiled_second tile_sizes [1]
+        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+    %_1, %outline_target = transform.structured.fuse_into_containing_op %matmul_fused_2 into %loop_third
+        : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
+    %func, %call = transform.loop.outline %outline_target {func_name = "outlined"}
+        : (!transform.any_op) -> (!transform.any_op, !transform.op<"func.call">)
+
+    transform.yield
+  }
 }
diff --git a/mlir/test/Examples/transform/Ch2/invalid.mlir b/mlir/test/Examples/transform/Ch2/invalid.mlir
index ad536832d9c52d..cb6738974fe272 100644
--- a/mlir/test/Examples/transform/Ch2/invalid.mlir
+++ b/mlir/test/Examples/transform/Ch2/invalid.mlir
@@ -1,11 +1,11 @@
-// RUN: transform-opt-ch2 %s --test-transform-dialect-interpreter --split-input-file --verify-diagnostics
+// RUN: transform-opt-ch2 %s --transform-interpreter --split-input-file \
+// RUN:                      --verify-diagnostics
 
 // expected-note @below {{offending payload}}
-module {
-  transform.sequence failures(propagate) {
-  ^bb0(%arg0: !transform.any_op):
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(%arg0: !transform.any_op) {
     // expected-error @below {{only applies to func.call payloads}}
     transform.my.change_call_target %arg0, "updated" : !transform.any_op
-    yield
+    transform.yield
   }
 }
diff --git a/mlir/test/Examples/transform/Ch2/ops.mlir b/mlir/test/Examples/transform/Ch2/ops.mlir
index d66f89b9ec8dd5..410a6e39a480c0 100644
--- a/mlir/test/Examples/transform/Ch2/ops.mlir
+++ b/mlir/test/Examples/transform/Ch2/ops.mlir
@@ -1,4 +1,4 @@
-// RUN: transform-opt-ch2 %s --test-transform-dialect-interpreter | FileCheck %s
+// RUN: transform-opt-ch2 %s --transform-interpreter | FileCheck %s
 
 // ****************************** IMPORTANT NOTE ******************************
 //
@@ -17,10 +17,11 @@ func.func @test() {
   return
 }
 
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op):
-  %call = transform.structured.match ops{["func.call"]} in %arg0 : (!transform.any_op) -> !transform.any_op
-  // CHECK: transform.my.change_call_target %{{.*}}, "updated" : !transform.any_op
-  transform.my.change_call_target %call, "updated" : !transform.any_op
-  transform.yield
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(%arg0: !transform.any_op) {
+    %call = transform.structured.match ops{["func.call"]} in %arg0 : (!transform.any_op) -> !transform.any_op
+    // CHECK: transform.my.change_call_target %{{.*}}, "updated" : !transform.any_op
+    transform.my.change_call_target %call, "updated" : !transform.any_op
+    transform.yield
+  }
 }
diff --git a/mlir/test/Examples/transform/Ch2/sequence.mlir b/mlir/test/Examples/transform/Ch2/sequence.mlir
index b6f32dc321efb0..976df1d55503a0 100644
--- a/mlir/test/Examples/transform/Ch2/sequence.mlir
+++ b/mlir/test/Examples/transform/Ch2/sequence.mlir
@@ -1,8 +1,7 @@
 // RUN: transform-opt-ch2 %s \
-// RUN:   --pass-pipeline="builtin.module(test-transform-dialect-interpreter{ \
-// RUN:        bind-first-extra-to-ops=linalg.matmul \
-// RUN:        bind-second-extra-to-ops=linalg.elemwise_binary \
-// RUN:        enable-expensive-checks},canonicalize,cse,symbol-dce)" |\
+// RUN:   --pass-pipeline="builtin.module(transform-interpreter{ \
+// RUN:        debug-bind-trailing-args=linalg.matmul,linalg.elemwise_binary},\
+// RUN:        canonicalize,cse,symbol-dce)" |\
 // RUN: FileCheck %s
 
 // ****************************** IMPORTANT NOTE ******************************
@@ -56,55 +55,57 @@ func.func private @microkernel(
     %init: tensor<4x4xf32>,
     %output: tensor<4x4xf32>) -> tensor<4x4xf32>
 
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  // Since the %arg2 handle is associated with both elementwise operations,
-  // we need to split it into two handles so we can target only the second
-  // elementwise operation.
-  %add, %max = transform.split_handle %arg2 : (!transform.op<"linalg.elemwise_binary">)
-      -> (!transform.any_op, !transform.any_op)
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+      %arg0: !transform.any_op,
+      %arg1: !transform.op<"linalg.matmul">,
+      %arg2: !transform.op<"linalg.elemwise_binary">) {
+    // Since the %arg2 handle is associated with both elementwise operations,
+    // we need to split it into two handles so we can target only the second
+    // elementwise operation.
+    %add, %max = transform.split_handle %arg2 : (!transform.op<"linalg.elemwise_binary">)
+        -> (!transform.any_op, !transform.any_op)
 
-  // The actual tiling transformation takes tile sizes as attributes. It produces a
-  // handle to the loop generated during tiling.
-  %tiled, %loop = transform.structured.tile_using_forall %max tile_sizes [8, 32]
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+    // The actual tiling transformation takes tile sizes as attributes. It produces a
+    // handle to the loop generated during tiling.
+    %tiled, %loop = transform.structured.tile_using_forall %max tile_sizes [8, 32]
+        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
 
-  // We can now fuse the other operations into the loop. Here, we fuse
-  // operations one-by-one. This requires the operation that is being fused
-  // to define the value used within the loop, so the order of such fusions
-  // is important. We could also use "transform.merge_handles" to obtain
-  // a single handle to all operations and give it to `fuse_into_containing_op`
-  // that would take care of the ordering in this case.
-  %add_fused, %loop2 = transform.structured.fuse_into_containing_op %add into %loop
-      : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
-  %matmul_fused, %loop3 = transform.structured.fuse_into_containing_op %arg1 into %loop2
-      : (!transform.op<"linalg.matmul">, !transform.any_op) -> (!transform.any_op, !transform.any_op)
+    // We can now fuse the other operations into the loop. Here, we fuse
+    // operations one-by-one. This requires the operation that is being fused
+    // to define the value used within the loop, so the order of such fusions
+    // is important. We could also use "transform.merge_handles" to obtain
+    // a single handle to all operations and give it to `fuse_into_containing_op`
+    // that would take care of the ordering in this case.
+    %add_fused, %loop2 = transform.structured.fuse_into_containing_op %add into %loop
+        : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
+    %matmul_fused, %loop3 = transform.structured.fuse_into_containing_op %arg1 into %loop2
+        : (!transform.op<"linalg.matmul">, !transform.any_op) -> (!transform.any_op, !transform.any_op)
 
-  // Tile again to get the desired size. Note that this time this tiles the
-  // "add" operation and fuses matmul into the loop, but doesn't affect the
-  // "max" operation. This illustrates the precise targeting with the transform
-  // dialect. Otherwise, it is difficult to differentiate "add" and "max", both
-  // of which having the same kind.
-  %tiled_second, %loop_second = transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-  %matmul_fused_2, %loop_second_2 =
-      transform.structured.fuse_into_containing_op %matmul_fused into %loop_second
-      : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
+    // Tile again to get the desired size. Note that this time this tiles the
+    // "add" operation and fuses matmul into the loop, but doesn't affect the
+    // "max" operation. This illustrates the precise targeting with the transform
+    // dialect. Otherwise, it is difficult to differentiate "add" and "max", both
+    // of which have the same kind.
+    %tiled_second, %loop_second = transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
+        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+    %matmul_fused_2, %loop_second_2 =
+        transform.structured.fuse_into_containing_op %matmul_fused into %loop_second
+        : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
 
-  // Since outlining is currently only implemented for region-holding operations
-  // such as loops, use tiling to size 1 to materialize the outer loop that is
-  // going to be outlined.
-  %_0, %loop_third = transform.structured.tile_using_forall %tiled_second tile_sizes [1]
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-  %_1, %outline_target = transform.structured.fuse_into_containing_op %matmul_fused_2 into %loop_third
-      : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
-  %func, %call = transform.loop.outline %outline_target {func_name = "outlined"}
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+    // Since outlining is currently only implemented for region-holding operations
+    // such as loops, use tiling to size 1 to materialize the outer loop that is
+    // going to be outlined.
+    %_0, %loop_third = transform.structured.tile_using_forall %tiled_second tile_sizes [1]
+        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+    %_1, %outline_target = transform.structured.fuse_into_containing_op %matmul_fused_2 into %loop_third
+        : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
+    %func, %call = transform.loop.outline %outline_target {func_name = "outlined"}
+        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
 
-  // Rewrite the call target.
-  transform.my.change_call_target %call, "microkernel" : !transform.any_op
+    // Rewrite the call target.
+    transform.my.change_call_target %call, "microkernel" : !transform.any_op
 
-  transform.yield
+    transform.yield
+  }
 }
diff --git a/mlir/test/Examples/transform/Ch3/invalid.mlir b/mlir/test/Examples/transform/Ch3/invalid.mlir
index 222629504fea66..acaabd5db30e40 100644
--- a/mlir/test/Examples/transform/Ch3/invalid.mlir
+++ b/mlir/test/Examples/transform/Ch3/invalid.mlir
@@ -1,10 +1,10 @@
-// RUN: transform-opt-ch3 %s --test-transform-dialect-interpreter --split-input-file --verify-diagnostics
+// RUN: transform-opt-ch3 %s --transform-interpreter --split-input-file --verify-diagnostics
 
 // expected-note @below {{offending operation}}
-module {
-  transform.sequence failures(suppress) {
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
   // expected-error @below {{expected the payload operation to implement CallOpInterface}}
-  ^bb0(%arg0: !transform.my.call_op_interface):
-    yield
+  %arg0: !transform.my.call_op_interface) {
+    transform.yield
   }
 }
diff --git a/mlir/test/Examples/transform/Ch3/ops.mlir b/mlir/test/Examples/transform/Ch3/ops.mlir
index f4170b8918bfe1..b2d47cc369a58b 100644
--- a/mlir/test/Examples/transform/Ch3/ops.mlir
+++ b/mlir/test/Examples/transform/Ch3/ops.mlir
@@ -1,4 +1,4 @@
-// RUN: transform-opt-ch3 %s --test-transform-dialect-interpreter \
+// RUN: transform-opt-ch3 %s --transform-interpreter \
 // RUN:   --allow-unregistered-dialect --split-input-file | FileCheck %s
 
 // ****************************** IMPORTANT NOTE ******************************
@@ -18,12 +18,13 @@ func.func @test1() {
   return
 }
 
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op):
-  %call = transform.structured.match ops{["func.call"]} in %arg0 : (!transform.any_op) -> !transform.op<"func.call">
-  // CHECK: transform.my.change_call_target %{{.*}}, "updated" : !transform.op<"func.call">
-  transform.my.change_call_target %call, "updated" : !transform.op<"func.call">
-  transform.yield
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(%arg0: !transform.any_op) {
+    %call = transform.structured.match ops{["func.call"]} in %arg0 : (!transform.any_op) -> !transform.op<"func.call">
+    // CHECK: transform.my.change_call_target %{{.*}}, "updated" : !transform.op<"func.call">
+    transform.my.change_call_target %call, "updated" : !transform.op<"func.call">
+    transform.yield
+  }
 }
 
 // -----
@@ -37,10 +38,11 @@ func.func @test2() {
   return
 }
 
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op):
-  %call = transform.structured.match ops{["func.call"]} in %arg0 : (!transform.any_op) -> !transform.my.call_op_interface
-  // CHECK: transform.my.call_to_op %{{.*}} : (!transform.my.call_op_interface) -> !transform.any_op
-  transform.my.call_to_op %call : (!transform.my.call_op_interface) -> !transform.any_op
-  transform.yield
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(%arg0: !transform.any_op) {
+    %call = transform.structured.match ops{["func.call"]} in %arg0 : (!transform.any_op) -> !transform.my.call_op_interface
+    // CHECK: transform.my.call_to_op %{{.*}} : (!transform.my.call_op_interface) -> !transform.any_op
+    transform.my.call_to_op %call : (!transform.my.call_op_interface) -> !transform.any_op
+    transform.yield
+  }
 }
diff --git a/mlir/test/Examples/transform/Ch3/sequence.mlir b/mlir/test/Examples/transform/Ch3/sequence.mlir
index 9dd46b347c5b81..8dc33c3560c26c 100644
--- a/mlir/test/Examples/transform/Ch3/sequence.mlir
+++ b/mlir/test/Examples/transform/Ch3/sequence.mlir
@@ -1,8 +1,7 @@
-// RUN: transform-opt-ch2 %s \
-// RUN:   --pass-pipeline="builtin.module(test-transform-dialect-interpreter{ \
-// RUN:        bind-first-extra-to-ops=linalg.matmul \
-// RUN:        bind-second-extra-to-ops=linalg.elemwise_binary \
-// RUN:        enable-expensive-checks},canonicalize,cse,symbol-dce)" |\
+// RUN: transform-opt-ch3 %s \
+// RUN:   --pass-pipeline="builtin.module(transform-interpreter{ \
+// RUN:        debug-bind-trailing-args=linalg.matmul,linalg.elemwise_binary},\
+// RUN:        canonicalize,cse,symbol-dce)" |\
 // RUN: FileCheck %s
 
 // ****************************** IMPORTANT NOTE ******************************
@@ -56,55 +55,57 @@ func.func private @microkernel(
     %init: tensor<4x4xf32>,
     %output: tensor<4x4xf32>) -> tensor<4x4xf32>
 
-transform.sequence failures(propagate) {
-^bb0(%arg0: !transform.any_op,
-     %arg1: !transform.op<"linalg.matmul">,
-     %arg2: !transform.op<"linalg.elemwise_binary">):
-  // Since the %arg2 handle is associated with both elementwise operations,
-  // we need to split it into two handles so we can target only the second
-  // elementwise operation.
-  %add, %max = transform.split_handle %arg2 : (!transform.op<"linalg.elemwise_binary">)
-      -> (!transform.any_op, !transform.any_op)
-
-  // The actual tiling transformation takes tile sizes as attributes. It produces a
-  // handle to the loop generated during tiling.
-  %tiled, %loop = transform.structured.tile_using_forall %max tile_sizes [8, 32]
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-
-  // We can now fuse the other operations into the loop. Here, we fuse
-  // operations one-by-one. This requires the operation that is being fused
-  // to define the value used within the loop, so the order of such fusions
-  // is important. We could also use "transform.merge_handles" to obtain
-  // a single handle to all operations and give it to `fuse_into_containing_op`
-  // that would take care of the ordering in this case.
-  %add_fused, %loop2 = transform.structured.fuse_into_containing_op %add into %loop
-      : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
-  %matmul_fused, %loop3 = transform.structured.fuse_into_containing_op %arg1 into %loop2
-      : (!transform.op<"linalg.matmul">, !transform.any_op) -> (!transform.any_op, !transform.any_op)
-
-  // Tile again to get the desired size. Note that this time this tiles the
-  // "add" operation and fuses matmul into the loop, but doesn't affect the
-  // "max" operation. This illustrates the precise targeting with the transform
-  // dialect. Otherwise, it is difficult to differentiate "add" and "max", both
-  // of which having the same kind.
-  %tiled_second, %loop_second = transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-  %matmul_fused_2, %loop_second_2 =
-      transform.structured.fuse_into_containing_op %matmul_fused into %loop_second
-      : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
-
-  // Since outlining is currently only implemented for region-holding operations
-  // such as loops, use tiling to size 1 to materialize the outer loop that is
-  // going to be outlined.
-  %_0, %loop_third = transform.structured.tile_using_forall %tiled_second tile_sizes [1]
-      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
-  %_1, %outline_target = transform.structured.fuse_into_containing_op %matmul_fused_2 into %loop_third
-      : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
-  %func, %call = transform.loop.outline %outline_target {func_name = "outlined"}
-      : (!transform.any_op) -> (!transform.any_op, !transform.op<"func.call">)
-
-  // Rewrite the call target.
-  transform.my.change_call_target %call, "microkernel" : !transform.op<"func.call">
-
-  transform.yield
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(
+       %arg0: !transform.any_op,
+       %arg1: !transform.op<"linalg.matmul">,
+       %arg2: !transform.op<"linalg.elemwise_binary">) {
+    // Since the %arg2 handle is associated with both elementwise operations,
+    // we need to split it into two handles so we can target only the second
+    // elementwise operation.
+    %add, %max = transform.split_handle %arg2 : (!transform.op<"linalg.elemwise_binary">)
+        -> (!transform.any_op, !transform.any_op)
+  
+    // The actual tiling transformation takes tile sizes as attributes. It produces a
+    // handle to the loop generated during tiling.
+    %tiled, %loop = transform.structured.tile_using_forall %max tile_sizes [8, 32]
+        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+  
+    // We can now fuse the other operations into the loop. Here, we fuse
+    // operations one-by-one. This requires the operation that is being fused
+    // to define the value used within the loop, so the order of such fusions
+    // is important. We could also use "transform.merge_handles" to obtain
+    // a single handle to all operations and give it to `fuse_into_containing_op`
+    // that would take care of the ordering in this case.
+    %add_fused, %loop2 = transform.structured.fuse_into_containing_op %add into %loop
+        : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
+    %matmul_fused, %loop3 = transform.structured.fuse_into_containing_op %arg1 into %loop2
+        : (!transform.op<"linalg.matmul">, !transform.any_op) -> (!transform.any_op, !transform.any_op)
+  
+    // Tile again to get the desired size. Note that this time we tile the
+    // "add" operation and fuse matmul into the loop, but don't affect the
+    // "max" operation. This illustrates the precise targeting with the transform
+    // dialect. Otherwise, it is difficult to differentiate "add" and "max", both
+    // of which have the same kind.
+    %tiled_second, %loop_second = transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
+        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+    %matmul_fused_2, %loop_second_2 =
+        transform.structured.fuse_into_containing_op %matmul_fused into %loop_second
+        : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
+  
+    // Since outlining is currently only implemented for region-holding operations
+    // such as loops, use tiling to size 1 to materialize the outer loop that is
+    // going to be outlined.
+    %_0, %loop_third = transform.structured.tile_using_forall %tiled_second tile_sizes [1]
+        : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+    %_1, %outline_target = transform.structured.fuse_into_containing_op %matmul_fused_2 into %loop_third
+        : (!transform.any_op, !transform.any_op) -> (!transform.any_op, !transform.any_op)
+    %func, %call = transform.loop.outline %outline_target {func_name = "outlined"}
+        : (!transform.any_op) -> (!transform.any_op, !transform.op<"func.call">)
+  
+    // Rewrite the call target.
+    transform.my.change_call_target %call, "microkernel" : !transform.op<"func.call">
+  
+    transform.yield
+  }
 }
diff --git a/mlir/test/Examples/transform/ChH/full.mlir b/mlir/test/Examples/transform/ChH/full.mlir
index d90d740b445312..f8d910370bc277 100644
--- a/mlir/test/Examples/transform/ChH/full.mlir
+++ b/mlir/test/Examples/transform/ChH/full.mlir
@@ -1,4 +1,4 @@
-// RUN: mlir-opt %s --test-transform-dialect-interpreter \
+// RUN: mlir-opt %s --transform-interpreter \
 // RUN:             --test-transform-dialect-erase-schedule \
 // RUN:             --math-uplift-to-fma \
 // RUN:             --convert-bufferization-to-memref \
@@ -115,9 +115,9 @@ module attributes { transform.with_named_sequence } {
   // have no effect on the Halide IR as of 294f80c49bf3bb8582446613c25fcce03b82.
   // Also note that the order of dimensions in Halide is inverted, e.g., co and
   // n are the outermost loops in the respective reorder directives.
-  transform.sequence failures(propagate) {
+  transform.named_sequence @__transform_main(
   // This argument will point to the top-level module.
-  ^bb0(%arg0: !transform.any_op):
+      %arg0: !transform.any_op) {
 
     // 1. Find the operations we are going to transform using their names. This
     // is a simplistic approach that works when there are few operations in the
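For reference, the common shape of the change applied across these tests can be sketched as follows. This is a minimal illustration distilled from the diff above, not a complete test: the `transform-interpreter` pass looks up the `@__transform_main` named sequence inside a module carrying the `transform.with_named_sequence` attribute, and the `debug-bind-trailing-args` pass option binds handles to all payload ops with the given names to the trailing sequence arguments.

```mlir
// Run with (assuming a tool that registers the transform interpreter pass):
//   mlir-opt input.mlir --pass-pipeline="builtin.module(transform-interpreter{ \
//       debug-bind-trailing-args=linalg.matmul,linalg.elemwise_binary})"
module attributes {transform.with_named_sequence} {
  // %arg0 is bound to the payload root (the top-level module); %arg1 and
  // %arg2 receive all linalg.matmul and linalg.elemwise_binary payload ops,
  // respectively, per the debug-bind-trailing-args option.
  transform.named_sequence @__transform_main(
      %arg0: !transform.any_op,
      %arg1: !transform.op<"linalg.matmul">,
      %arg2: !transform.op<"linalg.elemwise_binary">) {
    transform.yield
  }
}
```

This replaces the previous pattern of a top-level `transform.sequence failures(propagate)` with a `^bb0` block, which was only picked up by the test-only `test-transform-dialect-interpreter` pass.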