[Mlir-commits] [mlir] [mlir][docs] Add more examples for the "canonical form" (PR #173667)

Matthias Springer llvmlistbot at llvm.org
Sun Dec 28 09:54:18 PST 2025


https://github.com/matthias-springer updated https://github.com/llvm/llvm-project/pull/173667

From e797616ef9e3e867e3f41b17ea38489f45a7db6b Mon Sep 17 00:00:00 2001
From: Matthias Springer <me at m-sp.org>
Date: Fri, 26 Dec 2025 14:33:38 +0000
Subject: [PATCH 1/3] [mlir][docs] Add more examples for the "canonical form"

---
 mlir/docs/Canonicalization.md | 54 +++++++++++++++++++----------------
 1 file changed, 30 insertions(+), 24 deletions(-)

diff --git a/mlir/docs/Canonicalization.md b/mlir/docs/Canonicalization.md
index 2622c08e535fe..29fbc02a478cd 100644
--- a/mlir/docs/Canonicalization.md
+++ b/mlir/docs/Canonicalization.md
@@ -63,30 +63,36 @@ Some important things to think about w.r.t. canonicalization patterns:
 *   Canonicalize shouldn't lose the semantic of original operation: the original
     information should always be recoverable from the transformed IR.
 
-For example, a pattern that transform
-
-```
-  %transpose = linalg.transpose
-      ins(%input : tensor<1x2x3xf32>)
-      outs(%init1 : tensor<2x1x3xf32>)
-      dimensions = [1, 0, 2]
-  %out = linalg.transpose
-      ins(%transpose: tensor<2x1x3xf32>)
-      outs(%init2 : tensor<3x1x2xf32>)
-      permutation = [2, 1, 0]
-```
-
-to
-
-```
-  %out= linalg.transpose
-      ins(%input : tensor<1x2x3xf32>)
-      outs(%init2: tensor<3x1x2xf32>)
-      permutation = [2, 0, 1]
-```
-
-is a good canonicalization pattern because it removes a redundant operation,
-making other analysis optimizations and more efficient.
+## What is a Canonical Form?
+
+There is no formally defined canonical form in MLIR. The de-facto canonical
+form keeps evolving, as canonicalization patterns and folders are getting
+added / removed / modified by the community.
+
+The canonicalizer pass is integral to many downstream projects but offers no
+fine-grained control over individual patterns or foldings, making changes to
+the canonical form potentially contentious. Whether a transformation belongs
+in the canonical form must be decided on a case-by-case basis, but common
+community-agreed canonicalizations include:
+
+* Identity / no-op elimination. E.g., folding `arith.addi %x, %c0` to `%x` or
+  erasing `memref.copy %x, %x`.
+* Constant folding. E.g., folding `arith.addi %c1, %c2` to `%c3`.
+* Folding inverse ops. E.g., folding `arith.xori(arith.xori(%x, %a), %a)` to
+  `%x`.
+* Unused/redundant value elimination. E.g., removing unused loop-carried
+  variables of an `scf.for` op or removing redundant `scf.if` results (when
+  both branches yield the same value).
+* Trivial control flow simplifications. E.g., inlining the "then" body of an
+  `scf.if %true` op and erasing the `scf.if` op.
+* Folding chained metadata / shape ops of the same type. E.g., replacing
+  `linalg.transpose(linalg.transpose(%x))` with a single `linalg.transpose(%x)`.
+* Dynamic to static type refinement such as folding constant sizes into
+  shaped types. E.g., rewriting `%v = tensor.empty(%c5) : tensor<?xf32>` as
+  `%0 = tensor.empty() : tensor<5xf32>` and
+  `%v = tensor.cast %0 : tensor<5xf32> to tensor<?xf32>`.
+* Cast propagation / folding. E.g., pushing casts through operations or folding
+  them away if doing so introduces more static type information.
 
 ## Globally Applied Rules
 

From 1f54d2653c61b6d7382a893c90d686f77513651a Mon Sep 17 00:00:00 2001
From: Matthias Springer <me at m-sp.org>
Date: Sat, 27 Dec 2025 08:34:35 +0000
Subject: [PATCH 2/3] address comments: link to Linalg forms

---
 mlir/docs/Canonicalization.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mlir/docs/Canonicalization.md b/mlir/docs/Canonicalization.md
index 29fbc02a478cd..b571ae7d1a92e 100644
--- a/mlir/docs/Canonicalization.md
+++ b/mlir/docs/Canonicalization.md
@@ -63,11 +63,12 @@ Some important things to think about w.r.t. canonicalization patterns:
 *   Canonicalize shouldn't lose the semantic of original operation: the original
     information should always be recoverable from the transformed IR.
 
-## What is a Canonical Form?
+## What is the Canonical Form?
 
-There is no formally defined canonical form in MLIR. The de-facto canonical
-form keeps evolving, as canonicalization patterns and folders are getting
-added / removed / modified by the community.
+There is no single formally defined canonical form in MLIR. Some dialects
+define multiple forms, depending on the transformation ([example](https://mlir.llvm.org/docs/Rationale/RationaleLinalgDialect/#interchangeability-of-formsa-nameformsa)).
+The de-facto canonical form keeps evolving, as canonicalization patterns and
+folders are getting added / removed / modified by the community.
 
 The canonicalizer pass is integral to many downstream projects but offers no
 fine-grained control over individual patterns or foldings, making changes to

From b60faa3181efa50db1960c75867f76aed62b00d5 Mon Sep 17 00:00:00 2001
From: Matthias Springer <me at m-sp.org>
Date: Sun, 28 Dec 2025 18:54:10 +0100
Subject: [PATCH 3/3] Apply suggestions from code review

Co-authored-by: Mehdi Amini <joker.eph at gmail.com>
---
 mlir/docs/Canonicalization.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mlir/docs/Canonicalization.md b/mlir/docs/Canonicalization.md
index b571ae7d1a92e..20eb4a08bb684 100644
--- a/mlir/docs/Canonicalization.md
+++ b/mlir/docs/Canonicalization.md
@@ -65,8 +65,7 @@ Some important things to think about w.r.t. canonicalization patterns:
 
 ## What is the Canonical Form?
 
-There is no single formally defined canonical form in MLIR. Some dialects
-define multiple forms, depending on the transformation ([example](https://mlir.llvm.org/docs/Rationale/RationaleLinalgDialect/#interchangeability-of-formsa-nameformsa)).
+There is no formally defined canonical form in MLIR.
 The de-facto canonical form keeps evolving, as canonicalization patterns and
 folders are getting added / removed / modified by the community.
 
@@ -78,7 +77,7 @@ community-agreed canonicalizations include:
 
 * Identity / no-op elimination. E.g., folding `arith.addi %x, %c0` to `%x` or
   erasing `memref.copy %x, %x`.
-* Constant folding. E.g., folding `arith.addi %c1, %c2` to `%c3`.
+* Scalar constant folding. E.g., folding `arith.addi %c1, %c2` to `%c3`. Note: this does not extend to "large" tensor constants, where constant folding can lead to an IR-size explosion.
 * Folding inverse ops. E.g., folding `arith.xori(arith.xori(%x, %a), %a)` to
   `%x`.
 * Unused/redundant value elimination. E.g., removing unused loop-carried


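For readers skimming the patch, the first three bullets of the new list (identity elimination, scalar constant folding, folding inverse ops) can be illustrated with a toy folder. This is only a sketch under stated assumptions: it is plain Python over a made-up expression tree, not MLIR's folding API, and the names `Const`, `Var`, `Op`, and `fold` are hypothetical. It also ignores integer bit widths and overflow, which real `arith` folders have to respect.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Op:
    name: str  # "addi" or "xori" in this toy model
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Const, Var, Op]


def fold(e: Expr) -> Expr:
    """Apply three toy folds bottom-up; rebuild the op if none applies."""
    if not isinstance(e, Op):
        return e
    lhs, rhs = fold(e.lhs), fold(e.rhs)
    # Scalar constant folding: addi(c1, c2) -> c3
    if e.name == "addi" and isinstance(lhs, Const) and isinstance(rhs, Const):
        return Const(lhs.value + rhs.value)
    # Identity / no-op elimination: addi(x, 0) -> x
    if e.name == "addi" and rhs == Const(0):
        return lhs
    # Folding inverse ops: xori(xori(x, a), a) -> x
    if (e.name == "xori" and isinstance(lhs, Op)
            and lhs.name == "xori" and lhs.rhs == rhs):
        return lhs.lhs
    return Op(e.name, lhs, rhs)


x, a = Var("x"), Var("a")
print(fold(Op("addi", x, Const(0))))
print(fold(Op("addi", Const(1), Const(2))))
print(fold(Op("xori", Op("xori", x, a), a)))
```

Each rule rewrites toward a smaller expression with the same value, which is the property the listed canonicalizations share. The sketch only matches the constant or repeated operand on the right-hand side; handling the commuted forms is left out to keep it short.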
