[Mlir-commits] [mlir] [mlir][docs] Add more examples for the "canonical form" (PR #173667)
Matthias Springer
llvmlistbot at llvm.org
Mon Dec 29 08:11:08 PST 2025
https://github.com/matthias-springer updated https://github.com/llvm/llvm-project/pull/173667
From e797616ef9e3e867e3f41b17ea38489f45a7db6b Mon Sep 17 00:00:00 2001
From: Matthias Springer <me at m-sp.org>
Date: Fri, 26 Dec 2025 14:33:38 +0000
Subject: [PATCH 1/5] [mlir][docs] Add more examples for the "canonical form"
---
mlir/docs/Canonicalization.md | 54 +++++++++++++++++++----------------
1 file changed, 30 insertions(+), 24 deletions(-)
diff --git a/mlir/docs/Canonicalization.md b/mlir/docs/Canonicalization.md
index 2622c08e535fe..29fbc02a478cd 100644
--- a/mlir/docs/Canonicalization.md
+++ b/mlir/docs/Canonicalization.md
@@ -63,30 +63,36 @@ Some important things to think about w.r.t. canonicalization patterns:
* Canonicalize shouldn't lose the semantic of original operation: the original
information should always be recoverable from the transformed IR.
-For example, a pattern that transform
-
-```
- %transpose = linalg.transpose
- ins(%input : tensor<1x2x3xf32>)
- outs(%init1 : tensor<2x1x3xf32>)
- dimensions = [1, 0, 2]
- %out = linalg.transpose
- ins(%transpose: tensor<2x1x3xf32>)
- outs(%init2 : tensor<3x1x2xf32>)
- permutation = [2, 1, 0]
-```
-
-to
-
-```
- %out= linalg.transpose
- ins(%input : tensor<1x2x3xf32>)
- outs(%init2: tensor<3x1x2xf32>)
- permutation = [2, 0, 1]
-```
-
-is a good canonicalization pattern because it removes a redundant operation,
-making other analysis optimizations and more efficient.
+## What is a Canonical Form?
+
+There is no formally defined canonical form in MLIR. The de-facto canonical
+form keeps evolving, as canonicalization patterns and folders are getting
+added / removed / modified by the community.
+
+The canonicalizer pass is integral to many downstream projects but offers no
+fine-grained control over individual patterns or foldings, making changes to
+the canonical form potentially contentious. Whether a transformation belongs
+in the canonical form must be decided on a case-by-case basis, but common
+community-agreed canonicalizations include:
+
+* Identity / no-op elimination. E.g., folding `arith.addi %x, %c0` to `%x` or
+ erasing `memref.copy %x, %x`.
+* Constant folding. E.g., folding `arith.addi %c1, %c2` to `%c3`.
+* Folding inverse ops. E.g., folding `arith.xori(arith.xori(%x, %a), %a)` to
+ `%x`.
+* Unused/redundant value elimination. E.g., removing unused loop-carried
+ variables of an `scf.for` op or removing redundant `scf.if` results (when
+ both branches yield the same value).
+* Trivial control flow simplifications. E.g., inlining the "then" body of an
+ `scf.if %true` op and erasing the `scf.if` op.
+* Folding chained metadata / shape ops of the same type. E.g., replacing
+ `linalg.transpose(linalg.transpose(%x))` with a single `linalg.transpose(%x)`.
+* Dynamic to static type refinement such as folding constant sizes into
+ shaped types. E.g., rewriting `%v = tensor.empty(%c5) : tensor<?xf32>` as
+ `%0 = tensor.empty() : tensor<5xf32>` and
+ `%v = tensor.cast %0 : tensor<5xf32> to tensor<?xf32>`.
+* Cast propagation / folding. E.g., pushing casts through operations or folding
+ them away if it introduces more static type information.
## Globally Applied Rules
From 1f54d2653c61b6d7382a893c90d686f77513651a Mon Sep 17 00:00:00 2001
From: Matthias Springer <me at m-sp.org>
Date: Sat, 27 Dec 2025 08:34:35 +0000
Subject: [PATCH 2/5] address comments: link to Linalg forms
---
mlir/docs/Canonicalization.md | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/mlir/docs/Canonicalization.md b/mlir/docs/Canonicalization.md
index 29fbc02a478cd..b571ae7d1a92e 100644
--- a/mlir/docs/Canonicalization.md
+++ b/mlir/docs/Canonicalization.md
@@ -63,11 +63,12 @@ Some important things to think about w.r.t. canonicalization patterns:
* Canonicalize shouldn't lose the semantic of original operation: the original
information should always be recoverable from the transformed IR.
-## What is a Canonical Form?
+## What is the Canonical Form?
-There is no formally defined canonical form in MLIR. The de-facto canonical
-form keeps evolving, as canonicalization patterns and folders are getting
-added / removed / modified by the community.
+There is no single formally defined canonical form in MLIR. Some dialects
+define multiple forms, depending on the transformation ([example](https://mlir.llvm.org/docs/Rationale/RationaleLinalgDialect/#interchangeability-of-formsa-nameformsa)).
+The de-facto canonical form keeps evolving, as canonicalization patterns and
+folders are getting added / removed / modified by the community.
The canonicalizer pass is integral to many downstream projects but offers no
fine-grained control over individual patterns or foldings, making changes to
From b60faa3181efa50db1960c75867f76aed62b00d5 Mon Sep 17 00:00:00 2001
From: Matthias Springer <me at m-sp.org>
Date: Sun, 28 Dec 2025 18:54:10 +0100
Subject: [PATCH 3/5] Apply suggestions from code review
Co-authored-by: Mehdi Amini <joker.eph at gmail.com>
---
mlir/docs/Canonicalization.md | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/mlir/docs/Canonicalization.md b/mlir/docs/Canonicalization.md
index b571ae7d1a92e..20eb4a08bb684 100644
--- a/mlir/docs/Canonicalization.md
+++ b/mlir/docs/Canonicalization.md
@@ -65,8 +65,7 @@ Some important things to think about w.r.t. canonicalization patterns:
## What is the Canonical Form?
-There is no single formally defined canonical form in MLIR. Some dialects
-define multiple forms, depending on the transformation ([example](https://mlir.llvm.org/docs/Rationale/RationaleLinalgDialect/#interchangeability-of-formsa-nameformsa)).
+There is no formally defined canonical form in MLIR.
The de-facto canonical form keeps evolving, as canonicalization patterns and
folders are getting added / removed / modified by the community.
@@ -78,7 +77,7 @@ community-agreed canonicalizations include:
* Identity / no-op elimination. E.g., folding `arith.addi %x, %c0` to `%x` or
erasing `memref.copy %x, %x`.
-* Constant folding. E.g., folding `arith.addi %c1, %c2` to `%c3`.
+* Scalar constant folding. E.g., folding `arith.addi %c1, %c2` to `%c3`. Note: this isn't true for "large" tensors where constant folding can lead to an IR-size explosion.
* Folding inverse ops. E.g., folding `arith.xori(arith.xori(%x, %a), %a)` to
`%x`.
* Unused/redundant value elimination. E.g., removing unused loop-carried
From 86a8c15f163e7bbdfdc22a58bd9a06e88ad39a7f Mon Sep 17 00:00:00 2001
From: Matthias Springer <me at m-sp.org>
Date: Sun, 28 Dec 2025 18:03:05 +0000
Subject: [PATCH 4/5] address comments
---
mlir/docs/Canonicalization.md | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)
diff --git a/mlir/docs/Canonicalization.md b/mlir/docs/Canonicalization.md
index 20eb4a08bb684..734ae6ac5993f 100644
--- a/mlir/docs/Canonicalization.md
+++ b/mlir/docs/Canonicalization.md
@@ -65,19 +65,21 @@ Some important things to think about w.r.t. canonicalization patterns:
## What is the Canonical Form?
-There is no formally defined canonical form in MLIR.
-The de-facto canonical form keeps evolving, as canonicalization patterns and
-folders are getting added / removed / modified by the community.
+There is no formally defined canonical form in MLIR. The de-facto canonical
+form keeps evolving, as canonicalization patterns and folders are getting
+added / removed / modified by the community.
-The canonicalizer pass is integral to many downstream projects but offers no
-fine-grained control over individual patterns or foldings, making changes to
-the canonical form potentially contentious. Whether a transformation belongs
-in the canonical form must be decided on a case-by-case basis, but common
-community-agreed canonicalizations include:
+The canonicalizer pass is used in many projects but does not offer fine-grained
+control over individual patterns or foldings, making changes to the canonical
+form potentially contentious. Whether a transformation belongs in the canonical
+form must be decided on a case-by-case basis, but common community-agreed
+canonicalizations include:
* Identity / no-op elimination. E.g., folding `arith.addi %x, %c0` to `%x` or
erasing `memref.copy %x, %x`.
-* Scalar constant folding. E.g., folding `arith.addi %c1, %c2` to `%c3`. Note: this isn't true for "large" tensors where constant folding can lead to an IR-size explosion.
+* Scalar constant folding. E.g., folding `arith.addi %c1, %c2` to `%c3`. Note:
+ this is not true for "large" tensors where constant folding can lead to IR
+ size explosion.
* Folding inverse ops. E.g., folding `arith.xori(arith.xori(%x, %a), %a)` to
`%x`.
* Unused/redundant value elimination. E.g., removing unused loop-carried
@@ -94,6 +96,10 @@ community-agreed canonicalizations include:
* Cast propagation / folding. E.g., pushing casts through operations or folding
them away if it introduces more static type information.
+Note: Some dialects define multiple IR forms, sometimes depending on the
+follow-up transformation ([example](https://mlir.llvm.org/docs/Rationale/RationaleLinalgDialect/#interchangeability-of-formsa-nameformsa)).
+These forms are unrelated to MLIR's canonicalization mechanism.
+
## Globally Applied Rules
These transformations are applied to all levels of IR:
From 9b65f28b6cb0a2249d03807a0f940a983baebf03 Mon Sep 17 00:00:00 2001
From: Matthias Springer <me at m-sp.org>
Date: Mon, 29 Dec 2025 16:10:45 +0000
Subject: [PATCH 5/5] address comments
---
mlir/docs/Canonicalization.md | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/mlir/docs/Canonicalization.md b/mlir/docs/Canonicalization.md
index 734ae6ac5993f..6f48e60c94962 100644
--- a/mlir/docs/Canonicalization.md
+++ b/mlir/docs/Canonicalization.md
@@ -75,11 +75,9 @@ form potentially contentious. Whether a transformation belongs in the canonical
form must be decided on a case-by-case basis, but common community-agreed
canonicalizations include:
-* Identity / no-op elimination. E.g., folding `arith.addi %x, %c0` to `%x` or
- erasing `memref.copy %x, %x`.
-* Scalar constant folding. E.g., folding `arith.addi %c1, %c2` to `%c3`. Note:
- this is not true for "large" tensors where constant folding can lead to IR
- size explosion.
+* Identity / no-op elimination. E.g., folding `arith.addi(%x, %c0)` to `%x` or
+ erasing `memref.copy(%x, %x)`.
+* Scalar constant folding. E.g., folding `arith.addi(%c1, %c2)` to `%c3`.
* Folding inverse ops. E.g., folding `arith.xori(arith.xori(%x, %a), %a)` to
`%x`.
* Unused/redundant value elimination. E.g., removing unused loop-carried
@@ -93,8 +91,15 @@ canonicalizations include:
shaped types. E.g., rewriting `%v = tensor.empty(%c5) : tensor<?xf32>` as
`%0 = tensor.empty() : tensor<5xf32>` and
`%v = tensor.cast %0 : tensor<5xf32> to tensor<?xf32>`.
-* Cast propagation / folding. E.g., pushing casts through operations or folding
- them away if it introduces more static type information.
+* Cast propagation / folding such as pushing casts through operations or
+ folding them away if it introduces more static type information. E.g.,
+ rewriting `tensor.insert_slice(%src, tensor.cast(%dst))` (where the cast
+ converts from `tensor<5xf32>` to `tensor<?xf32>`) as
+ `tensor.cast(tensor.insert_slice(%src, %dst))`.
+
+
+Note: Some canonicalizations do not apply when they would lead to IR size
+explosion. (E.g., when they would produce "large" tensor/vector attributes.)
Note: Some dialects define multiple IR forms, sometimes depending on the
follow-up transformation ([example](https://mlir.llvm.org/docs/Rationale/RationaleLinalgDialect/#interchangeability-of-formsa-nameformsa)).
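
Illustration (not part of the patch): the first three bullets of the list added by this PR (identity elimination, scalar constant folding, and folding inverse ops) can be sketched as a toy bottom-up rewriter. This is a minimal Python model and not MLIR code. The names `Const`, `Var`, `BinOp`, and `canonicalize` are hypothetical and invented for this sketch; MLIR's actual canonicalizer applies patterns and folders through a greedy driver in C++, not through a single recursive function.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str  # "addi" or "xori" in this toy model
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Const, Var, BinOp]


def canonicalize(e: Expr) -> Expr:
    """Rewrite an expression bottom-up (toy model, hypothetical API)."""
    if not isinstance(e, BinOp):
        return e
    lhs, rhs = canonicalize(e.lhs), canonicalize(e.rhs)
    # Scalar constant folding: addi(c1, c2) -> c3 (likewise for xori).
    if isinstance(lhs, Const) and isinstance(rhs, Const):
        if e.op == "addi":
            return Const(lhs.value + rhs.value)
        if e.op == "xori":
            return Const(lhs.value ^ rhs.value)
    # Identity / no-op elimination: addi(x, 0) -> x.
    if e.op == "addi" and rhs == Const(0):
        return lhs
    # Folding inverse ops: xori(xori(x, a), a) -> x.
    if (e.op == "xori" and isinstance(lhs, BinOp)
            and lhs.op == "xori" and lhs.rhs == rhs):
        return lhs.lhs
    # No rewrite applies: rebuild the op with canonicalized operands.
    return BinOp(e.op, lhs, rhs)
```

In this sketch, `canonicalize(BinOp("xori", BinOp("xori", Var("x"), Var("a")), Var("a")))` returns `Var("x")`, mirroring the `arith.xori(arith.xori(%x, %a), %a)` example in the patch. Operand order matters here: the toy rewriter only matches a zero or a repeated operand on the right-hand side.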