[Mlir-commits] [mlir] abfac95 - [mlir][docs] Add more examples for the "canonical form" (#173667)

Tue Dec 30 00:24:11 PST 2025

Author: Matthias Springer
Date: 2025-12-30T09:24:07+01:00
New Revision: abfac951432225ceca6b5a1a6e4addc353650492

URL: https://github.com/llvm/llvm-project/commit/abfac951432225ceca6b5a1a6e4addc353650492
DIFF: https://github.com/llvm/llvm-project/commit/abfac951432225ceca6b5a1a6e4addc353650492.diff

LOG: [mlir][docs] Add more examples for the "canonical form" (#173667)

Mention that there is no formal definition of the canonical form. Also
add more examples for users to understand what kind of transformations
the community has agreed upon in the past.

---------

Co-authored-by: Mehdi Amini <joker.eph at gmail.com>

Added: 
    

Modified: 
    mlir/docs/Canonicalization.md

Removed: 
    


################################################################################
diff  --git a/mlir/docs/Canonicalization.md b/mlir/docs/Canonicalization.md
index 2622c08e535fe..6f48e60c94962 100644

--- a/mlir/docs/Canonicalization.md
+++ b/mlir/docs/Canonicalization.md
@@ -63,30 +63,47 @@ Some important things to think about w.r.t. canonicalization patterns:
 *   Canonicalize shouldn't lose the semantic of original operation: the original
     information should always be recoverable from the transformed IR.
 
-For example, a pattern that transform
-
-```
-  %transpose = linalg.transpose
-      ins(%input : tensor<1x2x3xf32>)
-      outs(%init1 : tensor<2x1x3xf32>)
-      dimensions = [1, 0, 2]
-  %out = linalg.transpose
-      ins(%transpose: tensor<2x1x3xf32>)
-      outs(%init2 : tensor<3x1x2xf32>)
-      permutation = [2, 1, 0]
-```
-
-to
-
-```
-  %out= linalg.transpose
-      ins(%input : tensor<1x2x3xf32>)
-      outs(%init2: tensor<3x1x2xf32>)
-      permutation = [2, 0, 1]
-```
-
-is a good canonicalization pattern because it removes a redundant operation,
-making other analysis optimizations and more efficient.
+## What is the Canonical Form?
+
+There is no formally defined canonical form in MLIR. The de-facto canonical
+form keeps evolving, as canonicalization patterns and folders are getting
+added / removed / modified by the community.
+
+The canonicalizer pass is used in many projects but does not offer fine-grained
+control over individual patterns or foldings, making changes to the canonical
+form potentially contentious. Whether a transformation belongs in the canonical
+form must be decided on a case-by-case basis, but common community-agreed
+canonicalizations include:
+
+* Identity / no-op elimination. E.g., folding `arith.addi(%x, %c0)` to `%x` or
+  erasing `memref.copy(%x, %x)`.
+* Scalar constant folding. E.g., folding `arith.addi(%c1, %c2)` to `%c3`.
+* Folding inverse ops. E.g., folding `arith.xori(arith.xori(%x, %a), %a)` to
+  `%x`.
+* Unused/redundant value elimination. E.g., removing unused loop-carried
+  variables of an `scf.for` op or removing redundant `scf.if` results (when
+  both branches yield the same value).
+* Trivial control flow simplifications. E.g., inlining the "then" body of an
+  `scf.if %true` op and erasing the `scf.if` op.
+* Folding chained metadata / shape ops of the same type. E.g., replacing
+  `linalg.transpose(linalg.transpose(%x))` with a single `linalg.transpose(%x)`.
+* Dynamic to static type refinement such as folding constant sizes into
+  shaped types. E.g., rewriting `%v = tensor.empty(%c5) : tensor<?xf32>` as
+  `%0 = tensor.empty() : tensor<5xf32>` and
+  `%v = tensor.cast %0 : tensor<5xf32> to tensor<?xf32>`.
+* Cast propagation / folding such as pushing casts through operations or
+  folding them away if it introduces more static type information. E.g.,
+  rewriting `tensor.insert_slice(%src, tensor.cast(%dst))` (where the cast
+  converts from `tensor<5xf32>` to `tensor<?xf32>`) as
+  `tensor.cast(tensor.insert_slice(%src, %dst))`.
+
+
+Note: Some canonicalizations do not apply when they would lead to IR size
+explosion. (E.g., when they would produce "large" tensor/vector attributes.)
+
+Note: Some dialects define multiple IR forms, sometimes depending on the
+follow-up transformation ([example](https://mlir.llvm.org/docs/Rationale/RationaleLinalgDialect/#interchangeability-of-formsa-nameformsa)).
+These forms are unrelated to MLIR's canonicalization mechanism.
 
 ## Globally Applied Rules