[Mlir-commits] [mlir] [mlir][arith] Add documentation for the canonical form (PR #192845)

Matthias Springer llvmlistbot at llvm.org
Sun Apr 19 05:06:05 PDT 2026


https://github.com/matthias-springer created https://github.com/llvm/llvm-project/pull/192845

Assisted-by: claude-opus-4.7-thinking-high


From c650210d9138eb5f6096c8266b5faa05052cb239 Mon Sep 17 00:00:00 2001
From: Matthias Springer <me at m-sp.org>
Date: Sun, 19 Apr 2026 12:04:55 +0000
Subject: [PATCH] [mlir][arith] Add documentation for the canonical form

---
 .../mlir/Dialect/Arith/IR/ArithBase.td        | 39 +++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/mlir/include/mlir/Dialect/Arith/IR/ArithBase.td b/mlir/include/mlir/Dialect/Arith/IR/ArithBase.td
index 985ae01008002..17fc47f1a73df 100644
--- a/mlir/include/mlir/Dialect/Arith/IR/ArithBase.td
+++ b/mlir/include/mlir/Dialect/Arith/IR/ArithBase.td
@@ -36,6 +36,45 @@ def Arith_Dialect : Dialect {
     canonicalization: round-to-nearest, ties-to-even. The runtime behavior of
     operations without an explicit rounding mode is deferred to the target
     backend and may differ from the default arith rounding mode.
+
+    ### Canonical form
+
+    `arith` IR is in canonical form if:
+
+    * Constant operands of commutative ops (`addi`, `muli`, ...) appear last.
+      `cmpi` follows the same rule by swapping operands and mirroring the
+      predicate (e.g., `cmpi slt, c, %x` -> `cmpi sgt, %x, c`).
+    * Constant subexpressions are folded. Folding is suppressed only when it
+      would introduce UB (div/rem by zero, signed-div overflow,
+      shift-by-bitwidth) or precision loss (`truncf`/`extf` between
+      non-representable float semantics).
+    * Constant operands are coalesced along chains. Chains of
+      commutative/associative ops collapse into a single `op(%x, c)`
+      (e.g., `addi(addi(%x, c0), c1)` -> `addi(%x, c0+c1)`).
+    * Operations on identity or absorbing elements are folded away. `x+0`,
+      `x*1`, `x&allOnes`, `x^0`, `select(_, x, x)`, `min/max(x, x)`, etc.
+      all reduce to `x`.
+    * Inverse-op chains/pairs cancel: paired add/sub
+      (`addi(subi(a, b), b)` -> `a`), `(a*b)/b` -> `a` (with the matching
+      `nuw`/`nsw` flag), `xori(xori(x, a), a)` -> `x`, and
+      `negf(negf(x))` -> `x`. `bitcast(bitcast(...))`, `ext(ext(...))`,
+      lossless `trunc(ext(...))`, and select-of-select on the same
+      predicate collapse.
+    * Among semantically equivalent ops, the simpler/narrower one wins.
+      Concretely:
+      * prefer `subi` over `addi` of a value multiplied by `-1`;
+      * prefer `xori` (`not`) over `select` on boolean constants, and push
+        `not` into a `cmpi` predicate;
+      * push integer extensions outward through bitwise ops
+        (`xor(extui x, extui y)` -> `extui(xor x, y)`), so bitwise work
+        happens in the narrow type;
+      * fold redundant extensions into casts
+        (`sitofp(extui x)` -> `uitofp x`, `index_cast(extsi x)` ->
+        `index_cast x`);
+      * strip extensions inside equality `cmpi`.
+    * Dead "extended" results are dropped. `addui_extended`,
+      `mulsi_extended`, and `mului_extended` collapse to plain `addi`/`muli`
+      when the overflow/high result is unused.
   }];
 
   let hasConstantMaterializer = 1;

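For readers who want to see the first and third rules of the canonical form in action, here is a minimal sketch in Python. It is a toy model and not how MLIR implements these patterns: ops are plain tuples, SSA values are strings, constants are Python ints, and the names `SWAPPED`, `is_const`, `canonicalize_cmpi`, and `canonicalize_addi` are hypothetical helpers invented for this illustration.

```python
# Toy model only: tuples stand in for arith ops; none of this is MLIR API.

# Swapping the operands of a comparison requires the *mirrored* predicate
# (slt <-> sgt), not the logical negation (slt -> sge).
SWAPPED = {
    "eq": "eq", "ne": "ne",
    "slt": "sgt", "sle": "sge", "sgt": "slt", "sge": "sle",
    "ult": "ugt", "ule": "uge", "ugt": "ult", "uge": "ule",
}


def is_const(value):
    """Constants are modeled as Python ints; SSA values as strings."""
    return isinstance(value, int)


def canonicalize_cmpi(pred, lhs, rhs):
    """Move a constant left operand to the right and mirror the predicate."""
    if is_const(lhs) and not is_const(rhs):
        return (SWAPPED[pred], rhs, lhs)
    return (pred, lhs, rhs)


def canonicalize_addi(lhs, rhs, bitwidth=32):
    """Put constants last and coalesce addi(addi(x, c0), c1)."""
    # Commutative op: a lone constant operand goes last.
    if is_const(lhs) and not is_const(rhs):
        lhs, rhs = rhs, lhs
    # addi(addi(x, c0), c1) -> addi(x, c0 + c1), wrapping at the bit width.
    if (isinstance(lhs, tuple) and lhs[0] == "addi"
            and is_const(lhs[2]) and is_const(rhs)):
        return ("addi", lhs[1], (lhs[2] + rhs) % (1 << bitwidth))
    return ("addi", lhs, rhs)


print(canonicalize_cmpi("slt", 5, "%x"))
print(canonicalize_addi(("addi", "%x", 3), 4))
```

The mirroring table is the point of the `cmpi` case: `c < %x` is equivalent to `%x > c`, so `slt` becomes `sgt` rather than `sge`.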

