[Mlir-commits] [mlir] [mlir][arith] Add documentation for the canonical form (PR #192845)
Matthias Springer
llvmlistbot at llvm.org
Sun Apr 19 05:15:38 PDT 2026
https://github.com/matthias-springer updated https://github.com/llvm/llvm-project/pull/192845
From 5b11787ed724fde14dc57642d12a191b954ce189 Mon Sep 17 00:00:00 2001
From: Matthias Springer <me at m-sp.org>
Date: Sun, 19 Apr 2026 12:04:55 +0000
Subject: [PATCH] [mlir][arith] Add documentation for the canonical form
---
.../mlir/Dialect/Arith/IR/ArithBase.td | 38 +++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/mlir/include/mlir/Dialect/Arith/IR/ArithBase.td b/mlir/include/mlir/Dialect/Arith/IR/ArithBase.td
index 985ae01008002..0f37c53a43ca2 100644
--- a/mlir/include/mlir/Dialect/Arith/IR/ArithBase.td
+++ b/mlir/include/mlir/Dialect/Arith/IR/ArithBase.td
@@ -36,6 +36,44 @@ def Arith_Dialect : Dialect {
canonicalization: round-to-nearest, ties-to-even. The runtime behavior of
operations without an explicit rounding mode is deferred to the target
backend and may differ from the default arith rounding mode.
+
+ ### Canonical form
+
+ `arith` IR is in canonical form if:
+
+ * Constant operands of commutative ops (`addi`, `muli`, ...) appear last.
+ `cmpi` mirrors this by swapping operands and flipping the predicate
+ (e.g., `cmpi slt, c, %x` -> `cmpi sgt, %x, c`).
+ * Constant subexpressions are folded. Folding is suppressed when the
+ original IR has undefined behavior (div/rem by zero, signed-div overflow,
+ shift-by-bitwidth) or the folded IR suffers a loss of precision
+ (`truncf`/`extf` between non-representable float semantics). Poisoned
+ constant values are folded to `ub.poison`.
+ * Constant operands are coalesced along chains. Chains of
+ commutative/associative ops collapse into a single `op(%x, c)`
+ (e.g., `addi(addi(%x, c0), c1)` -> `addi(%x, c0+c1)`).
+ * Algebraic identities and absorbing elements are removed. E.g., `x+0`,
+ `x*1`, `x&allOnes`, `x^0`, `select(_, x, x)`, `min/max(x, x)`, etc. all
+ simplify away.
+ * Inverse-op chains/pairs are folded away. E.g.,
+ `addi(subi(a, b), b)` -> `a`, `xori(xori(x, a), a)` -> `x`,
+ `negf(negf(x))` -> `x`; `bitcast(bitcast(...))`, `ext(ext(...))`, and
+ lossless `trunc(ext(...))` collapse.
+ * Among semantically equivalent ops, the simpler/narrower one wins.
+ Concretely:
+ * prefer `subi` over `addi` of a `*-1`;
+ * prefer `xori` (`not`) over `select` on boolean constants, and push
+ `not` into a `cmpi` predicate;
+ * push integer extensions outward through bitwise ops
+ (`xor(extui x, extui y)` -> `extui(xor x, y)`), so bitwise work
+ happens in the narrow type;
+ * fold redundant extensions into casts
+ (`sitofp(extui x)` -> `uitofp x`, `index_cast(extsi x)` ->
+ `index_cast x`);
+ * strip extensions inside equality `cmpi`.
+ * Dead "extended" results are dropped. `addui_extended`,
+ `mulsi_extended`, and `mului_extended` collapse to plain `addi`/`muli`
+ when the overflow/high result is unused.
}];
let hasConstantMaterializer = 1;
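
To illustrate the first and third rules in the list above (constants last, `cmpi` operand swap, constant coalescing along chains), here is a minimal Python sketch of a toy rewriter. It is not how MLIR implements these folders or patterns. The `Const`, `Var`, `Op`, and `canonicalize` names are hypothetical, integers are unbounded (no bitwidth wraparound), and the undefined-behavior and poison cases are not modeled.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Op:
    name: str        # "addi", "muli", or "cmpi" in this toy model
    lhs: "Expr"
    rhs: "Expr"
    pred: str = ""   # only meaningful for "cmpi"


Expr = Union[Const, Var, Op]

# Hypothetical table of commutative/associative ops and their folders.
# Assumption: unbounded Python ints, so no wraparound is modeled.
COMMUTATIVE = {"addi": lambda a, b: a + b, "muli": lambda a, b: a * b}

# Predicate to use after swapping the two cmpi operands.
SWAPPED_PRED = {
    "eq": "eq", "ne": "ne",
    "slt": "sgt", "sgt": "slt", "sle": "sge", "sge": "sle",
    "ult": "ugt", "ugt": "ult", "ule": "uge", "uge": "ule",
}


def canonicalize(e: Expr) -> Expr:
    """Bottom-up rewrite applying three of the rules described above."""
    if not isinstance(e, Op):
        return e
    lhs, rhs = canonicalize(e.lhs), canonicalize(e.rhs)
    if e.name in COMMUTATIVE:
        fold = COMMUTATIVE[e.name]
        # Constant subexpressions are folded.
        if isinstance(lhs, Const) and isinstance(rhs, Const):
            return Const(fold(lhs.value, rhs.value))
        # Constant operands of commutative ops appear last.
        if isinstance(lhs, Const):
            lhs, rhs = rhs, lhs
        # op(op(x, c0), c1) -> op(x, fold(c0, c1)).
        if (isinstance(rhs, Const) and isinstance(lhs, Op)
                and lhs.name == e.name and isinstance(lhs.rhs, Const)):
            return Op(e.name, lhs.lhs, Const(fold(lhs.rhs.value, rhs.value)))
        return Op(e.name, lhs, rhs)
    if e.name == "cmpi" and isinstance(lhs, Const) and not isinstance(rhs, Const):
        # cmpi pred, c, x -> cmpi swapped(pred), x, c.
        return Op("cmpi", rhs, lhs, SWAPPED_PRED[e.pred])
    return Op(e.name, lhs, rhs, e.pred)
```

For example, `canonicalize(Op("addi", Op("addi", Const(1), Var("x")), Const(2)))` first moves the inner constant to the right and then merges the two constants into a single `addi(x, 3)`.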