[Mlir-commits] [mlir] [MLIR][SCF] Sink scf.if from scf.while before region into after region. (PR #165216)

Thu Nov 6 16:00:17 PST 2025

MaheshRavishankar wrote:

I have tried to describe the issue I have with canonicalization too, most notably here https://discourse.llvm.org/t/rfc-update-to-general-design-section-of-operation-canonicalizations-in-mlir/79355.

But let me try to be more concrete (maybe we need to move this to that old discourse post, or a new one). Here is what canonicalizations should do afaics. A lot of this is driven by the fact that canonicalizations are applied everywhere in a given compiler stack and provide almost no control to users, which is an anti-pattern.

1) Canonicalizations should be aimed at moving a particular operation into a more "canonical" representation so that subsequent analyses are easier to write. That, I think, everyone understands.
2) Canonicalization in MLIR today has become a kitchen sink, because canonicalization patterns are added ad hoc without any reference to how different canonicalization patterns interact. What is really needed is a class of fixed-point iterations. Each fixed-point iteration tries to move the program further down the lattice. Once we know what the fixed point is doing, we can easily determine what the representation of that operation is with respect to the lattice that is being reduced. A universal canonical form does not seem like a tractable goal. Today, when a canonicalization pattern is added, it is not done with any analysis of how the pattern interacts with other canonicalization patterns, or of what objective function the canonicalizer is trying to optimize. It comes down to "I can't think of why someone would do anything else". That is fundamentally restricted by the author's understanding of all possible uses of a particular operation, and is in no way rigorous enough to justify that pattern kicking in everywhere in a compilation stack (without any control). I think it is completely justified to ask people adding something to canonicalization to justify why it is canonical in all possible scenarios, but that is not done today, and it is not done here either. Instead, people pushing back against it are asked to justify why something shouldn't be a canonicalization. That seems backwards to how basic proofs work. I can't just claim some mathematical fact. It needs to be proved. The proof can't be "no one gave me a counter-example, so I must be right".
3) Another piece of evidence of canonicalization being used as a kitchen sink is the huge compilation time it takes. Every canonicalization pass carries a huge number of patterns without any justification of why those patterns need to be considered canonicalizations.
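
To make the pattern-interaction concern in point 2 concrete, here is a minimal toy sketch. It is not MLIR code: the expression encoding, the pattern names, and the driver are all hypothetical, made up for illustration. Each pattern looks defensible in isolation, yet adding the third one makes a greedy fixed-point driver cycle instead of converging.

```python
# Toy model, NOT MLIR API: an expression is a tuple (opcode, lhs, rhs),
# and a pattern returns a rewritten expression or None if it does not apply.

def mul_to_shl(e):
    # x * 2  ->  x << 1
    if e[0] == "mul" and e[2] == 2:
        return ("shl", e[1], 1)
    return None

def add_to_mul(e):
    # x + x  ->  x * 2
    if e[0] == "add" and e[1] == e[2]:
        return ("mul", e[1], 2)
    return None

def shl_to_add(e):
    # x << 1  ->  x + x   (added later, without checking the other patterns)
    if e[0] == "shl" and e[2] == 1:
        return ("add", e[1], e[1])
    return None

def run_to_fixed_point(expr, patterns, max_iterations=10):
    """Greedily apply the first matching pattern until none applies.

    Returns (final_expr, converged). The iteration cap is a hypothetical
    safety net so that a cycling pattern set does not loop forever.
    """
    for _ in range(max_iterations):
        for pattern in patterns:
            rewritten = pattern(expr)
            if rewritten is not None:
                expr = rewritten
                break
        else:
            # No pattern applied: a fixed point was reached.
            return expr, True
    return expr, False

start = ("mul", "x", 2)
two_patterns = run_to_fixed_point(start, [mul_to_shl, add_to_mul])
three_patterns = run_to_fixed_point(start, [mul_to_shl, add_to_mul, shl_to_add])
```

With the first two patterns the driver converges; with all three it hits the iteration cap. Nothing about any single pattern reveals this, which is why the analysis has to be about the set of patterns and the direction they are supposed to move the program in.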

That, btw, is exactly what I have said before here https://discourse.llvm.org/t/rfc-update-to-general-design-section-of-operation-canonicalizations-in-mlir/79355/31?u=maheshravishankar . I don't know how else to state it. As it stands, the proclivity of everyone to just add to canonicalization without justification or analysis is a bad practice being promoted, and it is ultimately going to be a major issue for anyone building a serious compiler stack using MLIR.

Specific to this example, to me the movement of operations from the `while` region to the `do` region without any side-effect analysis seems like a red flag. The movement of operations is itself a compilation-time overhead, because it will periodically trigger recomputation of the operation order within multiple blocks, which is expensive depending on the program. Such factors should already have been accounted for when adding something to canonicalization, but that is hardly ever done.
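
As a rough illustration of the ordering cost mentioned above, here is a deliberately simplified model. It assumes a block caches a per-operation order index, drops that cache when an operation is inserted or removed, and rebuilds it on the next ordering query. The `ToyBlock` class and `move_op` helper are hypothetical and are not MLIR's actual implementation, which is more refined.

```python
class ToyBlock:
    """Hypothetical block with a lazily recomputed operation order."""

    def __init__(self, names):
        self.ops = list(names)
        self.order = {}
        self.order_valid = False
        self.recomputations = 0  # how many O(n) rebuilds were triggered

    def _recompute_order(self):
        # Linear walk over the whole block to renumber every operation.
        self.order = {name: i for i, name in enumerate(self.ops)}
        self.order_valid = True
        self.recomputations += 1

    def is_before(self, a, b):
        # Ordering queries rebuild the cache if a mutation invalidated it.
        if not self.order_valid:
            self._recompute_order()
        return self.order[a] < self.order[b]

    def remove(self, name):
        self.ops.remove(name)
        self.order_valid = False

    def insert_front(self, name):
        self.ops.insert(0, name)
        self.order_valid = False


def move_op(name, src, dst):
    """Move an operation between blocks; both caches become stale."""
    src.remove(name)
    dst.insert_front(name)


before = ToyBlock(["cmp", "if", "condition"])
after = ToyBlock(["body", "yield"])

before.is_before("cmp", "condition")   # first query builds the cache
move_op("if", before, after)           # invalidates both blocks
before.is_before("cmp", "condition")   # rebuilds the source block order
after.is_before("if", "yield")         # rebuilds the destination block order
```

In this model a single move costs a full renumbering of two blocks the next time anything asks about ordering, so a pattern that moves operations on every application pays that cost repeatedly.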

https://github.com/llvm/llvm-project/pull/165216