[Mlir-commits] [mlir] [MLIR] Parallel loop fusion extended to interchanged loops. (PR #191245)

Mon Apr 20 07:44:11 PDT 2026

================

----------------
Hardcode84 wrote:

I tried to poke at this more, but it quickly turns into an "analyze affine expressions" problem. The only relatively simple and cheap thing we can do is:
```
  The combinatorics aren't actually that bad if you gate by bound-equivalence classes first. Recap:

  - Partition IVs by (lb, ub, step) triple.
  - Permutations only exist within each class. Different classes are fixed-position.
  - Total candidate space: ∏ (class_size)! — not N!.

  In practice, class sizes are tiny. A nest with bounds (8, 8, 16, 32) has classes {size 2, size 1, size 1} → 2 permutations, not 24. A symmetric 3D nest (8, 8, 8) → 6 permutations. Anyone who writes a loop with a class of size ≥ 5 has bigger problems than your fusion pass.

  So: class-bounded exhaustive, with an internal hard cap (not a user option).

  static constexpr unsigned kMaxPermutations = 24;  // 4!; anything larger is pathological

  If the product-of-factorials exceeds this, bail to identity-only (current behavior).
```
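
For reference, the counting and bail-out part of the recap above could look roughly like the sketch below. This is standalone C++ rather than code against the MLIR API: `LoopBounds` and `countCandidatePermutations` are hypothetical names, and the bounds are modeled as plain integers, whereas the real pass would have to compare the loop's bound and step values for equivalence. Only `kMaxPermutations` comes from the note above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <tuple>
#include <vector>

// Hypothetical stand-in for one loop dimension. Integers are an assumption
// made to keep the sketch self-contained.
struct LoopBounds {
  int64_t lb;
  int64_t ub;
  int64_t step;
};

// Internal hard cap from the note above: 4!.
static constexpr unsigned kMaxPermutations = 24;

// Returns the number of candidate permutations (product of class_size!
// over all bound-equivalence classes), or std::nullopt if that product
// exceeds kMaxPermutations and the caller should fall back to
// identity-only matching.
std::optional<unsigned>
countCandidatePermutations(const std::vector<LoopBounds> &dims) {
  // Partition dimensions by their (lb, ub, step) triple.
  std::map<std::tuple<int64_t, int64_t, int64_t>, unsigned> classSizes;
  for (const LoopBounds &d : dims)
    ++classSizes[std::make_tuple(d.lb, d.ub, d.step)];

  // Multiply the factorials incrementally and bail as soon as the cap is
  // exceeded, so the running product stays small and cannot overflow.
  uint64_t total = 1;
  for (const auto &entry : classSizes) {
    for (unsigned k = 2; k <= entry.second; ++k) {
      total *= k;
      if (total > kMaxPermutations)
        return std::nullopt;
    }
  }
  assert(total <= kMaxPermutations && "cap must hold on the success path");
  return static_cast<unsigned>(total);
}
```

With the examples from the recap, bounds (8, 8, 16, 32) give 2 and a symmetric (8, 8, 8) nest gives 6, while a class of size 5 would give 120 and hit the bail-out path.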

But I'm fine with landing this as is (with an added TODO) and doing this in a follow-up.

https://github.com/llvm/llvm-project/pull/191245