[Mlir-commits] [mlir] [MLIR] Parallel loop fusion extended to interchanged loops. (PR #191245)
Ivan Butygin
llvmlistbot at llvm.org
Mon Apr 20 07:44:11 PDT 2026
================
----------------
Hardcode84 wrote:
I tried to poke at this more, but it quickly turns into the "analyze affine expressions" route. The only relatively simple and cheap thing we can do is:
```
The combinatorics aren't actually that bad if you gate by bound-equivalence classes first. Recap:
- Partition IVs by (lb, ub, step) triple.
- Permutations only exist within each class. Different classes are fixed-position.
- Total candidate space: ∏ (class_size)! — not N!.
In practice, class sizes are tiny. A nest with bounds (8, 8, 16, 32) has classes {size 2, size 1, size 1} → 2 permutations, not 24. A symmetric 3D nest (8, 8, 8) → 6 permutations.
Anyone who writes a loop with a class of size ≥ 5 has bigger problems than your fusion pass.
So: class-bounded exhaustive, with an internal hard cap (not a user option).
static constexpr unsigned kMaxPermutations = 24; // 4!; anything larger is pathological
If the product-of-factorials exceeds this, bail to identity-only (current behavior).
```
But I'm fine with landing this as is (with an added TODO) and doing this in a follow-up.
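
For reference, the gating step described above could be sketched roughly as follows. This is a standalone illustration, not code from the PR: `LoopBounds` and `countCandidatePermutations` are hypothetical names, and bounds are assumed to be compile-time constants rather than MLIR values. Only `kMaxPermutations` comes from the proposal itself.

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical stand-in for one loop dimension with constant bounds.
struct LoopBounds {
  int64_t lb;
  int64_t ub;
  int64_t step;
};

// Internal hard cap from the proposal: 4!; anything larger is pathological.
static constexpr unsigned kMaxPermutations = 24;

// Returns the number of candidate permutations for the nest, i.e. the
// product of factorials of the bound-equivalence class sizes. Returns 1
// (identity only) when that product exceeds kMaxPermutations.
unsigned countCandidatePermutations(const std::vector<LoopBounds> &nest) {
  // Partition IVs by their (lb, ub, step) triple and count each class.
  std::map<std::tuple<int64_t, int64_t, int64_t>, unsigned> classSizes;
  for (const LoopBounds &b : nest)
    ++classSizes[std::make_tuple(b.lb, b.ub, b.step)];

  // Multiply the factorials incrementally so we can bail out as soon as
  // the cap is exceeded, without ever overflowing.
  uint64_t total = 1;
  for (const auto &entry : classSizes) {
    for (unsigned k = 2; k <= entry.second; ++k) {
      total *= k;
      if (total > kMaxPermutations)
        return 1; // Too many candidates: fall back to identity-only.
    }
  }
  return static_cast<unsigned>(total);
}
```

Assuming a lower bound of 0 and a step of 1 for every dimension, the nest with upper bounds (8, 8, 16, 32) yields 2 candidates and the symmetric (8, 8, 8) nest yields 6, matching the numbers above. A class of size 5 gives 120, which exceeds the cap and falls back to identity-only.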
https://github.com/llvm/llvm-project/pull/191245