[Mlir-commits] [mlir] [mlir] [linalg] Add pattern to swap transpose with broadcast (PR #97063)

Sun Jul 14 00:36:33 PDT 2024

cxy-1993 wrote:

> > After receiving multiple concerns about placing this pattern in canonicalize, I have reconsidered the validity of this pattern in canonicalize
> 
> can we elaborate on this? I am concerned about a proliferation of random patterns exposed by various APIs that aren’t canonicalization when they could be: can we record these « concerns » as a rationale why this is not a good canonicalization?

I think this is a very good example to add to @MaheshRavishankar's discussion of whether an optimization should be a canonicalization. Once we have a consensus, I will update the documentation to standardize future behavior.

My previous concern was that putting this optimization in canonicalization might not benefit every backend. However, as dcaballe and stellaraccident mentioned, the goal of canonicalization should not be to achieve maximum performance on every backend; it is to make subsequent optimizations more effective. (This should be the original definition of canonicalization, https://github.com/w3c/charmod-norm/blob/gh-pages/index.html, and we should at least agree on this point.)

According to this definition, if we define the direction in which canonicalization moves along the lattice as reducing redundant data in the IR, I think this pattern is appropriate for canonicalization. As I mentioned in my previous comments, this pattern is convergent and unidirectional, and it is not a one-off pattern: other patterns may create new opportunities for it to apply.
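
To make "reducing redundant data" concrete, here is a minimal sketch in plain Python. It is not MLIR and not the linalg API; the helper names `broadcast_leading`, `transpose2d`, and `transpose_last_two` are hypothetical, and nested lists stand in for tensors. It assumes the broadcast adds a new leading dimension and the transpose only permutes the original dimensions.

```python
def broadcast_leading(x, n):
    """Add a new leading dimension of size n by repeating x (hypothetical helper)."""
    return [x for _ in range(n)]


def transpose2d(m):
    """Transpose a 2-D nested list."""
    return [list(col) for col in zip(*m)]


def transpose_last_two(t):
    """Transpose the last two dimensions of a 3-D nested list."""
    return [transpose2d(m) for m in t]


x = [[1, 2, 3],
     [4, 5, 6]]          # shape [2][3]
n = 4                    # size of the broadcast dimension

# Before the rewrite: transpose(broadcast(x)) touches n * 6 = 24 elements.
before = transpose_last_two(broadcast_leading(x, n))

# After the rewrite: broadcast(transpose(x)) transposes only 6 elements.
after = broadcast_leading(transpose2d(x), n)

assert before == after   # same result, shape [4][3][2]
```

Both forms compute the same value, but in the second one the transpose runs on the small input before it is replicated, which is the sense in which the rewrite removes redundant data.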

> The real issue here is that this pattern is a one-off. It doesn't fit within anything else. It's not a pattern in service of a larger transformation goal. So appears like a zombie method.

For example, we can combine it with other patterns that reduce redundant data to reach:
`reduce(transpose(broadcast(input))) -> broadcast(transpose(reduce(input)))`
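
A rough sketch of that combined rewrite, again in plain Python rather than MLIR, with hypothetical helper names. It assumes the reduction is a sum over a dimension of the original input (not over the broadcast dimension) and that the transpose swaps the two remaining input dimensions; the equivalence is only claimed under those assumptions.

```python
def broadcast_leading(x, n):
    """Add a new leading dimension of size n by repeating x (hypothetical helper)."""
    return [x for _ in range(n)]


def transpose2d(m):
    """Transpose a 2-D nested list."""
    return [list(col) for col in zip(*m)]


def sum_leading(t):
    """Sum a 3-D nested list [a][b][c] over its leading axis, giving [b][c]."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*t)]


x = [[[1, 2, 3], [4, 5, 6]],
     [[10, 20, 30], [40, 50, 60]]]   # shape [a=2][b=2][c=3]
n = 4                                # size of the broadcast dimension

# reduce(transpose(broadcast(x))): all work happens on the n-times larger tensor.
big = broadcast_leading(x, n)                              # [n][a][b][c]
big_t = [[transpose2d(m) for m in copy] for copy in big]   # [n][a][c][b]
before = [sum_leading(copy) for copy in big_t]             # [n][c][b]

# broadcast(transpose(reduce(x))): reduce and transpose the small input first.
after = broadcast_leading(transpose2d(sum_leading(x)), n)  # [n][c][b]

assert before == after
```

Here the reduction and the transpose both move ahead of the broadcast, so each step pushes the IR further in the same direction: less data is materialized before the broadcast replicates it.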

Based on the discussion above, I believe that this pattern can be considered a canonicalization, but we need to define the lattice and the direction of canonicalization carefully. Can we reach a consensus on this? @stellaraccident @MaheshRavishankar @dcaballe

https://github.com/llvm/llvm-project/pull/97063