[Mlir-commits] [mlir] [mlir][vector] Add vector.transpose with unit-dim to vector.shape_cast pattern (PR #72105)

Mehdi Amini llvmlistbot at llvm.org
Wed Nov 22 15:13:31 PST 2023


joker-eph wrote:

> The map could be simplified here based on the known dim sizes, but it's not clear to me that it will always be easy to simplify to the same as what transpose gives in the context of larger IR + more complex layouts. Even here, simplification of this map might have trouble differentiating between an "important" and "unimportant" use of d0 (e.g. it might simplify to `(d0, d1) -> (0, d1)`; maybe this is a more canonical map anyway, but it forces that decision).

Thanks @qedawkins for elaborating!
It seems to me that the end result is exactly what I'd expect: this is the most canonical representation of the map (and it is consistent with the shape_cast canonicalization).
If the particular analysis you mention intends to support "the vector dialect" well, then it ought to do a better job on shape_cast...
Otherwise it feels like what you're presenting is a situation of: "we implemented our analysis for a subset of the dialect, so when a canonicalization moves toward an operation we haven't spent time on, it regresses our analysis".


> but I do believe (1) is a more canonical representation. It is effectively carrying more information about what the program is doing. With shape_cast to understand that this is a transpose 

@MaheshRavishankar: you're missing the point. The point is that this **isn't a transpose**. The fact that it is written as such in the original program isn't relevant for the purpose of canonicalization.
The "extra information about what the program is doing" is the wrong kind of information: it is exactly what a canonicalization is supposed to drop! Otherwise, the argument as you present it could be applied to many canonicalizations that simplify expressions, and those could equally be argued to "lose information about the original program".

> This is another indication that vector.transpose is actually a more canonical representation of the computation.

I see it differently: my take is that this is an indication that the abstraction and goals of the vector dialect aren't a perfect match for SPIRV, and that SPIRV has to do "more work" to consume "vector".

> The conversion to shape_cast seems to be motivated by more efficient LLVM lowering, so it should be used on the LLVM lowering path.

I don't believe this is a correct read of the discussion here, or any of the arguments.

>  is since this would not happen at any point in the stack and now it is. So it is changing the behavior of an entire compiler stack. Moving to canonicalization have a high impact and should be done very deliberately. We've over-index on using canonicalizations for ease (I have done it many times do, and I am sure I will do it again accidentally). There are many cases where we run certain "foldings" multiple times, but they are all opt-in and I think that is fine. Canonicalizations are too heavy a hammer with almost unbounded side-effect.

If you've been adding "incorrect canonicalizations because it's convenient", then please don't do that :)

Otherwise, everything here seems completely backward and entirely disconnected from MLIR design. If you don't subscribe to the concept of canonicalization, or you're wary of the impact on your downstream compiler, then stop using the canonicalization pass in your compiler!


>  I dont have an issue with that, except maybe it should be structured so that it is opt-out if needed (or opt-in), as it is here for SPIR-V lowering.

I strongly object to this line of reasoning: this isn't how the canonicalization of a dialect is designed.


>  1xN -> Nx1 is a transpose... it can also be "implemented" as a no-op shape_cast, but that is a lowering optimization, not a canonical representation.

Again: to me, in the absence of "information loss" (which hasn't been demonstrated in this thread), an operation that involves "no data movement" looks more canonical. You don't really have anything to justify your claim here.
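
As a hedged illustration of why there is "no data movement" (again with example shapes, not taken from the patch): the row-major linearization of the source and result visits the elements in the same order, and that is exactly the property shape_cast encodes.

```mlir
// vector<1x4xf32>: [[a, b, c, d]]        -> linearized as [a, b, c, d]
// vector<4x1xf32>: [[a], [b], [c], [d]]  -> linearized as [a, b, c, d]
// Identical linear order on both sides, so no shuffle is required.
%r = vector.shape_cast %src : vector<1x4xf32> to vector<4x1xf32>
```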

> this as a vector.transpose and lower it that way, which is effectively "undoing" a canonicalization. To me that is an indication that vector.transpose should be left as is

No: this is an indication that the vector dialect design is not optimized for the SPIRV lowering.
I wrote before that "making it possible to target SPIRV is a strong constraint that we share" (to some extent), but that does not mean "we design this dialect to make it easy to lower to SPIRV".

There is a reason I pushed back on some ops being included in some dialects: it makes them candidates for such canonicalizations by making them part of the same layer (for example, that is why the bufferization dialect was created).
Here you seem to want a "subset of the vector dialect", and that is just not reasonable as-is (that is: if this is what you want, it calls for an RFC for a **split** of the vector dialect into high-level and lower-level parts).


> So I'm still forced to do all the analysis to figure out it's a tranpose or no-op actually, as Mahesh said, effectively "undo" the canonicalzation, and as Quinn said, it's harder to rediscover certain information of vector.shape_cast comparing to just vector.transpose

Let's keep this in perspective: this is a local match on a single op; it's not an "analysis".
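
To illustrate what that local match amounts to (a hypothetical example, not code or IR from this PR): everything needed to recognize the unit-dim case is printed on the op itself, namely its two static vector types.

```mlir
// Deciding "is this a unit-dim transpose / no-op reshape?" only requires
// comparing the non-unit dims of the two types below; no surrounding IR
// or dataflow needs to be inspected.
%c = vector.shape_cast %v : vector<1x8xf32> to vector<8x1xf32>
```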

>  I'd propose expose adding a populate*Patterns API entry to the pattern so Diego and others can have their workflows unblocked. I hope that's an okay intermediate step that buys us time

I have concerns with this approach in that we likely **won't** revisit this: I don't see a forcing function here that would get the issue actually solved.




https://github.com/llvm/llvm-project/pull/72105

