[Mlir-commits] [mlir] [mlir][vector] Add vector.transpose with unit-dim to vector.shape_cast pattern (PR #72105)

Mehdi Amini llvmlistbot at llvm.org
Thu Nov 23 20:08:42 PST 2023


joker-eph wrote:

Thanks @qedawkins, I greatly appreciate the exchange here: this really helps get to the core of what is important for the vector dialect and how to think about canonicalization one way or the other for "shape_cast". The part about "Vector dialect, by virtue [..] is representing looped computations" could be an argument anchored in the Dialect goals in favor of "transpose".


> There is no general pattern for folding shape_cast into a contraction, 

Nitpicking, you likely meant that "there is no general pattern for folding **arbitrary** shape_cast into a contraction", because this particular shape_cast (or any shape_cast that would fit your requirement) definitely can be folded.
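For concreteness, a minimal sketch of the kind of unit-dim case under discussion (the shapes here are chosen purely for illustration):

```mlir
// Transposing across a unit dimension moves no data, so this transpose...
%t = vector.transpose %v, [1, 0] : vector<1x4xf32> to vector<4x1xf32>
// ...is equivalent to a metadata-only shape_cast:
%c = vector.shape_cast %v : vector<1x4xf32> to vector<4x1xf32>
```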

> So if I want to recover the same IR, I have to write a transformation that special cases shape_casts of unit dimensions 

Yes.

> ... that seems fairly arbitrary to me ...
>  I am definitely not arguing it should be the general choice here, but rather that the same is true of shape_cast. 

Right (I assume my quote is correct?), this is why I have tried to anchor the discussion on "which one is more canonical" based on something like "one is more restricted than the other because of the no-data-movement guarantee".
There is the question of whether we lose "structure" in the process, but that seems like a question of the analysis being able to track similar affine maps from the shape_cast and the transpose (that is: it seems to me more about QoI of the analysis than a fundamental question of canonicalization).

I admit that for such analyses that just want to reason about loops and affine maps, shape_cast gets in the way by nature. I feel that this is because the analysis is fundamentally trying to track "data movement" in the vector across operations? I'm not sure what the convenient way would be to integrate "shape cast" into such a system in a more principled fashion.


> Vector dialect, by virtue of representing virtualized "super vectors," is representing looped computations (transfer_read, transfer_write, contract) and the extra permutation info on transpose is convenient when adjacent to such ops. 

This is a good point: from this point of view shape_cast could be seen as "less structured", because it's not described as loops. (Actually, could it be described as such in all cases? I think so, in which case we're back to QoI.)
It's also not like an "unrolled" form: shape_cast is a "metadata" change, and we likely don't even want to translate it into thinking of "loops" because we don't need to.


To some extent, this edge case of a "unit dimension" reminds me of how `tensor<f32>` models a scalar, and we had a very similar, annoying debate about `f32` vs `tensor<f32>` (because special casing is annoying for everyone).

https://github.com/llvm/llvm-project/pull/72105
