[Mlir-commits] [mlir] [MLIR][XeGPU] Allow some nd ops to have argument shapes mismatch for … (PR #120566)
Jianhui Li
llvmlistbot at llvm.org
Fri Dec 20 07:56:41 PST 2024
Jianhui-Li wrote:
I would not rush into option 4 either. If I understand correctly, the subview op needs to compute the offsets from the lane id, and then create the subview with sizes, offsets, and strides. I like that the approach keeps the original tensor descriptor as a whole, but it requires adding a subview op which doesn't look like other XeGPU ops (which map to concrete hardware operations). I don't see what benefit it brings at this point other than the IR appearing less "confusing", which is a debatable point.
To me, whether the IR is "confusing" actually depends on how the IR is lowered or optimized. My view is actually the reverse. The type mismatch doesn't bother me that much. But if the IR doesn't model the hardware behavior, say it introduces per-lane offsets/sizes computation which we don't need during the lowering, it causes a different kind of confusion that bothers me more. The passes on XeGPU are mostly target-specific, so people like to match the IR with the hardware behavior: each lane takes the whole shape and reads back its own data fragment implicitly, instead of computing its own offsets/sizes. If a transformation/optimization needs to know the data fragment distribution, it can refer to sg_map, which was designed to explicitly describe the data distribution.
I suggest we first go with option 2. When it becomes clear that we really need a subview op, we can revisit it.
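To make option 2 concrete, a sketch of what the SIMT-distributed IR might look like (the exact XeGPU syntax and the sg_map layout values here are illustrative assumptions, not taken from the PR):

```mlir
// Hypothetical per-lane IR under option 2: every lane carries the
// whole-subgroup tensor descriptor type, while the loaded vector is
// only the per-lane fragment. sg_map makes the distribution explicit.
#sg = #xegpu.sg_map<wi_layout = [1, 16], wi_data = [1, 1]>
%td = xegpu.create_nd_tdesc %src[0, 0]
    : memref<8x16xf16> -> !xegpu.tensor_desc<8x16xf16, #sg>
// The result shape (8x1) deliberately mismatches the descriptor
// shape (8x16): each of the 16 lanes implicitly reads back its own
// column fragment, with no per-lane offset/size computation in the IR.
%frag = xegpu.load_nd %td
    : !xegpu.tensor_desc<8x16xf16, #sg> -> vector<8x1xf16>
```

The point of the sketch is that the distribution lives in the sg_map attribute rather than in explicit per-lane address arithmetic, which is what the comment argues matches the hardware behavior.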
https://github.com/llvm/llvm-project/pull/120566