[Mlir-commits] [mlir] [MLIR][XeGPU] Allow some nd ops to have argument shapes mismatch for … (PR #120566)

Thu Dec 19 08:28:02 PST 2024

adam-smnk wrote:

Just first impressions.
I feel that option 2 is quite unintuitive due to its mix of abstractions: memory descriptors operate at the workgroup level, while individual operations implicitly map to threads. At the same time, losing information, which seems to happen in option 1, is equally undesirable.
Option 3 is an alternative design choice but it's just moving complexity around.

Ideally, all `sg_map`s should be consumed at the time of distribution. That would give a clearer separation of abstraction layers, where you can run dedicated transformations targeting either SIMD or SIMT.

Currently I'm mostly leaning toward the 4th idea. A subview of a descriptor could work, or maybe the load operation itself could carry some optional offsets, e.g. mapping it to the thread ID.
Ideally, it would be in a format that makes it easy to identify whether the IR is pre- or post-distribution.
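
To make the "optional offsets mapped to thread ID" idea a bit more concrete, here is a rough Python model of it. This is only an illustration under my own assumptions, not XeGPU syntax or any existing API: `thread_offsets`, `load_fragment`, and the row-major `thread_layout` parameter are hypothetical names, and the tile is assumed to divide evenly across the threads.

```python
# Hypothetical model: each thread derives the offset of its own fragment
# from its thread ID, so the load itself carries the per-thread mapping.

def thread_offsets(tile_shape, thread_layout, thread_id):
    """Return ((row_off, col_off), (frag_rows, frag_cols)) for one thread.

    tile_shape:    (rows, cols) of the tile described by the descriptor
    thread_layout: (t_rows, t_cols) row-major arrangement of threads (assumed)
    """
    rows, cols = tile_shape
    t_rows, t_cols = thread_layout
    if rows % t_rows or cols % t_cols:
        raise ValueError("tile shape must be divisible by the thread layout")
    if not 0 <= thread_id < t_rows * t_cols:
        raise ValueError("thread_id out of range for the thread layout")
    frag_rows, frag_cols = rows // t_rows, cols // t_cols
    # Row-major linearization of the thread ID over the layout.
    t_r, t_c = divmod(thread_id, t_cols)
    return (t_r * frag_rows, t_c * frag_cols), (frag_rows, frag_cols)


def load_fragment(tile, thread_layout, thread_id):
    """Load only the fragment of `tile` owned by `thread_id`."""
    shape = (len(tile), len(tile[0]))
    (r0, c0), (fr, fc) = thread_offsets(shape, thread_layout, thread_id)
    return [row[c0:c0 + fc] for row in tile[r0:r0 + fr]]


if __name__ == "__main__":
    # An 8x16 tile distributed over 16 threads laid out as 1x16:
    # each thread ends up with one 8x1 column.
    tile = [[r * 16 + c for c in range(16)] for r in range(8)]
    print(thread_offsets((8, 16), (1, 16), 3))
    print(load_fragment(tile, (1, 16), 3))
```

In a model like this, the presence of explicit per-thread offsets on the load would itself signal that the IR is post-distribution, while a load of the full tile without offsets would be pre-distribution.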

https://github.com/llvm/llvm-project/pull/120566