[Mlir-commits] [mlir] [mlir][xegpu] SIMT distribution patterns for XeGPU CreateNdTdesc, LoadNd, StoreNd and Dpas Ops. (PR #135271)

Charitha Saumya llvmlistbot at llvm.org
Tue Apr 29 11:06:40 PDT 2025


charithaintc wrote:

> > Hi @fschlimb, this is Charitha from the IMEX team. We have the initial part of the XeGPU subgroup - SIMT distribution work ready for review. If you are interested and have the bandwidth, please have a look and give us feedback/approval. Thanks!
> 
> Thanks for getting in touch, interesting!
> 
> I added a few comments. This is not an area that I typically work in, so they are mostly on monkey level.
> 
> While distribution is different in this context than in the context of distributed memory, there are commonalities (similar to similarities in tiling and sharding). It seems to me that tiling/vector-distribution are special cases of general sharding/spmdization. Unification might be worth considering.
> 

Hi Frank, thanks very much for the review. I tried to address everything as best I could. Please take a look.

> Out of curiosity, in addition to the question about the forward propagation, I wonder how tensor shapes that do not evenly divide to lane-sizes would be treated.

Good question. For the operators handled in this PR, we expect perfect distribution. If the high-level work-group-level computation does not map evenly to lane sizes, it is the responsibility of the work-group-to-subgroup lowering or subsequent optimizations to ensure perfect distribution at the SIMT level. I think this will be done using masked gather/scatter type loads.

We also check this requirement during the lowering: if the vector shape is not distributable, the pass will report an error and fail.
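As a rough illustration (not the actual pass code; the names `distributed_shape` and the lane layouts are hypothetical), the "perfect distribution" requirement amounts to checking that each dimension of the vector shape divides evenly by the corresponding number of lanes:

```python
# Hypothetical sketch of the divisibility check described above.
# The function name and lane-layout representation are illustrative,
# not taken from the XeGPU lowering pass.

def distributed_shape(vector_shape, lane_layout):
    """Return the per-lane shape if vector_shape divides evenly
    across lane_layout; otherwise return None (not distributable)."""
    if len(vector_shape) != len(lane_layout):
        return None
    per_lane = []
    for dim, lanes in zip(vector_shape, lane_layout):
        if dim % lanes != 0:
            return None  # uneven division: the pass would report this and fail
        per_lane.append(dim // lanes)
    return per_lane

# A 16x16 vector distributed over a 1x16 lane layout: each lane holds 16x1.
print(distributed_shape([16, 16], [1, 16]))
# A 16x17 vector does not divide evenly over 16 lanes, so it is rejected.
print(distributed_shape([16, 17], [1, 16]))
```

In the actual lowering, shapes failing this kind of check are exactly the cases deferred to earlier work-group-to-subgroup passes or to masked gather/scatter lowering.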



https://github.com/llvm/llvm-project/pull/135271

