[Mlir-commits] [mlir] [mlir] [XeGPU] Add XeGPU workgroup to subgroup pass (PR #139477)

Fri May 16 14:05:25 PDT 2025

nbpatel wrote:

> A more general question about the design here.
> 
> If I read it correctly, the whole distribution logic is currently driven purely by `create_nd_tdesc` distribution and the other ops just follow. In general, all these ops (and tests) more or less expect to be accompanied by `create_nd_tdesc` because we don't support distributing a `tensor_desc` coming from, for example, function args. However, this falls apart for `dpas`, which operates on vectors, highlighting the limitation of this approach. The distribution right now does nothing if there is just an `xegpu.dpas` op on its own.
> 
> Since each op is self-contained thanks to the layout attr, I'd expect each op to be distributed individually, and by the end of the conversion the unrealized_casts will cancel out and/or the necessary layout conversions will be materialized. The latter is probably a todo for later.

Yes, you are right. The current design expects the IR to use the XeGPU ops together rather than any one op in isolation, which is why the other ops just follow the `create_nd_tdesc` distribution. These are the only cases we have come across so far. I agree that we might hit the cases you mentioned, so in a subsequent PR I will decouple the distribution logic from `create_nd_tdesc` for all the ops that operate on vectors. For ops like load/store/update/prefetch, I think we can safely assume there is a corresponding `create_nd_tdesc`. I addressed all of your other comments. Thanks.

https://github.com/llvm/llvm-project/pull/139477