[Mlir-commits] [mlir] [MLIR][XeGPU] Matrix load/store subgroup distribution (PR #165008)
Artem Kroviakov
llvmlistbot at llvm.org
Fri Oct 31 02:31:15 PDT 2025
================
@@ -191,11 +191,21 @@ IsValidMatrixOpParams(VectorType dataTy, MemDescType mdescTy,
   ArrayRef<int64_t> dataShape = dataTy.getShape();
   ArrayRef<int64_t> mdescShape = mdescTy.getShape();
-
+  if (subgroup_block_io && layout) {
+    auto laneData = layout.getEffectiveLaneDataAsInt();
+    if (!laneData.empty()) {
+      bool isLaneDataLinear =
+          std::all_of(laneData.begin(), std::prev(laneData.end()),
+                      [](int x) { return x == 1; });
+      if (!isLaneDataLinear)
+        return emitError()
+               << "With subgroup_block_io, lane data must be linear.";
+      if (isLaneDataLinear && laneData.back() != 1)
----------------
akroviakov wrote:
This is a `subgroup_block_io`-specific check, not a general layout verification.
> First of all, why isLaneDataLinear needed here?
From the xevm block op:
>
> ptr[ SubgroupLocalInvocationId ]
>
> and the second value is written to:
>
> ptr[ SubgroupLocalInvocationId + SubgroupMaxSize ]
It appears that if a lane accesses more than one element, those elements are strided by the subgroup size, i.e., the lane touches one element per distribution unit. Hence the check that the lane data is uniformly 1: it allows neither vnni-style lane data such as [2, 1] nor adjacent-element loads such as [1, 2].
This is aligned with the upstream lane data description:
> lane_data : Specifies the shape of the tensor fragment that each lane accesses. It defines a single, minimal distribution unit. Processing the entire tensor may require one or more distribution units per hardware instruction.
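To make the stride argument concrete, here is a minimal C++ sketch (not from the PR; the helper name and parameters are invented for illustration) of which offsets a single lane touches under the block-IO addressing quoted above:

```cpp
#include <cstdint>
#include <vector>

// Illustrative sketch: under subgroup block IO, lane `laneId` in a subgroup of
// size `sgSize` reads its i-th value from ptr[i * sgSize + laneId], so the
// elements owned by a single lane are strided by the subgroup size. That
// corresponds to lane data of all ones (one element per distribution unit);
// lane data such as [2, 1] or [1, 2] would require a lane to own adjacent
// elements, which this addressing cannot express.
std::vector<int64_t> laneElementOffsets(int64_t laneId, int64_t sgSize,
                                        int64_t elemsPerLane) {
  std::vector<int64_t> offsets;
  for (int64_t i = 0; i < elemsPerLane; ++i)
    offsets.push_back(i * sgSize + laneId);
  return offsets;
}

// Example: laneElementOffsets(/*laneId=*/3, /*sgSize=*/16, /*elemsPerLane=*/2)
// returns {3, 19}, i.e. ptr[SubgroupLocalInvocationId] and
// ptr[SubgroupLocalInvocationId + SubgroupMaxSize].
```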
Perhaps `isLaneDataLinear` is not the best name.
https://github.com/llvm/llvm-project/pull/165008