[Mlir-commits] [mlir] [MLIR][XeGPU] Matrix load/store subgroup distribution (PR #165008)

Artem Kroviakov llvmlistbot at llvm.org
Fri Oct 31 02:31:15 PDT 2025


================
@@ -191,11 +191,21 @@ IsValidMatrixOpParams(VectorType dataTy, MemDescType mdescTy,
 
   ArrayRef<int64_t> dataShape = dataTy.getShape();
   ArrayRef<int64_t> mdescShape = mdescTy.getShape();
-
+  if (subgroup_block_io && layout) {
+    auto laneData = layout.getEffectiveLaneDataAsInt();
+    if (!laneData.empty()) {
+      bool isLaneDataLinear =
+          std::all_of(laneData.begin(), std::prev(laneData.end()),
+                      [](int x) { return x == 1; });
+      if (!isLaneDataLinear)
+        return emitError()
+               << "With subgroup_block_io, lane data must be linear.";
+      if (isLaneDataLinear && laneData.back() != 1)
----------------
akroviakov wrote:

This is a `subgroup_block_io`-specific check, not a general layout verification.

> First of all, why isLaneDataLinear needed here?

From the xevm block op:

>     ptr[ SubgroupLocalInvocationId ]
>
>   and the second value is written to:
>
>     ptr[ SubgroupLocalInvocationId + SubgroupMaxSize ]
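
To make the access pattern concrete, here is a minimal standalone sketch of the per-lane addresses this implies. The subgroup size of 16 and the two elements per lane are assumptions for the example, not values from the PR:

```cpp
#include <cstdio>

int main() {
  const int subgroupSize = 16; // SubgroupMaxSize (assumed for illustration)
  const int elemsPerLane = 2;  // number of values per lane (assumed)
  for (int lane = 0; lane < subgroupSize; ++lane) {
    // Value v of a given lane lives at ptr[lane + v * subgroupSize]:
    // consecutive values of one lane are strided by the subgroup size.
    for (int v = 0; v < elemsPerLane; ++v)
      std::printf("lane %2d, value %d -> ptr[%d]\n", lane, v,
                  lane + v * subgroupSize);
  }
  return 0;
}
```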

It appears that if a lane accesses more than one element, those elements are strided by the subgroup size, i.e. the lane accesses one element per distribution unit. Hence the check that every lane data entry is 1: it allows neither vnni-style [2, 1] lane data nor loading adjacent elements as [1, 2].
This is aligned with the upstream lane data description:

> lane_data : Specifies the shape of the tensor fragment that each lane accesses. It defines a single, minimal distribution unit. Processing the entire tensor may require one or more distribution units per hardware instruction.

 Perhaps `isLaneDataLinear` is not the best name. 
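
For reference, a minimal standalone sketch of the constraint the check is meant to enforce, i.e. every lane data entry equal to 1 (`isUnitLaneData` is a hypothetical name used for illustration, not the PR's code):

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <vector>

// Illustrative helper: with subgroup_block_io, every lane_data entry must be
// 1, so each lane owns single, subgroup-strided elements rather than
// vnni-style [2, 1] or adjacent [1, 2] chunks.
static bool isUnitLaneData(const std::vector<int64_t> &laneData) {
  return std::all_of(laneData.begin(), laneData.end(),
                     [](int64_t d) { return d == 1; });
}

int main() {
  std::printf("[1, 1] -> %d\n", isUnitLaneData({1, 1})); // accepted
  std::printf("[2, 1] -> %d\n", isUnitLaneData({2, 1})); // rejected
  std::printf("[1, 2] -> %d\n", isUnitLaneData({1, 2})); // rejected
  return 0;
}
```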


https://github.com/llvm/llvm-project/pull/165008

