[Mlir-commits] [mlir] [MLIR][XeGPU] Introduce `xegpu::uArch` usage in target-sensitive passes (PR #163801)

Fri Oct 17 01:40:06 PDT 2025

================
@@ -557,23 +599,54 @@ void LayoutInfoPropagation::visitDpasOp(
     ArrayRef<const LayoutInfoLattice *> results) {
   VectorType aTy = dpas.getLhsType();
   VectorType bTy = dpas.getRhsType();
-  propagateIfChanged(
-      operands[0], operands[0]->meet(getSIMTLayoutInfoForDPASOperand(aTy, 0)));
-  propagateIfChanged(
-      operands[1], operands[1]->meet(getSIMTLayoutInfoForDPASOperand(bTy, 1)));
+
+  auto uArch = getUArch(getChipStr(dpas).value_or(""));
+  const int subgroupSize = uArch->getSubgroupSize();
+  auto uArchInstruction =
+      std::static_pointer_cast<xegpu::uArch::DPASInstruction>(
+          uArch->getInstruction(xegpu::uArch::InstructionKind::DPAS));
+  const int maxALen =
+      uArchInstruction->getSupportedM(aTy.getElementType()).back();
+  const int maxBLen =
+      uArchInstruction->getSupportedK(bTy.getElementType()).back();
+  SmallVector<int> instDataA = {maxALen, subgroupSize};
----------------
akroviakov wrote:

Will add a check. 

For the future, should this happen as verification of the user-supplied shape (effectively `sg_data`), or as part of the `max*Len` selection? 
A user-supplied shape can have a dimension of 12 for an instruction that supports sizes `[1,2,4,8]`. Using the maximum size 8 fails because 12 is not a multiple of 8, but using 4 succeeds (4+4+4). We could also do 8+4, but I suppose the `inst_data` step would then need to be fused with blocking and not surface in the pre-blocking IR. Were there any plans in this direction, or will we only work with multiples of the max size for the foreseeable future?
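
To make the two options concrete, here is a rough standalone sketch (plain C++, no MLIR dependencies). Both helper names, `largestDividingSize` and `greedyDecompose`, are hypothetical and not part of the patch or of `xegpu::uArch`; the supported-size list is assumed to be sorted in ascending order, as in the `[1,2,4,8]` example.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical helper (not in the patch): pick the largest supported size
// that evenly divides `dim`, so the dimension splits into uniform tiles.
// `supported` is assumed to be sorted ascending, e.g. {1, 2, 4, 8}.
std::optional<int> largestDividingSize(int dim,
                                       const std::vector<int> &supported) {
  assert(dim > 0 && "dimension must be positive");
  for (auto it = supported.rbegin(); it != supported.rend(); ++it)
    if (*it > 0 && dim % *it == 0)
      return *it;
  return std::nullopt; // no supported size divides `dim`
}

// Hypothetical helper (not in the patch): greedy mixed-size decomposition,
// taking the largest size that still fits at each step. Returns an empty
// vector if the greedy walk leaves a remainder.
std::vector<int> greedyDecompose(int dim, const std::vector<int> &supported) {
  assert(dim > 0 && "dimension must be positive");
  std::vector<int> tiles;
  int remaining = dim;
  for (auto it = supported.rbegin(); it != supported.rend(); ++it) {
    if (*it <= 0)
      continue;
    while (remaining >= *it) {
      tiles.push_back(*it);
      remaining -= *it;
    }
  }
  if (remaining != 0)
    tiles.clear();
  return tiles;
}
```

For a dimension of 12 and sizes `[1,2,4,8]`, `largestDividingSize` picks 4 (three uniform tiles), while `greedyDecompose` yields 8+4. The greedy walk is only a sketch: for a size list that does not contain 1 it can miss a valid decomposition, and the non-uniform result is exactly the case that would probably need the `inst_data` step fused with blocking.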

https://github.com/llvm/llvm-project/pull/163801