[Mlir-commits] [mlir] [mlir][vector-to-gpu]: Extend MMA Lowerings (PR #176785)

Mon Feb 2 03:35:41 PST 2026

================
@@ -130,7 +147,13 @@ static std::optional<int64_t> getStaticallyKnownRowStride(ShapedType type) {
   if (failed(memrefType.getStridesAndOffset(strides, offset)) ||
       strides.back() != 1)
     return std::nullopt;
-  int64_t stride = strides[strides.size() - 2];
+
+  int stridePostion = strides.size() - 2;
+  if (!permutationMap.isPermutation()) {
+    if (auto outerResult = dyn_cast<AffineDimExpr>(permutationMap.getResult(0)))
----------------
FranklandJack wrote:

@Hsiangkai 

On your second question:

> When permutationMap[0] is not affine dimension, you skip the updating of stridePostion instead of returning std::nullopt. Can you add a test case to show it is still correct even when permutationMap[0] is not affine dimension.

After experimenting locally, it seems `vector.transfer_read/write` require the permutation map to be a "projected permutation". I tried `(d0, d1) -> (d0 mod 4, d1)` and `(d0, d1) -> (d0 + 4, d1)`, and both gave the error message:
```
error: 'vector.transfer_read' op requires a projected permutation_map (at most one dim or the zero constant can appear in each result)
```

I think this means the only way to get a result that is not an affine dim expr is for the zero constant to appear in the map, e.g. `(d0, d1) -> (0, d1)`. That case is still valid: `stridePostion` keeps its default value, so the existing logic still returns the row stride. This is already covered by an existing lit test [here](https://github.com/llvm/llvm-project/pull/176785/changes#diff-1eda326675190db858d6e06fae7c3d9ac470ab99c522ade8a67d9a715ff3a25cR166).
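
To make the case analysis concrete, here is a rough standalone sketch of the selection logic as I understand it. It is not the code in the PR: `MapResult` and `rowStridePosition` are hypothetical names, the toy type stands in for MLIR's `AffineExpr`, and I am assuming that when the outer result is a dimension, the stride position becomes that dimension's position.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Toy stand-in for one result of a projected permutation map. Each result is
// either a single dimension dN or the constant 0. This is NOT mlir::AffineExpr.
struct MapResult {
  std::optional<unsigned> dim; // std::nullopt models the zero constant
};

// Hypothetical helper mirroring the shape of the hunk above.
// Assumes rank >= 2. The default is the second-innermost stride. Only when the
// map is not a full permutation AND its outer result is a dimension is the
// position replaced by that dimension's index (assumed behaviour).
inline std::size_t rowStridePosition(std::size_t rank, bool isFullPermutation,
                                     const std::vector<MapResult> &results) {
  std::size_t pos = rank - 2;
  if (!isFullPermutation && !results.empty() && results.front().dim)
    pos = *results.front().dim;
  return pos; // zero-constant outer result: default position is kept
}
```

With this model, `(d0, d1) -> (0, d1)` on a rank-2 memref keeps position 0, and `(d0, d1, d2) -> (d0, d2)` on a rank-3 memref selects position 0 instead of the default 1.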

https://github.com/llvm/llvm-project/pull/176785