[Mlir-commits] [mlir] [mlir][vector] Add special lowering for 2D transpose on 1D broadcast (PR #150562)

Wed Aug 6 11:01:27 PDT 2025

================
@@ -365,3 +365,66 @@ module attributes {transform.with_named_sequence} {
     transform.yield
   }
 }
+
+// -----
+
+// CHECK-LABEL:   func.func @transpose_of_broadcast(
+// CHECK-SAME:      %[[ARG0:.*]]: vector<2xf32>) -> vector<2x32xf32> {
+// CHECK:           %[[VAL_0:.*]] = arith.constant dense<0.000000e+00> : vector<2x32xf32>
+// CHECK:           %[[VAL_1:.*]] = vector.shuffle %[[ARG0]], %[[ARG0]] [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] : vector<2xf32>, vector<2xf32>
+// CHECK:           %[[VAL_2:.*]] = vector.insert %[[VAL_1]], %[[VAL_0]] [0] : vector<32xf32> into vector<2x32xf32>
+// CHECK:           %[[VAL_3:.*]] = vector.shuffle %[[ARG0]], %[[ARG0]] [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] : vector<2xf32>, vector<2xf32>
+// CHECK:           %[[VAL_4:.*]] = vector.insert %[[VAL_3]], %[[VAL_2]] [1] : vector<32xf32> into vector<2x32xf32>
+// CHECK:           return %[[VAL_4]] : vector<2x32xf32>
+// CHECK:         }
+func.func @transpose_of_broadcast(%arg0 : vector<2xf32>) -> vector<2x32xf32> {
----------------
mshockwave wrote:

> Is the general goodness here to move the broadcast as late as possible, so that as little IR as possible uses the "big" tensor?

The original motivation for putting this in the transpose lowering rather than in a canonicalization pattern was simply that `vector<2xf32>` is not directly broadcastable to `vector<2x32xf32>`, and I thought lowering to shufflevector was the only way; at that time I didn't think of using shape_cast. I now think a better way, which I'm working on right now, is to teach one of the canonicalization patterns you wrote earlier this year, `FoldTransposeBroadcast`, to use shape_cast.
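
To illustrate the shape_cast idea, here is a minimal sketch. It assumes the input is a `vector.broadcast` to `vector<32x2xf32>` followed by a `[1, 0]` transpose; the test body is cut off in the hunk above, so these ops and the function names `@before` and `@after` are assumptions for illustration. It is not the PR's code and not necessarily what `FoldTransposeBroadcast` would emit.

```mlir
// Assumed input pattern: a 1D vector broadcast to 2D, then transposed.
func.func @before(%arg0 : vector<2xf32>) -> vector<2x32xf32> {
  %b = vector.broadcast %arg0 : vector<2xf32> to vector<32x2xf32>
  %t = vector.transpose %b, [1, 0] : vector<32x2xf32> to vector<2x32xf32>
  return %t : vector<2x32xf32>
}

// Possible rewrite: vector<2xf32> cannot be broadcast to vector<2x32xf32>
// directly (trailing dims 2 vs. 32 do not match), but vector<2x1xf32> can,
// since the unit dim is stretched to 32.
func.func @after(%arg0 : vector<2xf32>) -> vector<2x32xf32> {
  %c = vector.shape_cast %arg0 : vector<2xf32> to vector<2x1xf32>
  %b = vector.broadcast %c : vector<2x1xf32> to vector<2x32xf32>
  return %b : vector<2x32xf32>
}
```

In both functions, element `[j][i]` of the result is `%arg0[j]`, so the two forms should agree, and the second avoids the transpose entirely.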

https://github.com/llvm/llvm-project/pull/150562