[Mlir-commits] [mlir] [mlir][xegpu] Convert Vector contraction to XeGPU (PR #122115)

Adam Siemieniuk llvmlistbot at llvm.org
Wed Mar 12 10:44:28 PDT 2025


================
@@ -0,0 +1,158 @@
+// RUN: mlir-opt %s -convert-vector-to-xegpu -split-input-file | FileCheck %s
+
+#map = affine_map<(d0, d1, d2) -> (d0, d2)>
+#map1 = affine_map<(d0, d1, d2) -> (d2, d1)>
+#map2 = affine_map<(d0, d1, d2) -> (d0, d1)>
+func.func @dpas_gemm_f16(%lhs: vector<8x16xf16>, %rhs: vector<16x16xf16>,
+    %acc: vector<8x16xf32>) -> vector<8x16xf32> {
+  %3 = vector.contract
+    {indexing_maps = [#map, #map1, #map2],
+    iterator_types = ["parallel", "parallel", "reduction"],
+    kind = #vector.kind<add>} %lhs, %rhs, %acc
+    : vector<8x16xf16>, vector<16x16xf16> into vector<8x16xf32>
+  return %3 : vector<8x16xf32>
+}
+
+// CHECK-LABEL: @dpas_gemm_f16(
+// CHECK-SAME:  %[[LHS:.+]]: vector<8x16xf16>,
+// CHECK-SAME:  %[[RHS:.+]]: vector<16x16xf16>,
+// CHECK-SAME:  %[[ACC:.+]]: vector<8x16xf32>
+// CHECK:       %[[DPAS:.+]] = xegpu.dpas
+// CHECK-SAME:    %[[LHS]], %[[RHS]], %[[ACC]]
+// CHECK-SAME:    {{.*}}-> vector<8x16xf32>
+// CHECK:       return %[[DPAS]]
+
+// -----
+
+#map = affine_map<(d0, d1, d2) -> (d0, d2)>
+#map1 = affine_map<(d0, d1, d2) -> (d2, d1)>
+#map2 = affine_map<(d0, d1, d2) -> (d0, d1)>
+func.func @dpas_gemm_i8(%lhs: vector<8x32xi8>, %rhs: vector<32x16xi8>,
+    %acc: vector<8x16xi32>) -> vector<8x16xi32> {
+  %3 = vector.contract
+    {indexing_maps = [#map, #map1, #map2],
+    iterator_types = ["parallel", "parallel", "reduction"],
+    kind = #vector.kind<add>} %lhs, %rhs, %acc
+    : vector<8x32xi8>, vector<32x16xi8> into vector<8x16xi32>
+  return %3 : vector<8x16xi32>
+}
+
+// CHECK-LABEL: @dpas_gemm_i8(
+// CHECK-SAME:  %[[LHS:.+]]: vector<8x32xi8>,
+// CHECK-SAME:  %[[RHS:.+]]: vector<32x16xi8>,
+// CHECK-SAME:  %[[ACC:.+]]: vector<8x16xi32>
+// CHECK:       %[[DPAS:.+]] = xegpu.dpas
+// CHECK-SAME:    %[[LHS]], %[[RHS]], %[[ACC]]
+// CHECK-SAME:    {{.*}}-> vector<8x16xi32>
+// CHECK:       return %[[DPAS]]
+
+// -----
+
+// For simplicity, only plain data layouts are currently supported.
+// VNNI packing is applied later as a separate lowering step.
+
+#map = affine_map<(d0, d1, d2, d3) -> (d0, d2, d3)>
+#map1 = affine_map<(d0, d1, d2, d3) -> (d2, d1, d3)>
+#map2 = affine_map<(d0, d1, d2, d3) -> (d0, d1)>
+func.func @negative_vnni_packed(%lhs: vector<8x8x2xf16>, %rhs: vector<8x16x2xf16>,
----------------
adam-smnk wrote:

I restricted the lowering to accept only plain 2D layouts. As you mentioned, it's better to leave VNNI packing to the later XeGPU ops lowering.

This test case checks that the VNNI layout is rejected. More broadly, it verifies that a higher-dimensional contraction is not matched, and it highlights that the VNNI case is unsupported, which one might otherwise expect to work given that DPAS accepts 3D input.
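For context, the hunk above is truncated after the function signature of the negative test. The remainder presumably pairs the 3D VNNI-shaped contraction with a `CHECK-NOT` on `xegpu.dpas`, along these lines (a sketch only; the iterator types, operand shapes, and check lines below are inferred from the shown affine maps and signature, not quoted from the patch):

```mlir
#map = affine_map<(d0, d1, d2, d3) -> (d0, d2, d3)>
#map1 = affine_map<(d0, d1, d2, d3) -> (d2, d1, d3)>
#map2 = affine_map<(d0, d1, d2, d3) -> (d0, d1)>
func.func @negative_vnni_packed(%lhs: vector<8x8x2xf16>, %rhs: vector<8x16x2xf16>,
    %acc: vector<8x16xf32>) -> vector<8x16xf32> {
  // 4D contraction over VNNI-packed operands: the extra reduction
  // dimension (d3) means this does not match the plain 2D pattern,
  // so the pass should leave the op untouched.
  %0 = vector.contract
    {indexing_maps = [#map, #map1, #map2],
    iterator_types = ["parallel", "parallel", "reduction", "reduction"],
    kind = #vector.kind<add>} %lhs, %rhs, %acc
    : vector<8x8x2xf16>, vector<8x16x2xf16> into vector<8x16xf32>
  return %0 : vector<8x16xf32>
}

// CHECK-LABEL: @negative_vnni_packed(
// CHECK-NOT:   xegpu.dpas
```

The `CHECK-NOT` pattern is the usual way such negative tests assert that the conversion did not fire.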

https://github.com/llvm/llvm-project/pull/122115

