[Mlir-commits] [mlir] [mlir][vector] Add support for multi-dim reduction vector distribution (PR #71193)
Lei Zhang
llvmlistbot at llvm.org
Sat Nov 4 11:11:44 PDT 2023
================
@@ -494,6 +494,62 @@ func.func @warp_scf_for_multiple_yield(%arg0: index, %arg1: memref<?xf32>, %arg2
// -----
+// CHECK-PROP-LABEL: func @warp_scf_for_multi_reduce(
+// CHECK-PROP-NOT: vector.warp_execute_on_lane_0
+// CHECK-PROP: scf.for {{.*}} -> (vector<1x4xf32>) {
+// CHECK-PROP: scf.for {{.*}} -> (vector<1x4xf32>) {
+// CHECK-PROP: vector.transfer_read {{.*}} : memref<2x32x40x384xf32>, vector<1x4xf32>
+// CHECK-PROP: }
+// CHECK-PROP: }
+// CHECK-PROP: vector.reduction <add>
+// CHECK-PROP: gpu.shuffle
+#map = affine_map<(d0, d1) -> (0, 0)>
+func.func @warp_scf_for_multi_reduce(%arg0: memref<2x32x40x384xf32>, %arg1: memref<2x32x40x384xf16>, %arg2: memref<2x32xf32>, %arg3: memref<2x32x40x384xf16>) {
+ %cst = arith.constant dense<1.536000e+04> : vector<8x128xf32>
+ %cst_0 = arith.constant dense<0.000000e+00> : vector<8x128xf32>
+ %cst_1 = arith.constant 9.99999997E-7 : f32
+ %c128 = arith.constant 128 : index
+ %c8 = arith.constant 8 : index
+ %c0 = arith.constant 0 : index
+ %c40 = arith.constant 40 : index
+ %c384 = arith.constant 384 : index
+ %cst_2 = arith.constant 0.000000e+00 : f16
+ %cst_3 = arith.constant 0.000000e+00 : f32
+ %0 = gpu.thread_id x
+ %1 = arith.truncf %cst_1 : f32 to f16
+ vector.warp_execute_on_lane_0(%0)[256] {
----------------
antiagainst wrote:
This example serves as an integrated test (in the sense that it tests multiple reductions plus scf.for being moved out of the warp op). The shape here is a nice match. Can we also add another small test that only checks vector reduction, but with a more complicated shape? Maybe warp size = 256 and vector<128x4x64> -> vector<32x1x16>, or something like that. Basically, it would better check the affine map that controls the distribution order.
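When picking shapes for such a test, it may help to work out how many lanes a given sequential-to-distributed shape pair implies. The following is only a rough sketch of that arithmetic, not the MLIR verifier itself: the helper names `distribution_ratios` and `lanes_covered` are hypothetical, and the sketch assumes that each distributed dimension evenly divides the corresponding sequential dimension and that the per-dimension ratios multiply to the number of lanes.

```python
from math import prod


def distribution_ratios(sequential_shape, distributed_shape):
    """Per-dimension ratio between the sequential and distributed shapes.

    Hypothetical helper: assumes each distributed dimension must evenly
    divide the matching sequential dimension.
    """
    if len(sequential_shape) != len(distributed_shape):
        raise ValueError("rank mismatch between the two shapes")
    ratios = []
    for seq, dist in zip(sequential_shape, distributed_shape):
        if seq % dist != 0:
            raise ValueError(f"{dist} does not evenly divide {seq}")
        ratios.append(seq // dist)
    return ratios


def lanes_covered(sequential_shape, distributed_shape):
    """Product of the per-dimension ratios (assumed to be the lane count)."""
    return prod(distribution_ratios(sequential_shape, distributed_shape))


if __name__ == "__main__":
    # Shapes floated in the review comment.
    print(distribution_ratios((128, 4, 64), (32, 1, 16)), lanes_covered((128, 4, 64), (32, 1, 16)))
    # Shapes from the test in this hunk, assuming the 8x128 vectors
    # are the ones distributed to 1x4.
    print(distribution_ratios((8, 128), (1, 4)), lanes_covered((8, 128), (1, 4)))
```

Under these assumptions, vector<128x4x64> -> vector<32x1x16> gives ratios of 4 in every dimension, i.e. 64 lanes, while the vector<8x128xf32> -> vector<1x4xf32> pairing in the existing test gives 8 * 32 = 256. If the ratios do need to multiply to the warp size, the suggested shapes would need adjusting to reach 256, which fits the "or something like that" in the comment.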
https://github.com/llvm/llvm-project/pull/71193