[Mlir-commits] [mlir] [MLIR] Folding unpack and pack sequence in data layout propagation from padded domain (PR #138332)
Han-Chung Wang
llvmlistbot at llvm.org
Mon May 5 14:24:16 PDT 2025
================
@@ -1421,3 +1395,48 @@ func.func @no_push_down_unpack_through_non_divisible_expand(%5: tensor<384x32x8x
// CHECK: %[[UNPACK:.+]] = linalg.unpack %[[ARG0]]
// CHECK: %[[EXPANDED:.+]] = tensor.expand_shape %[[UNPACK]] {{\[}}[0, 1], [2]] output_shape [256, 12, 256] : tensor<3072x256xf32> into tensor<256x12x256xf32>
// CHECK: return %[[EXPANDED]] : tensor<256x12x256xf32>
+
+// -----
+
+func.func @push_unpack_in_padded_domain_foldable(%arg0: tensor<8x8x4x8xf32>, %dest: tensor<?x64xf32>, %arg1: tensor<?x64xbf16>) -> tensor<?x64xbf16> {
+ %unpack = linalg.unpack %arg0 inner_dims_pos = [0, 1] inner_tiles = [4, 8] into %dest : tensor<8x8x4x8xf32> -> tensor<?x64xf32>
+ %0 = linalg.generic {indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0, d1)>], iterator_types = ["parallel", "parallel"]} ins(%unpack : tensor<?x64xf32>) outs(%arg1 : tensor<?x64xbf16>) {
+ ^bb0(%in: f32, %out: bf16):
+ %1 = arith.truncf %in : f32 to bf16
+ linalg.yield %1 : bf16
+ } -> tensor<?x64xbf16>
+ return %0 : tensor<?x64xbf16>
+}
+
+// CHECK-LABEL: func.func @push_unpack_in_padded_domain_foldable
+// CHECK-SAME: %[[ARG0:[a-zA-Z0-9]+]]
+// CHECK: %[[EMPTY:.+]] = tensor.empty
+// CHECK: %[[GENERIC:.+]] = linalg.generic
+// CHECK-SAME: ins(%[[ARG0]] : tensor<8x8x4x8xf32>)
+// CHECK-SAME: outs(%[[EMPTY]] : tensor<?x8x4x8xbf16>)
+// CHECK: %[[UNPACK:.+]] = linalg.unpack %[[GENERIC]]
----------------
hanhanW wrote:
Should we add a CHECK-SAME for the destination, to make sure that one of the generic's outs is reused by it? E.g.,
```
CHECK-SAME: into %[[ARG2]]
```
(Same comment applies to the test below.)
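For concreteness, a minimal sketch of how that could look, assuming the remaining function arguments are also captured (the current checks only bind %[[ARG0]]) and that the third function argument is the tensor the unpack writes into; the %[[ARG1]]/%[[ARG2]] names are hypothetical here:
```
// CHECK-LABEL: func.func @push_unpack_in_padded_domain_foldable
// CHECK-SAME:    %[[ARG0:[a-zA-Z0-9]+]]
// CHECK-SAME:    %[[ARG1:[a-zA-Z0-9]+]]
// CHECK-SAME:    %[[ARG2:[a-zA-Z0-9]+]]
// ...
// CHECK:         %[[UNPACK:.+]] = linalg.unpack %[[GENERIC]]
// CHECK-SAME:      into %[[ARG2]]
```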
https://github.com/llvm/llvm-project/pull/138332