[all-commits] [llvm/llvm-project] 7ebfbf: [mlir][tensor] Update `GeneralizeOuterUnitDimsPack...
Andrzej Warzyński via All-commits
all-commits at lists.llvm.org
Tue Nov 12 11:10:47 PST 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 7ebfbf9c87941315d7c9ca84d1b22acf2a5bd14d
https://github.com/llvm/llvm-project/commit/7ebfbf9c87941315d7c9ca84d1b22acf2a5bd14d
Author: Andrzej Warzyński <andrzej.warzynski at arm.com>
Date: 2024-11-12 (Tue, 12 Nov 2024)
Changed paths:
M mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h
M mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp
M mlir/test/Dialect/Linalg/generalize-tensor-pack-tile.mlir
M mlir/test/Dialect/Linalg/generalize-tensor-pack.mlir
Log Message:
-----------
[mlir][tensor] Update `GeneralizeOuterUnitDimsPackOpPattern` (#115312)
Avoid generating spurious tensor.extract_slice, follow-on for #114315.
This is best to demonstrate with an example. Here's input for
`GeneralizeOuterUnitDimsPackOpPattern`:
```mlir
%pack = tensor.pack %input
padding_value(%pad : f32)
inner_dims_pos = [1, 0]
inner_tiles = [2, %tile_dim_1]
into %output : tensor<5x1xf32> -> tensor<1x1x2x?xf32>
```
Output _before_:
```mlir
%padded = tensor.pad %arg0 low[0, 0] high[%0, 1] {
^bb0(%arg4: index, %arg5: index):
tensor.yield %arg2 : f32
} : tensor<5x1xf32> to tensor<?x2xf32>
// NOTE: skipped in the output _after_
%extracted_slice = tensor.extract_slice
%padded[0, 0] [%arg3, 2] [1, 1] :
tensor<?x2xf32> to tensor<?x2xf32>
%empty = tensor.empty(%arg3) : tensor<2x?xf32>
%transposed = linalg.transpose
ins(%extracted_slice : tensor<?x2xf32>)
outs(%empty : tensor<2x?xf32>)
permutation = [1, 0]
%inserted_slice = tensor.insert_slice %transposed=
into %arg1[0, 0, 0, 0] [1, 1, 2, %arg3] [1, 1, 1, 1] :
tensor<2x?xf32> into tensor<1x1x2x?xf32>
```
Output _after_:
```mlir
%padded = tensor.pad %arg0 low[0, 0] high[%0, 1] {
^bb0(%arg4: index, %arg5: index):
tensor.yield %arg2 : f32
} : tensor<5x1xf32> to tensor<?x2xf32>
%empty = tensor.empty(%arg3) : tensor<2x?xf32>
%transposed = linalg.transpose
ins(%padded : tensor<?x2xf32>)
outs(%empty : tensor<2x?xf32>) permutation = [1, 0]
%inserted_slice = tensor.insert_slice %transposed
into %arg1[0, 0, 0, 0] [1, 1, 2, %arg3] [1, 1, 1, 1] :
tensor<2x?xf32> into tensor<1x1x2x?xf32>
```
This PR also adds a check to verify that only the last N trailing
dimensions are tiled (for some value of N). Based on the PR
discussion, this restriction seems reasonable - especially as there
are no in-tree tests requiring otherwise. For now, it also simplifies
the computation of permutations for linalg.transpose. This
restriction can be relaxed in the future if needed.
To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications
More information about the All-commits
mailing list