[all-commits] [llvm/llvm-project] 7ebfbf: [mlir][tensor] Update `GeneralizeOuterUnitDimsPack...

Tue Nov 12 11:10:47 PST 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 7ebfbf9c87941315d7c9ca84d1b22acf2a5bd14d
      https://github.com/llvm/llvm-project/commit/7ebfbf9c87941315d7c9ca84d1b22acf2a5bd14d
  Author: Andrzej Warzyński <andrzej.warzynski at arm.com>
  Date:   2024-11-12 (Tue, 12 Nov 2024)

  Changed paths:
    M mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h
    M mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp
    M mlir/test/Dialect/Linalg/generalize-tensor-pack-tile.mlir
    M mlir/test/Dialect/Linalg/generalize-tensor-pack.mlir

  Log Message:
  -----------
  [mlir][tensor] Update `GeneralizeOuterUnitDimsPackOpPattern` (#115312)

Avoid generating spurious tensor.extract_slice, follow-on for #114315.

This is best to demonstrate with an example. Here's input for
`GeneralizeOuterUnitDimsPackOpPattern`:
```mlir
%pack = tensor.pack %input
  padding_value(%pad : f32)
  inner_dims_pos = [1, 0]
  inner_tiles = [2, %tile_dim_1]
  into %output : tensor<5x1xf32> -> tensor<1x1x2x?xf32>
```

Output _before_:
```mlir
%padded = tensor.pad %arg0 low[0, 0] high[%0, 1] {
^bb0(%arg4: index, %arg5: index):
  tensor.yield %arg2 : f32
} : tensor<5x1xf32> to tensor<?x2xf32>
// NOTE: skipped in the output _after_
%extracted_slice = tensor.extract_slice
  %padded[0, 0] [%arg3, 2] [1, 1] :
  tensor<?x2xf32> to tensor<?x2xf32>
%empty = tensor.empty(%arg3) : tensor<2x?xf32>
%transposed = linalg.transpose
  ins(%extracted_slice : tensor<?x2xf32>)
  outs(%empty : tensor<2x?xf32>)
  permutation = [1, 0]
%inserted_slice = tensor.insert_slice %transposed=
  into %arg1[0, 0, 0, 0] [1, 1, 2, %arg3] [1, 1, 1, 1] :
  tensor<2x?xf32> into tensor<1x1x2x?xf32>
```

Output _after_:
```mlir
%padded = tensor.pad %arg0 low[0, 0] high[%0, 1] {
^bb0(%arg4: index, %arg5: index):
  tensor.yield %arg2 : f32
} : tensor<5x1xf32> to tensor<?x2xf32>
%empty = tensor.empty(%arg3) : tensor<2x?xf32>
%transposed = linalg.transpose
  ins(%padded : tensor<?x2xf32>)
  outs(%empty : tensor<2x?xf32>) permutation = [1, 0]
%inserted_slice = tensor.insert_slice %transposed
  into %arg1[0, 0, 0, 0] [1, 1, 2, %arg3] [1, 1, 1, 1] :
  tensor<2x?xf32> into tensor<1x1x2x?xf32>
```

This PR also adds a check to verify that only the last N trailing
dimensions are tiled (for some value of N). Based on the PR
discussion, this restriction seems reasonable - especially as there
are no in-tree tests requiring otherwise. For now, it also simplifies
the computation of permutations for linalg.transpose. This
restriction can be relaxed in the future if needed.

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications