[all-commits] [llvm/llvm-project] 46bd65: [mlir][LinAlg] Vectorize reverse-like ops using ve...

Wed Feb 28 09:45:21 PST 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 46bd65a0509fefdf50d44f63640a7b8bd8ad0d14
      https://github.com/llvm/llvm-project/commit/46bd65a0509fefdf50d44f63640a7b8bd8ad0d14
  Author: Han-Chung Wang <hanhan0912 at gmail.com>
  Date:   2024-02-28 (Wed, 28 Feb 2024)

  Changed paths:
    M mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
    M mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir

  Log Message:
  -----------
  [mlir][LinAlg] Vectorize reverse-like ops using vector.gather ops. (#83205)

The reverse op is treated as a VectorMemoryAccessKind::Contiguous load.
It is contiguous slice, but we'll need to compute indices differently
and apply a reverse at vector level. It takes non-trivial efforts for
the approach. The revision flips the case to use vector.gather.
Otherwise there are functionality issues. E.g., the below example loaded
`2, 3, 4` (which is a bug), but what we want is `2, 1, 0`.

Before vectorization:

```mlir
func.func @vectorize_reverse_like_tensor_extract(%arg0: tensor<1x2x3xf32>, %arg1: tensor<1x1x3xf32>, %arg2: index) -> tensor<1x1x3xf32> {
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index
  %c2 = arith.constant 2 : index
  %0 = linalg.generic {indexing_maps = [#map], iterator_types = ["parallel", "parallel", "parallel"]} outs(%arg1 : tensor<1x1x3xf32>) {
  ^bb0(%out: f32):
    %1 = linalg.index 1 : index
    %2 = linalg.index 0 : index
    %3 = affine.apply #map1(%1, %2, %arg2)
    %4 = linalg.index 2 : index
    %5 = arith.subi %c2, %4 : index
    %extracted = tensor.extract %arg0[%c0, %3, %5] : tensor<1x2x3xf32>
    linalg.yield %extracted : f32
  } -> tensor<1x1x3xf32>
  return %0 : tensor<1x1x3xf32>
}
```

Partial IR after vectorization:

```
  %5 = vector.constant_mask [1, 1, 3] : vector<1x1x4xi1>
  %6 = vector.broadcast %arg0 : index to vector<1x1x4xindex>
  %7 = vector.shape_cast %6 : vector<1x1x4xindex> to vector<4xindex>
  %8 = vector.extractelement %7[%c0_i32 : i32] : vector<4xindex>
  %9 = vector.transfer_read %3[%c0, %8, %c2], %cst, %5 {in_bounds = [true, true, true]} : tensor<1x2x3xf32>, vector<1x1x4xf32>
```

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications