[all-commits] [llvm/llvm-project] 0267d2: [mlir][linalg] Enable scalable vectorization of li...

Thu Jul 17 05:04:58 PDT 2025

  Branch: refs/heads/users/banach-space/linalg/unpack_vec_update
  Home:   https://github.com/llvm/llvm-project
  Commit: 0267d2a8519b481c88beac534e44df13ed17799d
      https://github.com/llvm/llvm-project/commit/0267d2a8519b481c88beac534e44df13ed17799d
  Author: Andrzej Warzynski <andrzej.warzynski at arm.com>
  Date:   2025-07-17 (Thu, 17 Jul 2025)

  Changed paths:
    M mlir/include/mlir/Dialect/Vector/Utils/VectorUtils.h
    M mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
    M mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
    M mlir/test/Dialect/Linalg/vectorization/linalg-ops.mlir

  Log Message:
  -----------
  [mlir][linalg] Enable scalable vectorization of linalg.unpack (WIP)

This patch updates `vectorizeAsTensorUnpackOp` to support scalable
vectorization by requiring user-specified vector sizes for both the
_read_ and _write_ operations involved in `linalg.unpack`. Detailed
rationale and an example are provided below.

Conceptually, `linalg.unpack` consists of the following high-level steps:
  1. _Read_ from the source tensor.
  2. Transpose the value read in step (1).
  3. _Write_ the value from step (2) into the destination tensor.

Currently, when vectorizing with user-provided vector sizes, only the
sizes for the _write_ operation (step 3) are required. Sizes for the
_read_ operation (step 1) are inferred from static shapes and inner tile
sizes.

This logic breaks when the input shapes or tile sizes are dynamic
(indeed, `vectorizeUnPackOpPrecondition` rejects such cases ATM and the
vectorization fails). This patch addresses the issue by requiring
explicit vector sizes for both the read and write sides, enabling
scalable vectorization in such cases.

Example:

```mlir
func.func @unpack(%in: tensor<1x1x8x?xf32>, %out: tensor<8x?xf32>) -> tensor<8x?xf32> {
  %vs = vector.vscale
  %c8 = arith.constant 8 : index
  %tile_size = arith.muli %vs, %c8 : index

  %unpack = linalg.unpack  %in
    inner_dims_pos = [0, 1]
    inner_tiles = [8, %tile_size]
    into %out : tensor<1x1x8x?xf32> -> tensor<8x?xf32>
  return %unpack : tensor<8x?xf32>
}

module attributes {transform.with_named_sequence} {
  transform.named_sequence @__transform_main(%arg0: !transform.any_op {transform.readonly}) {
    %0 = transform.structured.match ops{["linalg.unpack"]} in %arg0 : (!transform.any_op) -> !transform.any_op
    transform.structured.vectorize %0 vector_sizes [1, 1, 8, [8],  8, [8]] : !transform.any_op
    //                                              \         /    \    /
    //                                              read-sizes   write-sizes
    transform.yield
  }
}
```

Finally, this patch also extends `createReadOrMaskedRead` and
`createWriteOrMaskedWrite` to take scalable flags.

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications