[all-commits] [llvm/llvm-project] 03c2f5: [mlir][linalg][conv] Flatten the channel dimension...
Andrzej Warzyński via All-commits
all-commits at lists.llvm.org
Wed Dec 6 13:35:16 PST 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 03c2f5d8bbcf31239a631d9343ac7f4b6b3094c1
https://github.com/llvm/llvm-project/commit/03c2f5d8bbcf31239a631d9343ac7f4b6b3094c1
Author: Andrzej Warzyński <andrzej.warzynski at arm.com>
Date: 2023-12-06 (Wed, 06 Dec 2023)
Changed paths:
M mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td
M mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h
M mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
M mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
A mlir/test/Dialect/Linalg/vectorize-convolution-flatten.mlir
Log Message:
-----------
[mlir][linalg][conv] Flatten the channel dimension when vectorizing (#71918)
The current vectorization of 1D depthwise convolutions in Linalg is
_sub-optimal_ for tensors with a small number of channels, e.g.:
```mlir
linalg.depthwise_conv_1d_nwc_wc
{dilations = dense<1> : vector<1xi64>,
strides = dense<1> : vector<1xi64>}
ins(%input, %filter : tensor<1x8x3xi8>, tensor<1x3xi8>)
outs(%output : tensor<1x8x3xi8>) -> tensor<1x8x3xi8>
```
That's because ultimately (i.e. at the LLVM level), vectorization happens
along the trailing dimension (i.e. the channel dimension). In this case it
leads to vectors with only 3 elements (or worse, e.g. a single element if
there's only 1 channel). For comparison, a 128-bit wide vector register can
hold 16 x i8.
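To illustrate, with the default lowering the multiply-accumulate ends up
operating on vectors whose innermost dimension has just 3 elements. Roughly
(a hand-written sketch rather than the verbatim vectorizer output; names like
`%v_input` are hypothetical stand-ins for the vectors read from the tensors):
```mlir
// Sketch only - the real IR also contains vector.transfer_read/write ops.
%b_filter = vector.broadcast %v_filter : vector<3xi8> to vector<1x8x3xi8>
%mul = arith.muli %v_input, %b_filter : vector<1x8x3xi8>
%res = arith.addi %mul, %v_output : vector<1x8x3xi8>
```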
Instead, this patch adds an option to flatten/collapse the channel
dimension into the width dimension of the input/filter/output using
the `vector.shape_cast` operation:
```mlir
%sc_input = vector.shape_cast %input : vector<1x8x3xi8> to vector<1x24xi8>
%sc_output = vector.shape_cast %output : vector<1x8x3xi8> to vector<1x24xi8>
%b_filter = vector.broadcast %filter : vector<3xi8> to vector<1x8x3xi8>
%sc_filter = vector.shape_cast %b_filter : vector<1x8x3xi8> to vector<1x24xi8>
```
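The multiply-accumulate then operates on the flattened (1x24) vectors and the
result is cast back to the original shape, roughly (a sketch under the
assumptions above, not the verbatim generated IR):
```mlir
%mul = arith.muli %sc_input, %sc_filter : vector<1x24xi8>
%acc = arith.addi %mul, %sc_output : vector<1x24xi8>
%res = vector.shape_cast %acc : vector<1x24xi8> to vector<1x8x3xi8>
```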
This new vectorization mode is implemented in `depthwiseConv` by
inserting `vector.shape_cast` ops before and after
`depthwiseConv1dSliceAsMulAcc` is invoked. It can be selected via, e.g., a
Transform dialect attribute:
```mlir
transform.structured.vectorize_children_and_apply_patterns %conv {flatten_1d_depthwise_conv}
```
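For reference, a complete transform script exercising the new attribute might
look like the following (a hypothetical sketch; the in-tree test
`vectorize-convolution-flatten.mlir` follows a similar pattern but its exact
sequence may differ):
```mlir
transform.sequence failures(propagate) {
^bb0(%arg0: !transform.any_op):
  // Match the function containing the depthwise convolution ...
  %func = transform.structured.match ops(["func.func"]) in %arg0
    : (!transform.any_op) -> !transform.any_op
  // ... and vectorize its children with channel-dim flattening enabled.
  %vectorized = transform.structured.vectorize_children_and_apply_patterns %func
    {flatten_1d_depthwise_conv} : (!transform.any_op) -> !transform.any_op
}
```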
A forthcoming patch will implement a strategy to automatically switch
between the two implementations, depending on the shape of the input
tensors.
Co-authored-by: Bradley Smith <bradley.smith at arm.com>